Academic Intelligence · Curated Daily

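Explore the frontiers of global academic research

AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate literature analysis briefs across interdisciplinary fields.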

01.
medRxiv (Medicine) 2026-06-17

Long-term mortality and cause-specific death after non-cardiac chest pain: a multicentre cohort study of 160,245 patients in China

Abstract Background Non-cardiac chest pain (NCCP) is commonly regarded as a low-risk condition. However, long-term mortality, cause-specific death, and high-risk subgroup characteristics remain poorly defined. Methods In this multicentre registry-linked cohort study, we linked the Chest Pain Center Registry from 101 hospitals in Hunan, China, with the Mortality and Cause of Death Registry. Adults diagnosed with NCCP from Jan 1, 2017, to Dec 31, 2021, were included. We assessed 3-year all-cause, cardiovascular, and non-cardiovascular mortality using Cox, restricted cubic spline, and Fine-Gray models. Findings Among 160,245 patients, 4674 deaths occurred within 3 years (2.9%). Mortality increased sharply after 60.5 years. Age ≥ 60.5 years (adjusted hazard ratio [aHR] 7.49 [95% CI 6.89-8.14]), rural residence (time-varying aHR 1.46 [1.35-1.57] in year 1 and 1.66 [1.46-1.89] in years 1-3), and male sex (aHR 1.47 [1.38-1.57]) independently predicted death. Three-year mortality ranged from 0.3% in younger urban women to 8.4% in older rural men. Cardiovascular diseases accounted for 56.4% of deaths among older patients, whereas other non-cardiovascular causes (22.8%) and malignancy (20.8%) were the largest categories among younger decedents. Interpretation NCCP is not uniformly benign. Age, rural residence, and sex identify patients who could benefit from risk-stratified follow-up, with cardiovascular prevention prioritised for older rural men and broader non-cardiovascular assessment considered for younger patients.

02.
arXiv (CS.LG) 2026-06-12

Disparate Impact in Synthetic Data Generation

arXiv:2606.13105v1 Announce Type: new Abstract: We revisit the fairness notion of disparate impact for synthetic data generation (SDG), that assesses whether the utility of generated records is the same across sensitive groups. Our approach departs from existing work on fair SDG, that address the problem of correcting for undue biases in the observed distribution, hence redefining SDG as learning a distribution that is not that of the real data. By contrast, non-disparate impact is notably achieved when the synthetic and real distributions are the same. We expose reasons why SDG may fail to reach that solution and discuss why approximation and estimation errors occur and can be disparate across groups. We notably look into the expressive power of SDG methods relative to distribution complexity, sampling errors due to group proportions, and estimation errors induced by differential privacy mechanisms. We illustrate cases of disparate impact on both artificial and real-world data, focusing on SDG methods that rely on probabilistic graphical models. We also introduce a strategy of learning group-wise SDG models and illustrate how it can improve both the overall utility and its parity in many settings.

03.
medRxiv (Medicine) 2026-06-16

Reliability and construct validity of the Technology Device Interference Scale in a sample of children and parents

There is increasing interest in parent-child technoference: the interference with personal interactions caused by technology devices. This study examined the reliability and construct validity of the Technology Device Interference Scale (TDIS) to measure technoference in a sample of Canadian parents and children. Parents (n=883) and children (n=376) were recruited from clinical and community settings and completed the TDIS for their own and family member technoference over three timepoints (T1=2023, T2=2024, T3=2025). TDIS internal consistency, test-retest reliability, and construct validity were assessed using Cronbach's alpha, intraclass correlation coefficient, and confirmatory factor analysis, respectively. The TDIS showed good internal consistency and adequate to good construct validity when used by children to report on their own technoference (all α>.70; CFI>.95, TLI>.95, RMSEA.70; CFI>.95, TLI>.90, RMSEA≤.11). The TDIS had low to acceptable internal consistency and poor model fit for parent report of their own technoference (α range: .63 - .66; CFI

04.
arXiv (CS.CV) 2026-06-18

SP-TransientBench: A Real-Captured Single Photon Perception Benchmark

Single-photon LiDAR (SPL) based on single-photon avalanche diode (SPAD) sensing enables time-resolved photon measurements with extreme sensitivity, offering unique potential for active 3D perception in photon-starved scenarios. However, real-world single photon perception remains fundamentally challenging due to unique measurement noise and complex multi-return transient phenomena, which jointly complicate geometric reconstruction and semantic scene understanding. Despite growing interest in SPAD-based sensing, existing studies are largely limited to simulated data or small-scale controlled captures. As a result, systematic evaluation of real-world single photon perception across depth estimation, multi-view reconstruction, and 3D semantic understanding remains underexplored. To bridge this gap, we introduce SP-TransientBench (STB), a real-captured multi-task benchmark for single photon perception. SP-TransientBench comprises 10 diverse scenes and 10,297 views captured using a solid-state single-photon LiDAR at $256\times192$ resolution. Each view provides full time-of-flight histograms with multi-return behavior, standardized metadata, and calibrated camera poses for multi-view evaluation. We further provide 13-class 3D semantic annotations for selected scenes. By providing dedicated data splits and evaluation protocols for each task, STB enables consistent and reproducible benchmarking of real-world single photon perception across multiple 3D vision problems. The dataset and code will be released upon acceptance.

05.
arXiv (quant-ph) 2026-06-11

Non-Hermitian Delocalization Realizes Random Dirac Criticality in One Dimension

arXiv:2606.12089v1 Announce Type: cross Abstract: Non-Hermitian systems can evade Anderson localization and exhibit delocalized states even in one dimension. Here, we show that such non-Hermitian delocalized states under periodic boundary conditions (PBC) are intrinsically critical, realizing the universality class of one-dimensional random Dirac fermions. By linking spectral winding to topological Anderson transitions via Hermitization, we demonstrate that the delocalized PBC states exhibit a Dirac-type criticality with universal algebraic correlations. In contrast to Hermitian systems, where this criticality occurs only at fine-tuned transition points, it emerges generically in non-Hermitian systems as a consequence of spectral topology. These results identify a universal mechanism by which non-Hermiticity promotes criticality, providing a unified description of non-Hermitian delocalization in one dimension.

06.
arXiv (CS.LG) 2026-06-15

Curvature-Guided Geometric Representation for Protein-Ligand Binding Affinity Prediction

arXiv:2606.14159v1 Announce Type: new Abstract: Protein-ligand binding affinity (PLA) prediction is critical in drug discovery. Despite the notable advancements in machine learning-based approaches, existing methods struggle to jointly characterize local geometric organization and globally coordinated cross-molecular interactions, limiting their ability to model complex binding mechanisms. Here, we propose RicciBind, a geometric representation framework that integrates curvature-guided hierarchical structure learning with optimal transport (OT)-based cross-domain alignment to model molecular interactions. Specifically, RicciBind leverages Ricci curvature to capture local interaction tightness within molecular structures, enhancing structural awareness and organizing atomic interactions into curvature-aware hierarchical representations. An OT-based cluster matching mechanism then aligns protein and ligand clusters across heterogeneous domains under geometric constraints, enabling globally consistent correspondences and revealing higher-order interaction patterns beyond local neighborhoods. By coupling curvature-guided structure encoding with OT-driven cross-domain alignment, RicciBind effectively models complex interaction semantics and substantially improves both the accuracy and interpretability of binding affinity prediction. Extensive experiments demonstrate that RicciBind achieved superior predictive performance and generalization across PLA benchmarks and virtual screening tasks. Ablation studies further confirmed the essential role of Ricci curvature in enhancing molecular interaction representations.

07.
arXiv (CS.AI) 2026-06-18

SAERec: Constructing Fine-grained Interpretable Intents Priors via Sparse Autoencoders for Recommendation

arXiv:2606.18897v1 Announce Type: cross Abstract: Intent-based recommender systems have gained significant attention for improving accuracy and interpretability by modeling the underlying motivations behind user behaviors. Most existing models derive intents directly from user sequences via clustering or prototype learning. However, they are sensitive to sequence quality, require presetting the number of intents, and lack explicit semantic grounding. These issues lead to an incomplete and coarse intent set and limit the effectiveness of recommendation. In this paper, we propose the Sparse Autoencoder for intent-based recommendation (SAERec), a novel recommender that automatically constructs a fine-grained and interpretable intent space from a textual corpus to guide recommendation. Rather than treating texts as side signals, SAERec leverages them as high information density evidence for intent construction. Specifically, we first extract a comprehensive set of fine-grained interpretable intents from the latent space of large language models (LLMs) by using a sparse autoencoder (SAE) to disentangle and interpret text embeddings, which isolates intent-related semantics from textual noise. Then, for each user, we retrieve relevant intents from this set as priors to guide recommendation. It contains personal intents matching a user's current interests and public intents capturing general item patterns shared across users (e.g., quality, price). Finally, to integrate retrieved intents into sequence modeling, we propose a multi-branch attention mechanism that captures temporal dependencies and injects both personal and public intent signals, followed by an adaptive fusion layer to construct the final user representation for recommendation. Extensive experiments on public datasets demonstrate the superiority of SAERec, consistently outperforming state-of-the-art baselines while providing human-understandable explanations.

08.
arXiv (CS.AI) 2026-06-16

DeepRoot: A KG-Coordinated Multi-Agent System for Therapeutic Reasoning over Historical Medical Texts

arXiv:2606.15931v1 Announce Type: cross Abstract: Historical medical archives and traditional medicines hold immense potential for drug discovery and remain a primary source for current drug development. However, pre-ontological prose and idiosyncratic taxonomies prevent the standardization and medical modernization of the data for use in current biomedical pipelines. Furthermore, no existing LLM agent system, whether tool-calling, retrieval-augmented, or agentic deep-research, can convert such text into verifiable drug-discovery leads at scale. We close this gap with DeepRoot, a multi-agent LLM system that jointly builds and utilizes a verified knowledge graph, showing that grounding and reasoning – often conflated – are separable axes the system can compose for therapeutic reasoning. Applied to the Shen Nong Ben Cao Jing, DeepRoot recovers $10$ of $21$ held-out compound-disease treatment pairs at R@$20$ ($47.6\%$ vs $4.8\%$ for a raw corpus LLM and $\sim\!2.4\%$ random) and dominates an LLM-as-judge audit for reasoning quality over baseline LLMs and LLMs with direct tool-call access to the same APIs DeepRoot itself queries. Tool-using LLMs hallucinate evidence on $87\%$ of claims, versus 7-10% for DeepRoot. Graph-only inference hallucinates $0\%$ but ranks lowest on reasoning coherence; DeepRoot KG+LLM is the only condition to win on both axes, pointing toward a route for systematic mining and repurposing of historical medical knowledge.
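
The recall figures quoted above can be sanity-checked with a short sketch of recall at k. This follows the usual textbook definition (the fraction of held-out pairs whose target appears in the top k ranked candidates). The abstract does not spell out DeepRoot's exact evaluation protocol, so the function name `recall_at_k`, the dictionary layout, and the toy herb and disease names below are all hypothetical.

```python
from typing import Dict, List, Tuple


def recall_at_k(
    rankings: Dict[str, List[str]],
    held_out: List[Tuple[str, str]],
    k: int,
) -> float:
    """Fraction of held-out (compound, disease) pairs recovered in the top k.

    rankings maps each compound to its ranked list of candidate diseases
    (best first). This data layout is an assumption made for illustration.
    """
    hits = sum(
        1
        for compound, disease in held_out
        if disease in rankings.get(compound, [])[:k]
    )
    return hits / len(held_out)


# Toy data, invented purely for illustration.
rankings = {
    "herb_a": ["fever", "cough"],
    "herb_b": ["pain"],
}
held_out = [("herb_a", "cough"), ("herb_b", "fever")]

print(recall_at_k(rankings, held_out, k=2))  # one of two pairs recovered

# Arithmetic behind the percentages reported in the abstract.
print(round(100 * 10 / 21, 1))  # 10 of 21 pairs
print(round(100 * 1 / 21, 1))   # 1 of 21 pairs
```

With 21 held-out pairs, 10 hits gives 47.6% and a single hit gives 4.8%, which is consistent with the numbers quoted for DeepRoot and for the raw-corpus LLM.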

09.
arXiv (CS.CL) 2026-06-16

Formalize Once, Edit the Rest: Efficient Lean-Based Answer Selection for Math Reasoning

With large language models (LLMs) increasingly applied to mathematical reasoning, formal proof assistants such as Lean can be leveraged to verify reasoning outputs with machine-checkable rigor, enabling use cases such as answer selection in test-time scaling with K sampled candidate answers. However, employing Lean requires that LLM outputs, originally in natural language, first be formalized. Existing Lean-based answer-selection work uses an autoformalization model to generate a formal statement in Lean for each candidate answer independently, incurring a significant computational cost. We propose BASE, a base-and-edit pipeline that formalizes a single base candidate per problem and derives the remaining K-1 statements by editing the answer expression in place. To facilitate this, we train a rewriter model LEANSCRIBE to localize the answer in the base formalization and generate a reusable edit function for the other K-1 candidates. BASE simultaneously improves selection accuracy and reduces formalization cost - a Pareto improvement that holds on all 12 (dataset, solver) configurations across four benchmarks and three solvers, cutting autoformalizer calls by about 5x at K=8, with the reduction expected to become larger as K grows. Code is available at https://github.com/ucr-rai/base-and-edit.

10.
PLOS Medicine 2026-05-22

Differences in tuberculosis prevalence by sex in low- and middle-income countries over 1993–2025: A systematic review and meta-analysis

by Nicole A. Swartwood, Nanki Singh, Seyed Alireza Mortazavi, Melike Hazal Can, Hening Cui, Do Kyung Ryuk, Peter MacPherson, Katherine C. Horton, Nicolas A. Menzies Background Global and national initiatives to combat tuberculosis (TB) have expanded over recent years. Despite this, the TB burden remains high in some population groups, with men recognized as having elevated TB risks. Summary measures of sex differences in TB prevalence were last estimated in 2016. Since then, many additional prevalence surveys have been conducted, including in the highest TB burden countries. We conducted a systematic review of sex-stratified TB prevalence survey data published over 1993–2025, to provide updated estimates of male-to-female (M:F) TB prevalence ratios and determine whether sex-related disparities in TB burden have closed over time. Methods and findings We identified surveys reporting community-representative, sex-stratified estimates of pulmonary TB prevalence in low- and middle-income countries (LMICs), including surveys from an earlier review (covering January 1993–March 2016) and a new systematic review (covering 1st December 2015–13th October 2025). This review was prospectively registered with PROSPERO (CRD42024503853) and included searches of PubMed, Embase, Global Health, the Cochrane Library, Africa Index Medicus, LILACS, and SciELO. We extracted data on bacteriologically confirmed and smear-positive TB prevalence among adults (aged ≥ 15 years), stratified by sex. Risk of bias was evaluated using eight criteria specific to prevalence surveys. We fit multi-level Bayesian regression models with study- and country-level random effects to estimate the M:F ratio of TB prevalence (male prevalence divided by female prevalence), overall and for key subgroups. 
In meta-regression analyses, we estimated how prevalence ratios varied over time and according to known TB risk factors and TB case definitions. We identified 10,124 publications and extracted data from 100 eligible studies representing 102 unique prevalence surveys and 4,658,310 participants (45.6% male) in 33 LMICs. TB prevalence was higher in men than women in 90/102 of the included surveys, with a pooled M:F prevalence ratio of 2.02 (95% credible interval (CrI): 1.71, 2.34) for bacteriologically confirmed TB and 2.38 (95% CrI: 1.91, 2.90) for smear-positive TB. Time trend analyses showed a 2.0% (95% CrI: −0.2, 4.5%) average annual change in the M:F ratio of bacteriologically confirmed TB over the study period. The M:F prevalence ratio was estimated to be higher for countries with greater excess HIV prevalence among men, and countries with greater gender equity (as measured by the United Nations’ Gender Development Index). The estimated M:F prevalence ratio was also higher for surveys that did not restrict testing to individuals reporting TB symptoms. Study limitations include heterogeneity in survey methods and definitions, as well as limited data from the Americas, Eastern Mediterranean, and Europe WHO world regions and post-COVID-19 period. Conclusions Men in LMICs consistently experience TB at a higher prevalence than women. Time trend estimates are uncertain, but consistent with widening sex differences in TB prevalence over the last three decades, despite efforts to address the risk factors underlying this excess TB burden.

11.
arXiv (CS.LG) 2026-06-17

Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned

arXiv:2603.25937v2 Announce Type: replace-cross Abstract: Visual Navigation Models (VNMs) promise generalizable robot navigation by learning from large-scale visual demonstrations. Despite growing real-world deployment, existing evaluations rely almost exclusively on success rate (whether the robot reaches its goal), which conceals trajectory quality, collision behavior, and robustness to environmental change. We present a real-world evaluation of five state-of-the-art VNMs (GNM, ViNT, NoMaD, NaviBridger, and CrossFormer) across two robot platforms and five environments spanning indoor and outdoor settings. Beyond success rate, we combine path-based metrics with vision-based goal-recognition scores and assess robustness through controlled image perturbations (motion blur, sun flare). Our analysis uncovers three systematic limitations: (a) even architecturally sophisticated diffusion and transformer-based models exhibit frequent collisions, indicating limited geometric understanding; (b) models fail to discriminate between different locations that are perceptually similar, even though some semantic differences are present, causing goal prediction errors in repetitive environments; and (c) performance degrades under distribution shift. We will publicly release our evaluation codebase and dataset to facilitate reproducible benchmarking of VNMs.

12.
arXiv (CS.LG) 2026-06-11

Fixed-Parameter Tractability of Private Synthetic Data Generation

arXiv:2606.11283v1 Announce Type: cross Abstract: We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.

13.
arXiv (CS.AI) 2026-06-11

SVoT: State-aware Visualization-of-Thought for Spatial Reasoning via Reinforcement Learning

arXiv:2606.11770v1 Announce Type: new Abstract: Spatial reasoning remains a challenge for Multimodal Large Language Models (MLLMs), as it requires reliable multi-hop inference over both intermediate states and state transitions. Current studies often leave intermediate states unverified and treat state transitions as implicit processes, which limits reliability in multi-hop spatial reasoning. To address this, we propose State-aware Visualization-of-Thought (SVoT), a reinforcement learning framework that generates interleaved, verifiable intermediate states and visualizations. SVoT integrates transition reasoning chains into the generation processes, enabling the model to verify action preconditions and effects through interleaved textual and visual reasoning. We train SVoT via Group Relative Policy Optimization (GRPO), instantiating verification through reward design and evaluating the efficacy of different fine-grained rewards. As existing benchmarks reduce state transitions to single-variable updates, substantially simplifying the problems, we establish five domains by extending classical environments and introducing two novel domains, Pacman and Gather, that require multi-object interactions and numerical reasoning. These domains support systematic evaluation of multi-hop spatial reasoning with quantitative verification of generated intermediate states and transition reasoning. SVoT with transition-aware supervision achieves state-of-the-art performance across the introduced domains, yielding up to a 65% absolute accuracy gain on out-of-distribution test sets.

14.
arXiv (CS.AI) 2026-06-11

Towards a Bridge Layer Between Bibliographic and Formalized Mathematical Knowledge

arXiv:2606.11430v1 Announce Type: cross Abstract: Mathematical knowledge is split between bibliographic databases (e.g., MathSciNet, zbMATH Open) and formal proof libraries (e.g., Lean mathlib), preventing unified access between published results and their formalizations. We propose a relational bridge-database that aligns publication metadata with formal artifacts, providing an interoperability layer between mathematical literature and machine-verifiable proofs. We introduce a paper-level formalization score that measures how much of a publication is covered in formal systems. As a feasibility study, we show how such scores can be estimated via cross-document alignment between informal texts and Lean formalizations, enabling large-scale analysis of formalization coverage. This framework is a first step toward integrating bibliographic and formal mathematical ecosystems into scalable, machine-actionable knowledge graphs linking publications to formal proof objects.

15.
arXiv (CS.CL) 2026-06-19

MENTOR: Reinforcement Learning via Flexible Teacher-Optimized Rewards for Tool-Use Distillation

Distilling the tool-use capabilities of large language models (LLMs) into small language models (SLMs) is essential for their practical application. The predominant approach, supervised fine-tuning (SFT), suffers from poor out-of-domain (OOD) generalization due to its rigid alignment with static teacher trajectories. While reinforcement learning (RL) offers an alternative, the capacity limitations of SLMs pose a severe dilemma: sparse outcome rewards provide insufficient guidance, whereas strict trajectory matching imposes overly restrictive constraints. To bridge this capacity-driven gap, we propose MENTOR, which introduces a flexible yet process-aware reward structure. Instead of enforcing rigid replication, MENTOR uses the teacher's reference to guide tool-use behavior, balancing behavioral alignment with downstream performance. Extensive experiments on controlled executable-tool benchmarks demonstrate that MENTOR improves OOD tool-use performance compared to SFT and strict RL baselines. Our findings suggest that within verifiable tool-use environments, flexible tool-use alignment offers a more effective approach than strict trajectory replication for developing adaptable small models.

16.
arXiv (quant-ph) 2026-06-16

Non-Markovianity-based ultrasensitive parameter estimation

arXiv:2211.05142v2 Announce Type: replace Abstract: Accurate parameter estimation is a central task in quantum metrology and sensing, where quantum resources can provide precision beyond classical limits. In realistic settings, however, system-environment interactions lead to decoherence, reducing these strategies to their classical counterparts. Noise is typically classified as Markovian or non-Markovian, with the latter often preserving quantum coherence longer and thus supporting better metrological performance. Still, the absence of noise is generally considered ideal. In this work, we uncover a striking reversal: certain non-Markovian environments not only outperform Markovian ones - including their quantum Cramér-Rao bounds - but can also surpass the entirely noiseless case. We demonstrate these findings numerically for an all-optical setup, which is experimentally feasible and can be extended to other physical platforms. In general, our results open new avenues for noise-assisted quantum metrology beyond conventional limits.

17.
arXiv (quant-ph) 2026-06-15

Efimov Effect in Ultracold Microwave-Shielded Polar Molecules

arXiv:2602.21433v2 Announce Type: replace-cross Abstract: A quantum-mechanical description is presented for the three-body physics of shielded dipolar molecules, including a prediction of observable Efimov physics. Despite the anisotropic and long-range nature of the interaction, shielding enables a regime in which universality emerges already at the two-body level and extends to the three-body sector, where Efimov physics emerges. On the negative side of the scattering-length resonance, computed trimer binding energies display the characteristic scaling expected for Efimov resonances. Finally, the sudden approximation can be used to create trimer bound states, starting from positive energy trap states as a way to create or detect these molecular trimers. Moreover, the three-body parameter expressed in dipolar units is found to be universal.

18.
arXiv (quant-ph) 2026-06-19

Discrimination of genuinely nonlocal sets without entanglement in multipartite systems

arXiv:2606.20380v1 Announce Type: new Abstract: Genuine nonlocality arises when a set of multipartite orthogonal states is locally indistinguishable under any bipartition of the subsystems. The entanglement-assisted discrimination of such genuinely nonlocal orthogonal product sets has attracted significant attention in quantum information. Based on the criterion of local irreducibility, genuine nonlocality is classified into Type I (reducible) and Type II (irreducible). We present entanglement-assisted discrimination schemes for both types of genuinely nonlocal sets that use minimal resources. For low-dimensional cases, Type I sets require only a single EPR pair, whereas Type II sets necessitate only one GHZ state. We extend these protocols to higher-dimensional systems: the discrimination of Type I sets requires only one maximally entangled state in a two-qutrit system, while that of Type II sets similarly demands a single maximally entangled state in a three-qutrit system. For $n$-partite ($n > 3$) systems, Type I sets continue to require only one maximally entangled state, whereas Type II sets necessitate just one additional EPR pair compared to their Type I counterparts. These results provide a robust framework for the efficient discrimination of genuinely nonlocal sets using minimal quantum resources.

19.
arXiv (quant-ph) 2026-06-16

Counterdiabatic Raman Atom Optics for Compact High-Sensitivity Gravimetry

arXiv:2606.16945v1 Announce Type: new Abstract: Large-momentum-transfer (LMT) atom interferometry provides a route toward enhanced inertial sensitivity in compact quantum sensors, but its scalability is limited by the accumulation of pulse-transfer errors across long Raman pulse sequences. We investigate theoretically the use of stimulated Raman shortcut-to-adiabatic passage (STIRSAP) for high-fidelity LMT atom optics in a Mach–Zehnder interferometer geometry. The counterdiabatic correction is encoded directly into the Raman pulse envelopes, eliminating the need for auxiliary microwave or radio-frequency control fields. Numerical simulations based on an effective Raman model show that $1~\mu\mathrm{s}$ STIRSAP pulses achieve single-pulse transfer fidelities of $F_\pi = 0.99902$ while maintaining negligible pulse-time overhead even at high momentum order. We analyze the resulting tradeoff between interferometric phase enhancement and compound contrast decay and identify an unconstrained shot-noise optimum near $n\approx270$. The analysis further shows that practical operation at extreme LMT order is constrained by wave-packet separation, vibration noise, Doppler detuning, and accumulated systematic effects rather than by pulse duration itself. These results establish superadiabatic Raman control as a promising approach for scalable high-fidelity atom optics and clarify the physical limitations governing compact high-order atom interferometers.

20.
arXiv (CS.LG) 2026-06-11

Characterizing the Impact of NVFP4 Quantization for Low-Power Edge AI Deployment

arXiv:2606.06527v3 Announce Type: replace-cross Abstract: Energy-efficient neural-network inference at the edge requires reducing arithmetic cost, memory traffic, computation energy, and storage overhead while maintaining acceptable accuracy. This paper presents an ablation-focused study of NVFP4 quantization for edge-efficient neural networks, with emphasis on the relationship between activation precision, weight precision, block-size scaling, retraining, and model accuracy. NVFP4 activations are represented using 4-bit FP4 data, an FP8 block scale, and an FP32 tensor scale, enabling ultra-low precision inference while preserving activation dynamic range. A block-size ablation over six edge-efficient models shows that block size B = 16 provides a practical accuracy/storage trade-off, requiring only 4.5078 bits per input for N = 4096. A weight precision ablation further shows that FP8 and FP16 weights provide only modest gains over FP4 weights under the same NVFP4 activation path, suggesting that activation quantization and scaling dominate much of the accuracy behavior. To isolate the benefit of the NVFP4 data type, this work compares conventional unscaled FP4 activation inference and NVFP4 activation inference with and without retraining. The results show that conventional FP4 inference collapses accuracy for most compact models, while NVFP4 without retraining already recovers substantial accuracy by restoring activation dynamic range through FP8 block scaling and FP32 tensor scaling. When combined with retraining, NVFP4 achieves the best accuracy across the evaluated models, demonstrating the effectiveness of scaling-aware FP4 (NVFP4) inference. These findings provide general design guidance for hardware-software co-design of low power edge inference across a broad range of accelerator platforms, including GPUs, Tensor Cores, FPGAs, domain-specific AI accelerators, near-memory computing systems, and emerging edge-computing architectures.
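
The storage figure of 4.5078 bits per input can be reproduced with a small calculation. The sketch below assumes one FP8 scale is stored per block of B elements and one FP32 scale per tensor of N elements. That accounting is inferred from the abstract's description of the format rather than stated in it, and the function name `nvfp4_bits_per_element` is hypothetical.

```python
def nvfp4_bits_per_element(block_size: int, tensor_size: int) -> float:
    """Average storage cost per element under the assumed NVFP4 layout.

    Assumption: 4 bits of FP4 data per element, one 8-bit FP8 scale
    shared by each block, and one 32-bit FP32 scale shared by the tensor.
    """
    fp4_bits = 4
    block_scale_bits = 8
    tensor_scale_bits = 32
    return (
        fp4_bits
        + block_scale_bits / block_size
        + tensor_scale_bits / tensor_size
    )


# Sweep a few block sizes for the N = 4096 case mentioned in the abstract.
for block_size in (8, 16, 32, 64):
    bits = nvfp4_bits_per_element(block_size, tensor_size=4096)
    print(f"B = {block_size:2d}: {bits:.4f} bits per element")
```

For B = 16 and N = 4096 this gives 4 + 8/16 + 32/4096 = 4.5078125, which matches the reported value to four decimal places. Under the same assumption, smaller blocks spend more bits on scales and larger blocks spend fewer, which is the storage side of the accuracy/storage trade-off the abstract describes.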

21.
arXiv (CS.LG) 2026-06-12

Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design

arXiv:2601.09693v3 Announce Type: replace Abstract: Structure-based and ligand-based computational drug design have traditionally relied on disjoint data sources and modeling assumptions, limiting their joint use at scale. In this work, we introduce Contrastive Geometric Learning for Unified Computational Drug Design (ConGLUDe), a single contrastive geometric model that unifies structure- and ligand-based training. ConGLUDe couples a geometric protein encoder that produces whole-protein representations and implicit embeddings of predicted binding sites with a fast ligand encoder, removing the need for predefined pockets. By aligning ligands with both global protein representations and multiple candidate binding sites through contrastive learning, ConGLUDe supports ligand-conditioned pocket prediction in addition to virtual screening and target fishing, while being trained jointly on protein-ligand complexes and large-scale bioactivity data. Across diverse benchmarks, ConGLUDe achieves competitive zero-shot virtual screening performance, substantially outperforms existing methods on a challenging target fishing task, and demonstrates state-of-the-art ligand-conditioned pocket selection. These results highlight the advantages of unified structure-ligand training and position ConGLUDe as a step toward general-purpose foundation models for drug discovery.

22.
arXiv (CS.AI) 2026-06-18

RankGraph-2: Lifecycle Co-Design for Billion-Node Graph Learning in Recommendation

arXiv:2606.18379v1 Announce Type: cross Abstract: Graph-based retrieval at billion-node scale requires jointly solving three tightly coupled problems – graph construction, representation learning, and real-time serving – yet existing work addresses each in isolation. We present RankGraph-2, a framework deployed at Meta that co-designs all three lifecycle stages for similarity-based retrieval (U2U2I and U2I2I), where each stage's requirements shape the others. Serving requires a co-learned cluster index to avoid expensive online KNN – this pushes index co-training into the training objective. Training benefits from the observation that similarity-based retrieval tolerates pre-computed neighborhoods, eliminating online graph infrastructure – this requires construction to produce self-contained data. Construction must also support hour-level refresh for item coverage. Acting on these cascading requirements, RankGraph-2 reduces hundreds of trillions of edges to hundreds of billions via subsampling with popularity bias correction, pre-computes multi-hop neighborhoods via personalized PageRank, and co-learns a residual-quantization cluster index that reduces serving computational cost by 83%. This lifecycle co-design enables a simple architecture to achieve 3.8× higher recall than a GAT + Deep Graph Infomax model on a bipartite graph and 2.1× higher than PyTorch-BigGraph on item retrieval. RankGraph-2 delivers up to +0.96% CTR and +2.75% CVR, and has powered 20+ retrieval launches across major surfaces.
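
To illustrate what pre-computing a multi-hop neighborhood with personalized PageRank means, here is a minimal textbook power-iteration sketch on a toy bipartite user-item graph. It is not RankGraph-2's implementation: the graph, the restart probability `alpha`, and the function names are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank.

    adj:   dict mapping each node to a list of its neighbors
           (every neighbor must also be a key of adj).
    seed:  node that the random walk restarts from.
    alpha: restart probability (an illustrative choice).
    """
    rank = {v: 0.0 for v in adj}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += alpha  # restart mass (total rank mass stays 1)
        for u, neighbors in adj.items():
            walk_mass = (1.0 - alpha) * rank[u]
            if not neighbors:
                nxt[seed] += walk_mass  # dangling node: return to seed
                continue
            share = walk_mass / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        rank = nxt
    return rank


def top_k_neighborhood(adj, seed, k):
    """Return the k highest-scoring nodes other than the seed."""
    scores = personalized_pagerank(adj, seed)
    ranked = sorted((v for v in scores if v != seed),
                    key=scores.get, reverse=True)
    return ranked[:k]


# Toy bipartite graph: users u1, u2 and items i1, i2, i3 (hypothetical data).
toy_graph = {
    "u1": ["i1", "i2"],
    "u2": ["i2", "i3"],
    "i1": ["u1"],
    "i2": ["u1", "u2"],
    "i3": ["u2"],
}
print(top_k_neighborhood(toy_graph, "u1", k=2))
```

Because the scores depend only on the graph and the seed, such neighborhoods can be computed offline and stored alongside the training data, which is the property the abstract relies on to avoid online graph infrastructure.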

23.
arXiv (quant-ph) 2026-06-15

Extensible Fluxonium Architecture Using Tunable Couplers with Low Shunt Capacitance

arXiv:2606.01647v2 Announce Type: replace Abstract: Fluxonium qubits have demonstrated high-fidelity operations and long coherence times in small-scale systems, highlighting their promise for quantum computing. However, large-scale integration into a high-performance two-dimensional (2D) qubit array remains the central challenge for practical applications. In this work, we introduce an extensible architecture for scaling up fluxonium qubits in 2D grids. To address the key challenges, namely achieving controllable strong interaction and high connectivity for qubits featuring small shunting capacitors (footprints), we propose using low-shunt-capacitance couplers to enable tunable interactions between fluxonium qubits. When embedded into 2D square lattices, large couplings can be achieved even with relatively small coupling capacitances, thus enabling multiple connections with sufficient capacitance budget. We further propose coupler realizations based on generalized flux qubit circuits, specifically the quarton and the fluxonium, and demonstrate that both enable fast, high-fidelity gates with low spectator errors, while supporting multiple connections on 2D grids.

24.
arXiv (quant-ph) 2026-06-19

Battery-Explicit Thermodynamic Witnesses of Bell Post-Quantumness

arXiv:2605.09149v3 Announce Type: replace Abstract: We introduce a battery-explicit thermodynamic witness of post-quantum Bell correlations. In each round, a single supplied excitation is routed into an explicit two-level battery if and only if a Bell-game condition is satisfied. The routing operation is implemented by an energy-preserving controlled SWAP, with all logical control registers taken to be degenerate. Thus the correlation resource does not create energy; it only determines the probability that the supplied excitation reaches the battery. The construction is first formulated for finite two-player XOR games. For any such game, the mean battery charge is exactly the game success probability multiplied by the battery gap. Optimizing over local, quantum, or nonsignalling behaviours therefore turns the corresponding game values into local, quantum, or nonsignalling thermodynamic ceilings. For the CHSH game, Tsirelson's bound becomes a strict quantum ceiling on the mean battery charge, while a PR-box behaviour reaches the single-excitation cap. The witness is trusted-module rather than device-independent: it assumes calibrated Hamiltonians, correct classical wiring, and a trusted energy-preserving battery module. We also discuss a reversible-controller implementation, finite-statistics certification from work data, robustness to imperfect battery readout, and cyclic bookkeeping showing that no positive net work is obtained once fuel restoration and memory erasure are included.
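
The stated relation, mean battery charge equals game success probability times battery gap, can be made concrete for the CHSH game. The sketch below assumes uniformly random inputs and a battery gap of 1 in arbitrary units; the function names are hypothetical and it models only the game values, not the controlled-SWAP routing itself.

```python
import itertools
import math


def chsh_local_value():
    """Best CHSH success probability over deterministic local strategies.

    Inputs x, y are uniform bits; the players win when a XOR b == x AND y.
    """
    best = 0.0
    for a0, a1, b0, b1 in itertools.product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y)
                   for x in (0, 1) for y in (0, 1))
        best = max(best, wins / 4)
    return best


def chsh_quantum_value():
    """Tsirelson's bound as a success probability: cos^2(pi/8)."""
    return math.cos(math.pi / 8) ** 2


def mean_battery_charge(success_probability, gap):
    """Mean charge = success probability x battery gap (from the abstract)."""
    return success_probability * gap


gap = 1.0  # assumed battery gap, arbitrary energy units
ceilings = {
    "local": mean_battery_charge(chsh_local_value(), gap),
    "quantum": mean_battery_charge(chsh_quantum_value(), gap),
    "nonsignalling": mean_battery_charge(1.0, gap),  # PR box wins always
}
print(ceilings)
```

With these assumptions the local ceiling is 0.75 of the gap, the quantum ceiling is (2 + sqrt(2))/4, roughly 0.854 of the gap, and a PR-box behaviour reaches the full gap, so a measured mean charge above the quantum ceiling would witness post-quantum correlations.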

25.
arXiv (CS.CL) 2026-06-19

The ACUTE Protocol: Operationalizing Language Model Activations for Better Calibration, Utility, and Trust

As language models improve and become increasingly deployed to solve a variety of tasks, trustworthiness becomes essential. Calibration is a good proxy for trust: well-calibrated confidence estimates help inform the risk versus reward tradeoff when trusting a specific model output. Unfortunately, even as models improve, they remain poorly calibrated, often biasing towards overconfidence. Additionally, calibration can be gamed: a policy that always predicts the base rate is perfectly calibrated, but completely uninformative. To resolve this, we develop a new metric, expected utility renormalized by the oracle (EURO), that balances calibration and informativeness. We also propose a general-purpose activation-based confidence, utility, and trust estimation protocol (ACUTE) to appropriately adjudicate uncertainty. The ACUTE protocol provides flexible, sample-efficient, and compute-efficient confidence estimators for 3 tasks (multiple-choice question answering, tool calling, and scientific document summarization) across 6 models from 4 model families. ACUTE outperforms strong baselines on EURO, while maintaining low calibration error. Taken together, our work shows that equipping LLMs with the ACUTE protocol can improve calibration, utility, and trustworthiness in numerous settings.
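
The claim that calibration can be gamed is easy to check numerically. The sketch below uses a standard binned expected calibration error and, as a stand-in for informativeness, the Brier score. EURO itself is not defined in the abstract, so it is not implemented here; the data and function names are illustrative assumptions.

```python
def expected_calibration_error(confidences, labels, n_bins=10):
    """Binned ECE: weighted mean of |mean confidence - accuracy| per bin."""
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, labels):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 -> last bin
        bins[idx].append((c, y))
    n = len(labels)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += len(members) / n * abs(mean_conf - accuracy)
    return ece


def brier_score(confidences, labels):
    """Mean squared error between confidence and outcome (lower is better)."""
    return sum((c - y) ** 2 for c, y in zip(confidences, labels)) / len(labels)


# Hypothetical outcomes: 7 correct answers, 3 incorrect (base rate 0.7).
labels = [1] * 7 + [0] * 3
base_rate = [0.7] * len(labels)       # always predict the base rate
oracle = [float(y) for y in labels]   # knows each outcome exactly

for name, conf in (("base rate", base_rate), ("oracle", oracle)):
    print(name,
          round(expected_calibration_error(conf, labels), 6),
          round(brier_score(conf, labels), 6))
```

Both policies have a calibration error of essentially zero, yet the base-rate policy assigns the same confidence to right and wrong answers and scores about 0.21 on the Brier score versus 0 for the oracle. Calibration error alone cannot tell them apart, which is the gap a combined metric is meant to close.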