Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-17

Full $\Gamma-$expansion for the level-two large deviation rate functionals of non-reversible one-dimensional diffusions with periodic boundary conditions

arXiv:2606.17859v1 Announce Type: new Abstract: Consider the diffusion process \begin{equation*} dX_{\epsilon}(t) = \mss b(X_{\epsilon}(t)) \, dt + \sqrt{2\, \epsilon\, \mss a(X_\epsilon(t))} \, dW_{t}, \end{equation*} on the one-dimensional torus $\bb T = [0,1)$. Here $\epsilon$ is the temperature, $W_{t}$ a Brownian motion on $\bb T$ and $\mss a$, $\mss b$ functions of class $C^{2}(\bb T)$ satisfying further conditions. Denote by $\mss P(\bb T)$ the set of probability measures on $\bb T$ equipped with the weak topology, and by $\ms I_{\epsilon}\colon \mss P(\bb T)\to [0,+\infty)$ the level two large deviation rate functional of the diffusion $X_{\epsilon}(\cdot)$. We derive a full $\Gamma-$expansion of $\ms I_{\epsilon}$, as $\epsilon \to 0$, expressing it as \begin{equation*} \ms I_{\epsilon} = \frac{1}{\epsilon} \;\ms J^{(-1)} \; +\; \ms J^{(0)} \;+\; \sum_{p=1}^{\widehat{\mf q}}\frac{1}{\theta^{(p)}_{\epsilon}}\;\ms J^{(p)}\,, \end{equation*} where $\ms J^{(-1)}$, $\ms J^{(0)}$, $\ms J^{(p)} \colon \mss P(\bb T)\to [0,+\infty]$ represent rate functionals, independent of $\epsilon$, and $\theta^{(p)}_{\epsilon}$ are the time-scales at which the Markov process $X_{\epsilon}(\cdot)$ exhibits a metastable behaviour.

02.
medRxiv (Medicine) 2026-06-18

AlphaGenome identifies a deep intronic variant in a family with PLA2G6-associated neurodegeneration: Closing the diagnostic gap in rare genetic diseases

A molecular diagnosis remains out of reach for a substantial subset of patients with clinically recognizable Mendelian disorders, even after comprehensive next-generation sequencing. Causal variants in non-coding regions are difficult to detect and interpret using standard pipelines. Deep intronic variants that disrupt splicing are a known but underexplored source of pathogenic alleles, and systematic tools to evaluate them at scale have only recently emerged. We aimed to resolve an incomplete genetic diagnosis in two siblings with early-onset parkinsonism, prominent neuropsychiatric features, and autonomic dysfunction consistent with PLA2G6-associated neurodegeneration (PLAN), an autosomal recessive condition. Prior clinical exome sequencing, genome sequencing, Multiplex Ligation-dependent Probe Amplification (MLPA), and long-read sequencing had identified only a single heterozygous PLA2G6 missense variant, c.2132C>G (p.Pro711Arg). We used AlphaGenome to score 91 non-coding variants shared among the affected siblings and their father within 1 megabase of the PLA2G6 locus. The deep-learning model identified an intronic variant (c.2034+355G>A) that was predicted to create a cryptic splice acceptor site that could result in inclusion of a 160-bp cryptic exon. Tissue-specific predictions indicated the aberrant splicing would be detectable in blood, confirmed by junction-spanning RNA-seq reads from an unrelated carrier. This analysis completed a compound heterozygous PLAN diagnosis nearly two decades after symptom onset and demonstrates the utility of sequence-to-function models. Systematic integration of tools like AlphaGenome into rare disease workflows offers a practical, low-barrier route to closing the diagnostic gap for patients with compelling Mendelian phenotypes and incomplete genetic diagnoses.

03.
arXiv (quant-ph) 2026-06-16

Bath memory as a precision resource in quantum transport

arXiv:2606.17026v1 Announce Type: new Abstract: Structured baths can reshape transport fluctuations in mesoscopic quantum devices, yet a predictive criterion for when this enhances precision has been lacking. We propose a route towards such precision advantages by utilizing bath memory in coherent fermionic transport through a noninteracting quantum-dot chain. Using the Landauer-Büttiker formalism, we derive a dual impedance-matching condition that synchronizes the conductor mode splitting, boundary dissipation, and bath bandwidth, and sustains constructive multimode interference across the transmission window. The analytical predictions for the optimal bath bandwidths show excellent agreement with exact nonequilibrium Green's function calculations of the transport for Lorentzian, Gaussian, and Newns spectral densities. The prescription yields an optimal bath bandwidth at which the current Fano factor is minimized and the thermodynamic and kinetic precision coefficients are simultaneously enhanced beyond their Markovian limits. The alignment of the optimal precision regime with the experimentally accessible current Fano factor minimum thus provides a practical strategy for designing precision-enhanced transport in mesoscopic platforms such as semiconductor quantum-dot arrays and ultracold fermionic channels.

04.
bioRxiv (Bioinfo) 2026-06-16

THEOBROMA: an aggregated open database of 1.13 million natural products with per-compound license auditing, three-tier classification, and stereochemistry-aware deduplication

Natural products remain one of the most productive sources of pharmacologically active compounds for drug discovery, yet the current open aggregator landscape attributes licenses at database rather than compound granularity, with consequences that have become tangible as the field grows. A recent relicensing event in one constituent source (the September 2024 transition of the Natural Products Atlas to CC BY-NC 4.0) demonstrates how database-level licensing propagates across an aggregate and motivates the per-compound audit framework presented here. The same peer cohort separately leaves classification provenance and stereoisomer-family relations coarser than either layer warrants. THEOBROMA, accessible at url{https://theobroma.l3s.uni-hannover.de}, integrates 1{,}133{,}004 natural products from 29 open sources under a per-compound license audit that resolves each compound's license tier across all attesting sources under a most-restrictive-wins rule, identifying 900{,}170 compounds (79.4%) under open-use licenses and exposing the per-source attestation chain and resolved tier through a dedicated audit endpoint and a query-time license filter. A three-tier classification stratifies 89.3% coverage into 35.1% curated, 43.9% high-confidence inferred, and 10.3% exploratory tiers, with 486{,}215 stereoisomer families preserved by full 27-character InChIKey deduplication and exposed via a dedicated texttt{/api/stereoisomers/} endpoint and a radial-family display. Per-compound license provenance is the primary differentiator. Classification stratification and stereoisomer-family exposure add finer-grained access to two related axes, supporting license-compatible virtual screening and isomer-specific bioactivity analysis at corpus scale. As an evolving open resource, THEOBROMA pairs continuous pipeline maintenance with interactive geographic, taxonomic, and chemical-space exploration.

05.
arXiv (CS.CL) 2026-06-18

Enhancing Decision-Making with Large Language Models through Multi-Agent Fictitious Play

Large language model (LLM)-based multi-agent systems (MAS) have demonstrated great potential in solving tasks with execution complexity, by distributing subtasks across cooperative agents. However, this divide-and-conquer paradigm falls short on decision-making tasks that are also prevalent in the real world. These tasks require simultaneous reasoning from the stances of all involved stakeholders whose decisions are mutually dependent and thus cannot be solved in isolation. We characterize this challenge as stance entanglement, a form of decision complexity distinct from execution complexity. To address it, we propose Multi-Agent Fictitious Play (MAFP), a novel MAS paradigm that represents stakeholder stances as agents and formulates decision-making as an equilibrium-seeking process. Built on the game-theoretic principle of fictitious play, MAFP iteratively updates each agent's decision by best responding to the empirical mixture of other agents' past decisions. This enables agents to expose and address one another's weaknesses, progressively improving decision quality and robustness. We evaluate MAFP on challenging decision-making tasks that test the capability of deciding strategies for competitive scenarios prior to acting. MAFP outperforms both single-round and multi-round baselines on two complementary metrics, tournament strength and robustness, demonstrating its effectiveness in addressing stance entanglement.

06.
arXiv (CS.CL) 2026-06-16

CODA-BENCH: Can Code Agents Handle Data-Intensive Tasks?

Advanced agents are increasingly demonstrating the potential to operate as autonomous engineers, creating a growing demand for evaluation benchmarks that capture the complexity of real-world development. Such environments typically involve both complex code and large-scale data (i.e., file system). However, existing benchmarks usually evaluate code-centric or data-centric capabilities in isolation, leaving a clear gap with real development scenarios. In this paper, we bridge this gap by introducing CODA-BENCH, the first benchmark to jointly evaluate code and data intelligence in a data-intensive environment. We construct a data-intensive Linux sandbox based on the Kaggle ecosystem (containing hundreds of datasets), where agents must actively explore complex file hierarchies to identify relevant resources and generate code for data-driven analytical tasks. CODA-BENCH comprises 1,009 tasks spanning 31 communities, with each task environment containing an average of 980 files, simulating realistic data scale and noise. Evaluations of advanced agents reveal that even top-performing systems struggle to effectively integrate data discovery with code execution, achieving a success rate of only 61.1%. These results highlight a substantial gap in current agentic capabilities for data-intensive tasks and point to promising directions for future research.

07.
arXiv (CS.CV) 2026-06-12

Budget-Constrained Step-Level Diffusion Caching

Step-level caching accelerates diffusion models by exploiting temporal redundancy across denoising steps. Existing methods make per-step cache decisions using threshold-based heuristics, without directly optimizing for final output quality. As a result, their inference latency varies across inputs and is difficult to control at deployment. In this work, we propose BudCache, which inverts this formulation: rather than letting per-step error thresholds dictate the runtime cost, we fix the compute budget in advance and search for the cache policy that best preserves the final output. To tackle the combinatorial complexity of step selection, we combine Simulated Annealing with deterministic Hill Climbing. This offline search identifies high-quality cache policies within minutes and introduces no online search or thresholding overhead during inference. When the compute budget is very tight, we further introduce cache-aware schedule alignment, which adapts the time discretization to the selected cache policy to reduce cache-induced trajectory mismatch. Experiments on FLUX.1-dev and Wan2.1 show that BudCache achieves better generation quality than heuristic caching baselines under the same inference budgets. Code is available at https://github.com/Westlake-AGI-Lab/BudCache

08.
arXiv (CS.CV) 2026-06-16

DreamX-World 1.0: A General-Purpose Interactive World Model

DreamX-World 1.0 is a general-purpose interactive text/image-to-video world model for controllable long-horizon generation. It supports camera navigation, revisits to previously observed regions, and promptable events across photorealistic, game-style, and stylized domains. Our data engine combines camera-accurate Unreal Engine rendering, action-rich gameplay recordings, and real-world videos with recovered camera geometry. For camera control, we introduce E-PRoPE, a lightweight variant of projective positional encoding that retains PRoPE's projective camera geometry while applying camera-aware attention to spatially reduced tokens. We convert a bidirectional video generator into a few-step autoregressive world model using causal forcing, DMD-style distillation, and long-rollout training. Training on self-generated long-horizon contexts exposes the model to its own generated history and reduces the style and color drift that accumulates across autoregressive chunks. Memory-Conditioned Scene Persistence retrieves earlier views through camera-geometry-based retrieval, while residual recycling makes the conditioning path less sensitive to imperfect memory latents. Event Instruction Tuning adds composable event control, and reinforcement learning alignment recovers camera control and visual quality after distillation. With mixed-precision DiT execution, residual reuse, 75\%-pruned VAE decoding, and asynchronous pipeline parallelism, DreamX-World 1.0 reaches up to 16\,FPS on eight RTX\,5090 GPUs. On our 5-second basic evaluation, DreamX-World 1.0 achieves a camera-control score of 73.75 and an overall score of 84.76, outperforming HY-WorldPlay 1.5 and LingBot-World in overall score, which achieve 80.79 and 80.45, respectively.

09.
arXiv (CS.CV) 2026-06-16

Temporally Consistent and Controllable Video Generation of 2D Cine CMR via Latent Space Motion Modeling

Cine cardiac magnetic resonance is the gold standard for assessing cardiac function, but the scarcity of public datasets limits the development of advanced data-driven models. To address this limitation, we propose a generative method for synthesizing temporally coherent and anatomically consistent cardiac sequences. Our text-to-video framework decouples cardiac spatial structure from temporal motion. First, a fine-tuned diffusion model synthesizes an initial frame from a clinical text prompt, controlling anatomical features. Then, a latent flow model conditioned on a cardiac phase embedding generates the complete cardiac motion, ensuring spatial consistency and temporal control. Our model generates anatomically and pathologically diverse sequences with high temporal coherence and strong fidelity to input prompts, achieving a FID of 31.68 for image realism and a CLIP score of 31.04 for text-image alignment. These experimental results highlight its potential to produce high-fidelity, on-demand medical data, offering a scalable solution to data scarcity.

10.
medRxiv (Medicine) 2026-06-16

MRMU: A New Paradigm for Mendelian Randomization by Accounting for Measured Covariates and Unmeasured Confounders

Mendelian randomization (MR) is a powerful approach for causal inference, however, its reliability is frequently compromised by unadjusted covariates and unmeasured confounders, such as unmeasured pleiotropy and sample structure. To address these challenges, we introduce MRMU, a novel paradigm for the MR framework. Unlike traditional single-variable or multivariable MR methods, MRMU selects instrumental variables only from the exposure of interest and estimates one exposure effect at a time, while jointly accounting for measured covariates and unmeasured confounders. This design improves the reliability of MR analyses. In simulations and real data, MRMU achieved better type I error control, higher statistical power, and more accurate effect estimation than existing MR methods. Applying to coronary artery disease (CAD), MRMU identified robust cardiometabolic risk factors, including LDL-C, APOB, systolic blood pressure, body mass index, and smoking initiation, with consistent evidence across multiple CAD datasets. In contrast, traits such as HDL-C, height, and educational attainment, which were found to be significant by existing MR methods, were no longer supported by MRMU. MRMU further supported blood pressure-related traits, rather than lipid traits, as the more relevant pathway linking urate to CAD. Finally, by integrating large-scale plasma proteomics data, MRMU identified candidate CAD drug targets beyond established HMGCR- and PCSK9-related pathways, highlighting its utility for therapeutic target prioritization.

11.
arXiv (CS.LG) 2026-06-16

A nonparametric two-sample test using a parametric integral probability metric

arXiv:2606.16941v1 Announce Type: cross Abstract: Detecting distributional differences between two independent samples is a fundamental problem in statistics and machine learning. Nonparametric two-sample testing provides a principled framework for determining whether two samples are drawn from the same underlying distribution, without assuming any specific parametric form for the distribution. In this study, we propose a new two-sample test statistic based on a newly introduced integral probability metric (IPM), using a specially designed parametric discriminator class with a single node of a neural network. We show that the resulting test statistic, called PReLU-IPM, is nonparametric and establish theoretical guarantees for the associated two-sample testing procedure, PReLU-TST, including its consistency and asymptotical equivalence to nonparametric IPM-based tests under regularity conditions. By analyzing multiple simulated and real benchmark datasets, we demonstrate that PReLU-TST achieves higher power across a range of alternatives or performs comparably to its competitors, for finite samples.

12.
arXiv (CS.CV) 2026-06-16

LOCUS: Local Visual Cue Search for Enhancing Fine-Grained Perception in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) remain unreliable on fine-grained visual perception, even when high-resolution inputs preserve the necessary local details. We identify this limitation as visual context rot: decisive evidence may exist in the full image, yet fail to be reliably selected and used amid redundant visual context. We propose LOCUS (LOcal visual CUe Search), a training framework that teaches MLLMs to internalize local evidence search through a verifiable proxy task. During training, LOCUS provides a local crop as a visual cue and optimizes the model to recover its spatial support in the full image using an IoU-based reward. The visual cue is used only during training, leaving the standard image-question inference interface unchanged. Experiments across fine-grained perception, hallucination, general understanding, and reasoning benchmarks show that LOCUS improves localization-sensitive visual understanding while preserving broad capabilities. Attention analyses further indicate stronger focus on task-relevant evidence regions, suggesting that training-time visual cue search provides an effective route to internalized fine-grained evidence selection.

13.
arXiv (math.PR) 2026-06-11

Continuous stochastic flows driven by white noise and their duals

Authors:

arXiv:2606.12143v1 Announce Type: new Abstract: We study a class of continuous stochastic flows driven by a space-time white noise and characterize their dual flows by explicit stochastic differential equations. A key ingredient of the proof is the convergence of solutions under coefficient approximations. As an application, we derive the dual flows in two illustrative examples, the squared Bessel flow and the Jacobi flow. We also introduce a new model of polynomially self-repelling (PSR) flow and show that it enjoys a self-duality property.

14.
arXiv (CS.CV) 2026-06-12

HairPort: In-context 3D-aware Hair Import and Transfer for Images

Transferring hairstyles between images is an important but challenging task in computer graphics, computer vision, and visual effects. It enables users to explore new looks without physically altering their hair, with applications in virtual try-on systems, augmented reality, and entertainment. Most prior works operate best under small pose gaps, and they fall short under large viewpoint and scale differences, where missing hair content must be synthesized rather than transferred. We propose HairPort, a 3D-aware hairstyle transfer framework that attempts to solve these issues by explicitly separating hair removal from transfer and enforcing geometric consistency before synthesis. We introduce a Bald Converter, which produces realistic bald versions of faces through LoRA-based in-context adaptation of FLUX.1 Kontext. To train our Bald Converter, we introduce a new dataset, Baldy, containing 6,000 paired bald and original images across diverse identities and conditions. We also use a 3D-Aware Transfer Pipeline that reconstructs and re-renders the reference hairstyle from the target viewpoint before compositing it onto the source image. Being 3D aware, our method supports large pose and scale discrepancies between the source and target. Finally, a conditional flow-matching generator synthesizes the transferred result from the bald source and geometry-aligned reference guidance. Together, our method enables accurate, pose-consistent, and identity-preserving hairstyle transfer, outperforming existing methods both qualitatively and quantitatively.

15.
arXiv (CS.AI) 2026-06-16

MADAR: An Address-Free Processor

arXiv:2606.15535v1 Announce Type: cross Abstract: In a modern processor, computing is the cheap part. Most of its area and energy go to addressing – moving operands to and from a register file and cache, and running the tags, ports, miss queues, and bypass networks that find a value where it was left. MADAR deletes that machinery by abolishing the address. All state circulates in rings of slots that advance one position per clock; instructions and data ride in the same slots; a value is named by its place in an orbit – a \rp{} coordinate – not by an address; a fixed station computes when a circulating instruction sweeps past its operands, on a schedule set at compile time; and a hierarchy of rings of increasing period replaces the cache hierarchy, movement between them scheduled rather than triggered by a miss. No prior circulating-store, dataflow, or statically scheduled machine combines all four of these. We define the execution model, validate it in a cycle-accurate register-transfer-level implementation, show it compilable – a constructive scheduler emits programs cross-checked against the implementation – and price it with a first-order energy model. The payoff is clearest for AI acceleration: the multiply-accumulate at the heart of every matmul and convolution compiles to a streaming form whose energy per operation stays flat as the reduction grows, and the operand reuse that makes matrix multiplication efficient is carried by the ring-period hierarchy – the memory hierarchy doing by rotation what a cache does by tags. MADAR is a new design point for any computation whose data movement is known before the program runs.

16.
arXiv (CS.LG) 2026-06-11

Family-Aware Residual Architecture for Predicting Quantum Circuit Simulation Performance

arXiv:2606.11620v1 Announce Type: cross Abstract: Approximate tensor-network simulators enable classical simulation of quantum circuits beyond the reach of exact methods, but selecting optimal approximation parameters – such as bond dimension thresholds – remains a costly trial-and-error process. We present a family-aware neural architecture that predicts both the minimum approximation threshold required to achieve target fidelity and the expected wall-clock runtime for quantum circuit simulation, given only the circuit's OpenQASM description and execution context. Our key insight is that quantum circuits from different algorithmic families (e.g., QFT, Grover, VQE) exhibit fundamentally distinct simulation cost profiles due to their differing entanglement structures. We employ family-conditioned residual corrections – additive, family-specific adjustments atop a shared backbone, drawing on established conditional computation techniques – enabling the model to capture both universal circuit properties and algorithmic nuances. The architecture incorporates a pretrained family classifier (97.5% accuracy) and domain-informed algorithm fingerprint features derived from gate-composition heuristics. Evaluated on circuits spanning 7–130 qubits across 10 algorithm families, our system achieves 79.5% exact threshold accuracy (91.2% within one rung) and $R^2 = 0.82$ runtime correlation, with inference completing in approximately 50 ms – replacing trial-and-error simulation runs that may take minutes to hours. Ablation studies confirm that family-aware modeling provides the single largest performance improvement (+3.2 percentage points), validating the hypothesis that algorithm family is a first-class feature for simulation cost prediction.

17.
arXiv (CS.LG) 2026-06-16

Early Anomaly-Onset Detection based on Wigner–Ville Distribution Slice Spectra: A Transmission-Grid Test Case

arXiv:2606.15856v1 Announce Type: cross Abstract: Operational disturbance monitoring in power networks requires decisions to be made from waveform windows as they arrive, rather than from completed records after the event. This study evaluates full-vector Wigner–Ville Distribution Slice (WVDS) spectra for sequential anomaly-onset detection in high-voltage grid-voltage waveforms. The approach keeps the bilinear midpoint interaction structure of the Wigner–Ville distribution and represents each 128-sample voltage window by a 128-dimensional slice spectrum, avoiding manually selected fault-frequency markers. WVDS is used with a baseline-normalized deviation (BND) score and is compared against the BND of Fast Fourier Transform (FFT-BND), raw-window autoencoders, FFT autoencoders, and WVDS autoencoders under the same thresholding and three-window persistence rule. A synthetic autoencoder–clustering teacher is used to select RTE fault records that start from an initially normal region and then transition to anomalous behavior. On the filtered test set, FFT-BND achieves the highest sensitivity, whereas WVDS-BND provides the lowest false-alarm operating point, reducing record-level pre-onset false alarms to 0.69%. The autoencoder comparison follows the same selectivity pattern: WVDS reconstruction decreases false alarms relative to FFT reconstruction but misses more examples. The results indicate that preserved WVD cross-term information can form a selective representation for online grid-waveform anomaly monitoring when false alarms are costly.

18.
arXiv (CS.CV) 2026-06-11

SG2Loc: Sequential Visual Localization on 3D Scene Graphs

Visual localization in complex indoor environments remains a critical challenge for robotics and AR applications. Sequential localization, where pose estimates are refined over time, is important for autonomous agents. However, traditional methods often require storing extensive image databases or point clouds, leading to significant overhead. This paper introduces a novel, lightweight approach to sequential visual localization using 3D scene graphs. Our method represents the environment with a compact scene graph, where nodes represent objects (with coarse meshes) and edges encode spatial relationships. For each image in the localization phase, we extract per-patch semantic features, predicting object identities. Localization is performed within a particle filter framework. Each particle, representing a camera pose, projects the coarse object meshes from the scene graph into the image, assigning object identities to patches based on visibility. The similarity of the per-patch features, in the input image, and object features from the scene graph determines the weight of a particle. Subsequent images are incorporated sequentially, refining the pose estimate. By leveraging a compact scene graph and efficient semantic matching, our method significantly reduces storage while maintaining performance on real-world datasets. The code will be available at https://github.com/DmblnNicole/sg2loc.

19.
arXiv (quant-ph) 2026-06-15

Quantum geometrical description of hole spin qubits far away from the $\Gamma$-point

arXiv:2606.14683v1 Announce Type: cross Abstract: Hole spin qubits provide one of the leading platforms for spin-based quantum computing due to their large intrinsic spin-orbit interaction (SOI), which enables fast electrical manipulation. The SOI of planar quantum dots has mostly been investigated in theoretical studies by examining the SOI already present in the two-dimensional hole gas (2DHG). Here, we study the SOI created by the in-plane confinement by deriving non-perturbative effective Hamiltonians numerically for hole spin qubits. We find that the quantum geometry of the 2DHG naturally emerges, leading to a meaningful non-perturbative definition of pseudospin valid far away from the $\Gamma$-point. The SOI of the 2DHG and of the in-plane confinement have different forms; therefore, they cannot be turned off simultaneously, ruining the perfect spin-orbit switch functionality of spin qubits. We construct effective Hamiltonians using the symmetry approach for various low-dimensional hole systems: (i) a heavy-hole confined in a SiGe/Ge/SiGe heterostructure, (ii) a light-hole confined in SnGe/Ge, (iii) a gate-defined nanowire in SiGe/Ge/SiGe, and (iv) a hole confined in a Ge/Si core/shell nanowire. The non-perturbative effective Hamiltonians provide results with excellent agreement with the full Hamiltonians.

20.
arXiv (CS.AI) 2026-06-12

Fin-RATE: A Real-world Financial Analytics and Tracking Evaluation Benchmark for LLMs on SEC Filings

arXiv:2602.07294v4 Announce Type: replace-cross Abstract: With the increasing deployment of Large Language Models (LLMs) in the finance domain, LLMs are increasingly expected to parse complex regulatory disclosures. However, existing benchmarks often focus on isolated details, failing to reflect the complexity of professional analysis that requires synthesizing information across multiple documents, reporting periods, and corporate entities. Furthermore, these benchmarks do not disentangle whether errors arise from retrieval failures, generation inaccuracies, domain-specific reasoning mistakes, or misinterpretation of the query or context, making it difficult to precisely diagnose performance bottlenecks. To bridge these gaps, we introduce Fin-RATE, a benchmark built on U.S. Securities and Exchange Commission (SEC) filings and mirroring financial analyst workflows through three pathways: detail-oriented reasoning within individual disclosures, cross-entity comparison under shared topics, and longitudinal tracking of the same firm across reporting periods. We benchmark 17 leading LLMs, spanning open-source, closed-source, and finance-specialized models, under both ground-truth context and retrieval-augmented settings. Results show substantial performance degradation, with accuracy dropping by 18.60% and 14.35% as tasks shift from single-document reasoning to longitudinal and cross-entity analysis. This degradation is associated with increased comparison hallucinations, temporal and entity mismatches, and is further reflected in declines in reasoning quality and factual consistency–limitations that existing benchmarks have yet to formally categorize or quantify.

21.
arXiv (quant-ph) 2026-06-17

Engineering entanglement and transport in interacting quantum walks with tailored potentials

arXiv:2606.17825v1 Announce Type: new Abstract: Controlling the interplay between particle propagation and quantum correlation generation is a central challenge in quantum transport. Here, we investigate two distinguishable continuous-time quantum walkers evolving on parallel one-dimensional lattices, interacting via distance-dependent potentials. While on-site interactions reproduce the typical bosonic behaviour, extending the interaction to a linear potential over multiple neighbors introduces controlled Bloch-like oscillations and shifts the bound-pair regime to stronger couplings. More generally, we explore a Coulomb-like interaction parameterized by strength, spatial scaling, and decay rate. This reveals a rich phase diagram including four distinct dynamical regimes: (i) a high-entropy, oscillatory regime akin to a linear potential; (ii) a strongly localized, bound-pair regime; (iii) a novel intermediate regime combining near-ballistic spreading with strong correlations; and (iv) a weakly interacting, free-propagation regime. Notably, regime (iii) achieves concurrent optimization of transport efficiency and entanglement, offering a sweet spot for correlated quantum dynamics. Our results provide a tool for designing interaction-engineered quantum walks with potential applications in quantum information processing and simulations.

22.
Nature Medicine 2026-06-22

Biological aging and generational shifts in early-onset cancer risk

Authors:

Incidence of early-onset cancer is rising globally in recent generations, which underscores the need to elucidate the influence of emerging generational risk factors. Systemic and organ-specific aging reflects the cumulative impact of exposures and may provide an integrative and complementary approach to understand early-onset cancer risk. Here among 154,169 young adults from the United Kingdom Biobank, systemic aging measured by PhenoAge increased across birth cohorts, with 23% s.d. increase for those born 1965–1974 versus 1950–1954, and was associated with early-onset solid cancer risk (hazard ratio (HR)per s.d. 1.08; 95% confidence interval (CI), 1.03–1.13), driven by lung, gastrointestinal and uterine cancers, independent of genetic risks of aging and cancer. Patterns were consistent using alternative systemic aging measures, including the Klemera–Doubal method-defined age gap and metabolomic-based age gap. These findings were validated partially among 10,262 participants in the United States All of Us Research Program. Proteomics-based organ-specific aging analyses linked immune aging with early-onset lung cancer (HRper s.d. 1.89; CI, 1.20–2.97) and adipose tissue aging to early-onset colorectal cancer (HR 1.60; CI, 1.11–2.32). Greater age gap, reflecting more advanced biological aging relative to chronological age, may serve as a driver associated with risk of early-onset solid cancers, highlighting the importance of uncovering underlying mechanisms to guide effective prevention strategies. Analyses of population cohorts found that young adults exhibited earlier systemic and organ-specific aging, which was associated with increased risk of early-onset cancer compared with older adults born decades earlier.

23.
arXiv (CS.AI) 2026-06-15

LLM-Powered AI Agent Systems and Their Applications in Industry

arXiv:2505.16120v3 Announce Type: replace Abstract: The emergence of Large Language Models (LLMs) has reshaped agent systems. Unlike traditional rule-based agents with limited task scope, LLM-powered agents offer greater flexibility, cross-domain reasoning, and natural language interaction. Moreover, with the integration of multi-modal LLMs, current agent systems are highly capable of processing diverse data modalities, including text, images, audio, and structured tabular data, enabling richer and more adaptive real-world behavior. This paper comprehensively examines the evolution of agent systems from the pre-LLM era to current LLM-powered architectures. We categorize agent systems into software-based, physical, and adaptive hybrid systems, highlighting applications across customer service, software development, manufacturing automation, personalized education, financial trading, and healthcare. We further discuss the primary challenges posed by LLM-powered agents, including high inference latency, output uncertainty, lack of evaluation metrics, and security vulnerabilities, and propose potential solutions to mitigate these concerns.

24.
arXiv (CS.AI) 2026-06-19

Emyx: Fast and efficient all-atom protein generation

arXiv:2606.19377v1 Announce Type: cross Abstract: Computational enzyme design requires generating proteins that scaffold catalytic residues and ligands, a task that demands both geometric accuracy and structural diversity from the underlying generative model. Current all-atom generators inherit expensive architectures from structure prediction, leading to high training costs and limited sample diversity. We argue that much of this complexity is unnecessary for generators, which condition on sparse geometric constraints rather than rich co-evolutionary signals. Emyx is a 140M-parameter conditional flow matching model that concentrates capacity within standard transformer blocks, replacing heavy embedding stacks with lightweight conditional representations and sparse connectivity. We additionally derive an exact reparametrisation of the flow matching interpolant into the EDM noise-level framework, bridging flow matching training efficiency with state-of-the-art sampling methods designed for diffusion models without retraining. Despite being the smallest model, Emyx outperforms both Proteína-Complexa and RFdiffusion3 against the AME enzyme design benchmark across success rate under strict evaluation requiring both global fold recovery and catalytic geometry accuracy, structural novelty, scaffold diversity, and geometric validity, while training in just $682$ GPU-hours, roughly $4\times$ less than RFdiffusion3.

25.
arXiv (CS.AI) 2026-06-17

Trust-Aware Multi-Agent Traceability: Confidence-Calibrated Knowledge Graphs for Consistent Software Artifact Management

arXiv:2606.17203v1 Announce Type: cross Abstract: Multi-agent AI systems are increasingly used to automate software engineering tasks including requirements analysis, architecture design, test generation, and traceability linking. When these agents operate as a sequential pipeline over shared software artifacts, errors and low-confidence decisions made by upstream agents propagate to downstream stages, producing orphaned requirements, contradictory links, and compliance gaps that pose significant risks in safety-critical domains. We propose a trust-aware coordination framework where a shared knowledge graph serves as both centralized semantic memory and a coordination surface through which agents assess and build upon each other's contributions using calibrated confidence scores. Our approach introduces a two-stage traceability link prediction pipeline combining embedding-based retrieval with LLM-based multi-criteria analysis, a traceability seeding mechanism that enables comparison between derivation-time and validation-time confidence, and a consistency protocol governing pipeline interactions through confidence threshold gating, confidence divergence detection, and conflict resolution. We evaluate on an automotive software engineering case study measuring link prediction calibration, protocol effectiveness, threshold sensitivity, and the impact of traceability seeding. Ablation studies confirm that confidence calibration is essential for effective pipeline coordination.