Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
Nature Medicine 2026-06-10

Brain Health for Economic Resilience: a data-driven framework for the brain-positive economic transition

Announced in this Comment and in collaboration with Nature Medicine is the convening of the Brain Health for Economic Resilience Commission, a global, transdisciplinary effort to define, measure and operationalize brain health and cognitive capacity as foundational drivers of economic resilience.

02.
arXiv (CS.AI) 2026-06-17

Learning Red Agent Policy from Observations for Neurosymbolic Autonomous Cyber Agents

arXiv:2606.18223v1 Announce Type: cross Abstract: With sophisticated cyber-attacks becoming increasingly prevalent, modern networks require intelligent autonomous cyber-defense agents trained via Reinforcement Learning (RL). These agents employ neurosymbolic approaches such as behavior trees with learning-enabled components (LECs) to learn, reason, adapt, and implement security rules while maintaining critical operations. However, these autonomous networks are partially observable systems, i.e., the cyber-attacker's (red agent's) actions are not observable, making it difficult for the defender to predict red actions, learn red policies, or assess the attacker's intrusion levels. To address this, we propose a Policy Learning Technique using imitation learning to learn policies for partially observable RL agents with discrete states and discrete actions. We apply this technique in an autonomous cyber environment to predict red agent's actions from network observations and defender actions. Integrated with a neurosymbolic cyber-defense agent, our method effectively handles different red policies and achieves high prediction accuracy across diverse simulated scenarios.

03.
arXiv (CS.CL) 2026-06-17

PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

As Large Language Model based agents enter autonomous scientific research, their ability to resist pseudoscience becomes increasingly important. Otherwise, such systems may rapidly generate plausible yet misleading studies that contaminate academic literature and erode trust in science. We present PseudoBench, an adversarial benchmark for evaluating whether agentic auto-research systems can identify and resist pseudoscientific narratives. PseudoBench contains 200 curated pseudoscientific claim-evidence pairs across five domains and evaluates agents through an end-to-end research pipeline from experiments to writing. Testing seven state-of-the-art agents, we find that current systems readily produce persuasive reports that align with pseudoscientific premises with near-zero refusal rates and the highest resistance of only 27.4%. Stronger agents risk packaging pseudoscience in more sophisticated scientific language, increasing its apparent credibility. These findings reveal an alarming capacity to fuel pseudoscience, calling for scientific alignment before widespread deployment.

04.
arXiv (CS.CL) 2026-06-12

Trait, Not State: The Durability of Reading Identity in Social Highlighting

Prior work on a social web highlighter located individuality in selection – which documents a person chooses to highlight – but measured it cross-sectionally. We ask the temporal question: is a reader's selection signature a trait or a state? We freeze each reader's first six months of highlighting as a profile and track its own-vs-other advantage on their later selections at growing gaps (to 24+ months), with negatives drawn from the same calendar era – so supply drift cannot masquerade as personal drift – at a coarse global level and at a fine level whose negatives and controls come from the reader's own interest neighborhood; the anchor cell reproduces the prior cross-sectional level (+0.188 vs +0.169), validating the harness. Four results. Within the same users, the fine-layer advantage shows no statistically detectable paired decline at any horizon (6-12 month retention R = 1.00 [0.85, 1.18], n = 212; the farthest bin is compatible with a modest decline; the only contrast whose interval excludes zero is the coarse layer at 12-24 months, about 13%). The signal is not reducible to repeated domains (~90% survives excluding all profile sources). Within-person drift is slow (a recent-half profile beats the old half by +0.042). Prospectively, personal profiles – even one built from a reader's earliest documents, median 20 months before evaluation – rank their next reads at roughly 3x the AP of every simple non-personal prior tested. We use "trait" operationally (a stable signature under continued engagement); the scope is heavy, long-tenured readers of one platform, and exposure is not separable from choice.

05.
arXiv (CS.CV) 2026-06-18

Spatially Stratified Distillation for Heterogeneous Radar Place Recognition

Scalable, all-weather place recognition increasingly relies on heterogeneous radar place recognition to bridge diverse hardware platforms. A notable application is matching queries from cost-effective 4D automotive radars against high-fidelity reference maps built by dense spinning radars. This process is fundamentally limited by the extreme sparsity (and narrow field-of-view) of the 4D sensor, which captures only a fraction of the structural density present in the spinning radar database. Prior efforts address this issue by unifying different radar signals. That is, projecting both signals into a common representational space. Yet, they suffer performance degradation in multi-session environments. In this paper, we propose spatially-stratified distillation (SSD); a strategy that replaces standard uniform distillation with an asymmetric spatial alignment derived directly from physical radar returns. In regions where both radars exhibit overlapping returns, SSD enforces strong feature alignment. Crucially, in sparse regions where the 4D student lacks returns but the teacher contains valid structure within the shared field of view, SSD applies heavily discounted distillation weights. Extensive evaluations of the recent HeRCULES dataset demonstrate that SSD significantly outperforms prior place recognition methods, achieving state-of-the-art results on its challenging dynamic sequences.

06.
arXiv (CS.LG) 2026-06-18

Toward Simultaneously Optimal Regret in U-Calibration

arXiv:2606.18527v1 Announce Type: cross Abstract: U-calibration studies online forecasting algorithms whose predictions can be consumed by any unknown downstream agent, guaranteeing sublinear regret simultaneously for all proper loss functions. Existing U-calibration algorithms achieve worst-case optimal $O(\sqrt{T})$ regret for every bounded proper loss, but they fail to adapt to easier losses: as we show, even for smooth losses such as squared loss, they incur $\Omega(\sqrt{T})$ regret instead of the optimal $O(\log T)$ regret. In this work, we show that this limitation is not inherent. Specifically, we design a single forecast algorithm that simultaneously achieves $\tilde O(\sqrt{T})$ regret for every bounded proper loss and $O(\log T)$ regret for every bounded smooth proper loss. More generally, our algorithm also attains logarithmic regret for losses that are smooth relative to the log-barrier, which include several non-Lipschitz examples. Our approach is based on a novel variant of Follow-the-Perturbed-Leader (FTPL) in which perturbations are applied directly in the prediction space using self-concordant noise. The resulting analysis also departs substantially from prior FTPL analyses due to the complex nature of this noise and may be of independent interest.

07.
arXiv (CS.LG) 2026-06-16

Smoothness Errors in Dynamics Models and How to Avoid Them

arXiv:2602.05352v3 Announce Type: replace Abstract: Modern neural networks have shown promise for solving partial differential equations over surfaces, often by discretizing the surface as a mesh and learning with a mesh-aware graph neural network. However, graph neural networks suffer from oversmoothing, where a node's features become increasingly similar to those of its neighbors. Unitary graph convolutions, which are mathematically constrained to preserve smoothness, have been proposed to address this issue. Despite this, in many physical systems, such as diffusion processes, smoothness naturally increases and unitarity may be overconstraining. In this paper, we systematically study the smoothing effects of different GNNs for dynamics modeling and prove that unitary convolutions hurt performance for such tasks. We propose relaxed unitary convolutions that balance smoothness preservation with the natural smoothing required for physical systems. We also generalize unitary and relaxed unitary convolutions from graphs to meshes. In experiments on PDEs such as the heat and wave equations over complex meshes and on weather forecasting, we find that our method outperforms several strong baselines, including mesh-aware transformers and equivariant neural networks.

08.
Science (Express) 2026-05-21

Nodeless superconducting gap and electron-boson coupling in (La,Pr,Sm)3Ni2O7 films | Science

Authors: Unknown Author

The discovery of superconductivity in Ruddlesden-Popper (RP) bilayer nickelate films under ambient pressure provides an opportunity to directly investigate electronic energy scales of the superconducting state and the pairing mechanism. We report angle-resolved photoemission spectroscopy measurements of superconducting (La,Pr,Sm) 3 Ni 2 O 7 thin films by developing an ultra-high vacuum cryogenic sample quenching and transfer technique. A superconducting gap of ~18 meV with coherence peaks is observed along the Brillouin zone diagonal. The finite gap persists across the entire Brillouin zone, revealing the absence of gap nodes. A kink is observed in the energy-momentum dispersion at ~70 meV below Fermi level, indicating an electron-boson coupling. The simultaneous observation of a nodeless superconducting gap and electron-boson coupling provides insight into the pairing symmetry and gluing mechanism in RP bilayer nickelates.

09.
arXiv (CS.AI) 2026-06-19

Efficient and Sound Probabilistic Verification for AI Agents

arXiv:2606.20510v1 Announce Type: cross Abstract: Securing AI agents that operate in complex digital environments has become a critical need, and runtime monitoring approaches that formulate and enforce policies expressed in a formal language like Datalog offer a promising solution. However, existing approaches are restricted to deterministic policies. In many practical applications of AI agents, there is a need to enforce security policies in the face of ambiguity, leading to probabilistic predicates or state transitions (for example, a declassifier or Personally Identifiable Information (PII) detector that has some failure probability on each invocation). Furthermore, in many such applications, one cannot easily make the independence assumptions necessary to invoke prior work on probabilistic inference in Datalog. We address this by introducing a sound and efficient framework for such verification based on distributionally robust optimization, computing sound upper bounds on the probability of policy violation regardless of possible correlations between predicates. On standard benchmarks for terminal and tool calling agents, we demonstrate that our approach outperforms prior art and improves the security-utility trade-off while ensuring rigorous bounds on the probability of policy violation.

10.
arXiv (CS.CL) 2026-06-12

Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

Latent chain-of-thought compresses reasoning by replacing visible reasoning traces with continuous hidden-state recurrence, but existing formulations are difficult to optimize with standard on-policy reinforcement learning (RL) and hard to interpret causally. Our key insight is that a single pair of explicit boundary tokens can address both issues at once: discrete entry and exit anchors make the latent block compatible with standard on-policy RL, and the same anchors offer a natural foothold for mechanistic analysis. Motivated by this, we propose SWITCH, a switchable latent reasoning framework. The model emits to enter latent mode and to exit. Because the boundaries are ordinary discrete tokens, the GRPO policy ratio is well-defined at every decision point. The same anchors also expose the latent steps to direct probing and causal intervention. We train the model with a visible-to-latent curriculum and a Switch-GRPO objective that propagates gradients through recurrent latent computation. SWITCH consistently outperforms prior hidden-state-recurrence latent reasoning approaches at similar scale. Mechanistic analysis through the boundary tokens further reveals three findings: (i) is a sharply localised, learned switching policy rather than a stylistic artefact; (ii) the latent step it opens performs problem-specific, causally important computation rather than acting as an inert placeholder; and (iii) that computation is concentrated at a single hidden-state transition on entry. Together, these results show that hidden-state-recurrence latent reasoning is both RL-trainable and open to direct mechanistic analysis, including of how on-policy RL itself improves the model from the inside.

11.
arXiv (CS.LG) 2026-06-19

Understanding Key Features of Time Series Foundation Models from Epidemic Forecasting

arXiv:2606.19560v1 Announce Type: new Abstract: Seasonal influenza infects millions of people and causes substantial morbidity and mortality in the United States each year, making accurate short-term forecasting a core public-health need. Reliable forecasts of epidemic time series can inform vaccination timing, hospital staffing, and resource allocation, yet the comparative behavior of modern forecasting architectures on infectious-disease surveillance data remains insufficiently characterized. We address this gap through a systematic evaluation of regional influenza forecasting using influenza-like illness surveillance and influenza-associated hospitalization time series under both temporal and spatial generalization settings for 1-4-week-ahead prediction. We compare classical neural network architectures, numerical transformer-based models, pretrained time series foundation models, and LLM-based forecasting approaches. Across tasks, we demonstrate that a mixture-of-experts model that fuses multiple pretrained forecasters achieves the strongest overall performance, indicating that heterogeneous pretrained representations provide complementary predictive information. Our results further show that numerical transformer-based models produce reliable forecasts, while pretraining provides the largest gains at longer horizons, particularly when the pretraining domain is mechanistically aligned with influenza dynamics. In contrast, LLM-based time series methods underperform relative to numerical forecasters in this setting. Finally, we examine hospitalization information as both an auxiliary covariate and a pretraining source. Hospitalization signals provide complementary improvements in selected settings and clarify when additional surveillance streams enhance the robustness of multi-horizon forecasting. These findings provide actionable guidance on model selection, pretraining strategy, and auxiliary-signal use for influenza preparedness.

12.
arXiv (CS.LG) 2026-06-15

IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings

arXiv:2603.05556v2 Announce Type: replace Abstract: Integer sequences in the OEIS span values from single-digit constants to astronomical factorials and exponentials, making prediction challenging for standard tokenised models that cannot handle out-of-vocabulary values or exploit periodic arithmetic structure. We present IntSeqBERT, a dual-stream Transformer encoder for masked integer-sequence modelling on OEIS. Each sequence element is encoded along two complementary axes: a continuous log-scale magnitude embedding and sin/cos modulo embeddings for 100 residues (moduli $2$–$101$), fused via FiLM. Three prediction heads (magnitude regression, sign classification, and modulo prediction for 100 moduli) are trained jointly on 274,705 OEIS sequences. At the Large scale (91.5M parameters), IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy (MMA) on the test set, outperforming a standard tokenised Transformer baseline by $+8.9$ pt and $+4.5$ pt, respectively. An ablation removing the modulo stream confirms it accounts for $+15.2$ pt of the MMA gain and contributes an additional $+6.2$ pt to magnitude accuracy. A probabilistic Chinese Remainder Theorem (CRT)-based Solver converts the model's predictions into concrete integers, yielding a 7.4-fold improvement in next-term prediction over the tokenised-Transformer baseline (Top-1: 19.09% vs. 2.59%). Modulo spectrum analysis reveals a strong negative correlation between Normalised Information Gain (NIG) and Euler's totient ratio $\varphi(m)/m$ ($r = -0.851$, $p < 10^{-28}$), providing empirical evidence that composite moduli capture OEIS arithmetic structure more efficiently via CRT aggregation.

13.
arXiv (CS.AI) 2026-06-19

Measuring Biological Capabilities and Risks of AI Agents

arXiv:2606.19899v1 Announce Type: cross Abstract: This paper addresses a rapidly emerging policy challenge: how to generate and interpret credible evidence about the biological capabilities and risks of AI scientists, or agentic AI systems capable of autonomously or collaboratively performing multi-step scientific tasks. As these systems enter real research workflows, decision-makers increasingly face evaluation results whose meaning depends on underlying design choices that are often implicit or under-documented. We synthesize current evidence on AI-enabled biological risks and introduce biological agentic evaluations as a promising, but interpretation-sensitive, tool for assessing these systems. Our central contribution is a set of practical, experience-grounded considerations – drawing from our own evaluations – that show how choices around defining, designing, running, scoring, and documenting evaluations materially shape what results do and do not imply about risk. The analysis is intended to help policymakers interpret biological evaluation outputs with appropriate caution; guide public and private funders toward high-leverage investments in AI-biology evaluation research; and support biosecurity practitioners assessing emerging AI systems. A secondary audience includes researchers designing or conducting agentic evaluations within frontier AI labs, AI providers, scientific institutions, and third-party evaluation organizations.

14.
arXiv (CS.AI) 2026-06-16

Mind-Studio: Executable World Models with Lookahead Evaluation for Partially Observable Games

arXiv:2606.16070v1 Announce Type: new Abstract: World-model synthesis aims to turn interaction experience into an internal model of environment dynamics. Existing symbolic approaches often fit observed transitions or mixtures of local rules, but they do not produce a complete executable program that can run independently of the real environment. We present Mind-Studio, a framework that synthesizes executable pygame-style world models from state-action-next-state trajectories using large language models. Mind-Studio combines entropy-selected traces with a lightweight game skill file containing object, action, and static scene information extracted from screenshots. We evaluate synthesis quality with a K-step lookahead fidelity protocol that compares generated world-model rollouts against Real-ALE rollouts from the same state. On Montezuma's Revenge, Mind-Studio improves chosen-action next-state prediction from 0.3% for PoE-World to 48.7% while verifying 5 of 8 subgoals; across Alien, Assault, and Skiing, it achieves stronger branch-level fidelity than prior learned lookahead sources.

15.
PLOS Computational Biology 2026-06-16

Evolution and the ultimatum game: An agent-based model with interbirth intervals and population structure

by Jeffrey C. Schank, Matt L. Miller The ultimatum game (UG) is widely used to study mutually beneficial exchanges, fairness, and prosocial behavior across different societies. However, human behavior in UG experiments does not align with the game-theoretical prediction that proposers should offer the least positive amount and responders should accept such offers. Instead, proposers make generous offers that are greater than the minimum responders are willing to accept, resulting in generous offers with wide offer-acceptance gaps. Numerous evolutionary models of the UG have been created and studied to explain human behavior, particularly generous offers made in UG experiments. These models have recently faced criticism for lacking biological realism and not adequately explaining the data. Here, we present an agent-based model inspired by our hunter-gatherer ancestors and with a biologically more realistic selection process. We assume that (1) agents exist in group-structured and group-clustered populations, where reproduction (2) depends on resource accumulation, but (3) is limited by interbirth intervals. We ran simulations to assess whether this biologically more realistic model evolves patterns of behavior consistent with patterns in the data from meta-analyses of human behavior in the UG. For the proposed model, we show that generous offers robustly evolve, as well as the difficult-to-explain offer-acceptance gaps, only in group-structured populations with interbirth intervals. We demonstrate that these results are robust and may help explain variation in data across societies. We discuss how interbirth intervals interact with group structure to modulate offer and rejection costs, favoring the evolution of generous offers, offer-acceptance gaps, and other patterns in the data on human behavior in the UG. We also discuss why weak selection and/or high mutation rate models cannot explain all the patterns in UG experimental data. We discuss biological realism and conclude that group structure and interbirth intervals may be essential for explaining prosocial behavior across societies.

16.
PLOS Computational Biology 2026-06-22

<i>HoloBio</i>: A holographic microscopy tool for quantitative biological analysis

Authors:

by Waira Mona, Maria J. Gil-Herrera, Emanuel Mazo, Daniel Córdoba, Sofia Obando-Vasquez, Maria J. Lopera, Rene Restrepo, Carlos Trujillo, Ana Doblas, Raul Castaneda Holographic imaging in microscopy enables label-free quantitative information of biological specimens and has found applications across a wide range of biomedical studies, from cell morphology to particle dynamics; yet its widespread adoption is often limited by the lack of accessible and standardized analysis software. We present HoloBio, an open-source, Python-based graphical user interface developed to address this issue. This software offers two primary operational modes: a Real-Time mode that enables live processing of holograms at video frame rates, and an Offline mode designed for post-processing previously recorded holograms. HoloBio is compatible with holograms recorded using both lens-based and lensless systems, supporting off-axis architectures in telecentric and non-telecentric configurations, as well as slightly off-axis and in-line optical setups. The software incorporates tools for cell tracking, phase profiling, thickness estimation, and morphological analysis, including cell counting and object area quantification. HoloBio is designed to be accessible for users without coding expertise, offering a reproducible, high-throughput environment tailored for researchers in biology, biophotonics, and biomedical imaging.

17.
arXiv (CS.AI) 2026-06-16

Towards Verifiable Agentic Data Science: Solving Irregular TSQA Via Tool-Grounded Reasoning

arXiv:2606.15107v1 Announce Type: new Abstract: Time series data in real-world deployments is overwhelmingly irregular. Observations are asynchronous, missing values are informative rather than random, and sampling frequencies vary across sensors and operational windows. However, existing Time Series Question Answering (TSQA) benchmarks mostly assume regularly sampled inputs, leaving a fundamental gap in understanding how large language models (LLMs) and AI agents perform under irregular conditions. To bridge this gap, we introduce IRTS-ToolBench, a benchmark of 1,700 questions spanning 10 task types across 13 domains. IRTS-ToolBench is designed to be used independently by any researcher working on LLM-based irregular time series analysis, providing standardized inputs and a reproducible evaluation protocol. Code can be found in https://github.com/SanhornC/IRTS-ToolBench.

18.
arXiv (CS.AI) 2026-06-17

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

arXiv:2604.18701v3 Announce Type: replace-cross Abstract: Local prediction-error-based curiosity rewards focus on the current transition without considering the world model's cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the improvement of this cumulative objective, and show that it admits a tractable per-step surrogate: the difference between the current prediction error and the asymptotic error baseline of the current state transition. We estimate this error baseline online with a learned critic co-trained alongside the world model; since the critic only has to learn how hard a transition is to predict, its estimate of the irreducible noise floor converges well before the world model saturates, redirecting exploration toward learnable transitions. The reward is higher for learnable transitions and collapses toward zero for stochastic ones, thereby separating epistemic (reducible) from aleatoric (irreducible) prediction error online. Prior prediction-error curiosity formulations, from Schmidhuber (1991) to learned-feature-space variants, emerge as special cases corresponding to specific approximations of this error baseline. Experiments on a stochastic grid world show that Curiosity-Critic outperforms prediction-error, visitation-count, and Random Network Distillation methods in training speed and final world model accuracy.

19.
arXiv (CS.CV) 2026-06-11

DrivingAgent: Design and Scheduling Agents for Autonomous Driving Systems

Many autonomous driving systems are increasingly incorporating foundation models to improve generalization and handle long-tail scenarios. However, this trend introduces two key challenges: (i) the manual and labor-intensive process of designing and integrating new models, and (ii) the lack of intelligent, dynamic scheduling mechanisms to meet strict real-time constraints. While Large Language Model (LLM)-based agents offer a promising avenue for automation, existing frameworks are ill-suited for autonomous driving. Specifically, they fail to distinguish between the fundamentally different requirements of system design and real-time scheduling, treat modules as opaque black boxes, and are not designed for continuous operation. To address these limitations, we propose DrivingAgent, a novel agent framework tailored to the dual challenges of autonomous driving system design and scheduling. In the design phase, DrivingAgent automates module development by interpreting system architecture, generating code, and validating modules via super-network training. In the scheduling phase, it employs a lightweight LLM trained with reinforcement learning to dynamically orchestrate system modules in real time, supported by a structured memory that integrates long-term storage with timestamped short-term context. Experimental results demonstrate that DrivingAgent achieves a superior speed–accuracy trade-off on both the nuScenes and Bench2Drive benchmarks.

20.
arXiv (quant-ph) 2026-06-16

Hardy-type self-testing and exposedness of tripartite GHZ correlations

arXiv:2512.16242v2 Announce Type: replace Abstract: Nonlocality can be witnessed either through Bell-inequality violations or through logical contradictions such as Hardy's paradox. In the bipartite two input two outcome scenario, these two routes have distinct geometric behavior: CHSH-maximal correlations are exposed points of the quantum set, whereas known Hardy-type self-testing correlations on the no-signaling boundary are non-exposed. Here we show that this bipartite intuition fails in the tripartite two input two outcome scenario. We study the tripartite instance of a multipartite Hardy-type paradox and prove that the correlation attaining the maximal Hardy success probability self-tests the Greenberger–Horne–Zeilinger state and the associated measurements. Although this correlation lies on the no-signaling boundary, we show that it is an extremal and exposed point of the quantum correlation set. Moreover, it coincides with the correlation attaining the maximal violation of the Mermin inequality. Thus, in the tripartite GHZ scenario, the logical-paradox and Bell-inequality routes to nonlocality select the same exposed quantum boundary point. We also establish a robust version of the self-test, showing that small deviations from the ideal Hardy constraints imply quantitative closeness to the target state and measurements. Our results reveal a qualitative geometric difference between bipartite and tripartite Hardy-type nonlocality and suggest a broader investigation of exposedness for multipartite Hardy correlations in the multiparty setting.

21.
PLOS Computational Biology 2026-06-22

GrassSV – hybrid method to detect structural variants in high throughput DNA-seq data

by Dominik Witczak, Krzysztof Sychla, Julia Wysocka, Artur Laskowski, Wojciech Frohmberg, Marta Glowacka, Alicja Dzik, Piotr Lukasiak, Jacek Blazewicz, Aleksandra Swiercz Genetic diversity is crucial for populations to adapt and survive in dynamic environments. This diversity arises from genetic mutations, which manifest in the genome as structural variants (SVs). Several types of SVs exist, but not all are equally easy to detect. Current SV detection tools tend to specialize in certain SV types or require the use of multiple tools to obtain a comprehensive variant profile, which increases computational cost and complexity. While some methods excel at identifying breakpoints, they often struggle with accurately classifying variant types, and their precision depends strongly on data quality and sequencing technology. At present, the majority of available genomic data originates from high-quality short reads, which remain the most affordable sequencing technology. In this manuscript, we introduce GrassSV, a novel and computationally efficient method that employs a hybrid pattern-matching approach to detect all major classes of structural variants using short-read sequencing data. GrassSV integrates depth-of-coverage analysis with contig-based pattern recognition to ensure both sensitivity and precision while minimizing false positives and runtime. Its robustness was demonstrated on the human Genome in a Bottle dataset, as well as on synthetic data derived from the yeast genome, where it achieved high accuracy across all SV types at a lower computational cost compared to existing methods. This makes GrassSV a practical alternative to multi-tool pipelines typically required for comprehensive SV detection. GrassSV is available at https://github.com/Domomod/GrassSV under GPL-3.0 license and the benchmark at: https://github.com/Domomod/GrassBenchmark.

22.
arXiv (CS.LG) 2026-06-16

Robust Neural Tucker Factorization with Bias Correction and Adaptive Initialization

arXiv:2606.16388v1 Announce Type: new Abstract: High-dimensional incomplete (HDI) tensors are widely used in traffic and climate applications, but sparse observations make accurate completion difficult. The intrinsic non-linear dynamics and non-stationary variations across distinct multi-modal fields severely hinder the efficacy of conventional linear reconstruction frameworks. Neural Tucker factorization provides an effective framework for modeling high-order interactions among tensor modes. By parameterizing underlying structural characteristics into continuous latent spaces, neural representations circumvent the rigid low-rank constraints of classical algebra. However, its performance can still be affected by implementation-level choices, especially parameter initialization and the bias configuration of the final output mapping. Suboptimal initializations frequently lead to variance explosion across the cubically expanded interaction spaces, driving the subsequent non-linear activation boundaries into severe gradient saturation zones, while the omission of a dedicated translation parameter forces interaction weights to implicitly absorb global statistical deviations. This paper proposes a simple yet effective neural Tucker factorization model with Kaiming initialization and bias correction (KaBiN) for HDI tensor completion. The proposed model utilizes Kaiming uniform initialization for the embedding and Tucker linear parameters, and adopts a simple bias correction in output mapping. By elegantly decoupling global mean shifts from local structural representations, the framework provides a highly stable and well-conditioned optimization landscape. Experiments on three real-world HDI tensor datasets show that KaBiN achieves better performance than the original NeuTucF, while introducing minimal computational overhead.

23.
arXiv (quant-ph) 2026-06-15

Path superposition activating perfect quantum teleportation ability for separable states

arXiv:2505.11398v2 Announce Type: replace Abstract: Quantum teleportation is a quintessential quantum communication protocol that enables the transmission of an arbitrary quantum state between two distant parties without physically transmitting the state with the help of shared entanglement and limited classical communication. We show that it is possible to relax the entanglement requirement in quantum teleportation if we have access to a certain strain of superposition of quantum processes. Two types of superposition of quantum processes are generally considered in the literature: superposition of paths identified with quantum maps and superposition of indefinite causal orders of the maps. We find that when superposition of paths is incorporated in the protocol, quantum teleportation with unit fidelity becomes possible with nonzero probability of 1/4 even when the two parties share certain classes of separable states, including pure product states. In contrast, the assistance of superposition of indefinite causal order of quantum maps in teleportation protocol does not enable any quantum advantage for shared pure product states. Furthermore, we show that separable Werner states can also yield quantum advantage in quantum teleportation assisted by the superposition of paths. Finally, we establish that the presence of quantum coherence in the control qubit is both necessary and sufficient to achieve quantum advantage in quantum teleportation assisted with superposition of paths. The results potentially uncover yet another role of quantum superposition, in general, in teleportation versus entanglement.

24.
arXiv (CS.AI) 2026-06-17

Probing, Fusion, and Trustworthiness: A Systematic Evaluation of Foundation Model Representations for Multimodal Cancer Analysis

arXiv:2606.17115v1 Announce Type: cross Abstract: Foundation models (FMs) have emerged as powerful representation extractors for medical data, yet their generalizability to datasets under distribution shift remains underexplored. This work systematically evaluates FM-based representations on a suite of computational pathology tasks across two real-world commercial cohorts, IH-BC and IH-NSCLC, drawn from the licensed in-house (IH) oncology dataset. The analysis focuses on two modalities, whole-slide images and transcriptomic profiles, drawn from the IH multimodal data. We first benchmark unimodal probing performance across five FMs on eight downstream classification tasks, and find that image and omics representations carry complementary predictive signals. Then we investigate whether multimodal fusion can yield additional gains over unimodal baselines by comparing three image-omics fusion strategies built on paired representations. The trustworthiness of selected unimodal and multimodal pipelines is further assessed through conformal prediction. Our results show that FM representations achieve competitive performance on out-of-distribution data and that multimodal fusion helps mainly when no single modality dominates the signal. Conformal prediction reveals that in the majority of cases where a point prediction fails, the true diagnosis remains recoverable within the prediction set, reinforcing the value of uncertainty-aware inference for clinical support.

25.
arXiv (CS.AI) 2026-06-16

Discovering Symmetry Groups with Flow Matching

arXiv:2512.20043v3 Announce Type: replace Abstract: Symmetry is fundamental to understanding physical systems and can improve performance and sample efficiency in machine learning. Both pursuits require knowledge of the underlying symmetries in data, yet discovering these symmetries automatically is challenging. We propose LieFlow, a novel framework that reframes symmetry discovery as a distribution learning problem on Lie groups. Instead of searching for the symmetry generators, our approach operates directly in group space, modeling a symmetry distribution over a large hypothesis group $G$. The support of the learned distribution reveals the underlying symmetry group $H \subseteq G$. Unlike previous works, LieFlow can discover both continuous and discrete symmetries within a unified framework, without assuming a fixed Lie algebra basis or a specific distribution over the group elements. Experiments on synthetic 2D and 3D point clouds, ModelNet10 and a real-world MI-Motion dataset show that LieFlow accurately discovers continuous and discrete subgroups, significantly outperforming a state-of-the-art baseline, LieGAN, in identifying discrete symmetries.