Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (quant-ph) 2026-06-24

Picosecond Schrödinger cat states for ultrafast optical quantum processing

arXiv:2606.24002v1 Announce Type: new Abstract: Non-Gaussian states are essential resources for universal, fault-tolerant optical quantum computing, but their generation rate remains limited by low heralding probabilities and operation in nanosecond temporal modes. Here, we demonstrate multi-photon generalized photon subtraction in picosecond optical wave packets, establishing the state-generation capability required for high-rate operation by addressing the temporal-mode bottleneck that has constrained the achievable rate. Two interfering ultrashort squeezed vacua are heralded by photon-number-resolving detection with a high-speed transition-edge sensor and characterized by pulsed homodyne detection matched to 10-ps temporal modes at a 5-MHz pump repetition rate. We reconstruct Wigner functions without loss correction that exhibit up to four distinct negative regions for four-photon heralding, together with an effective cat-state amplitude of $\alpha_{\mathrm{eff}} = 1.69$. This amplitude approaches the range of practical relevance for fault-tolerant cat-code architectures and for adaptive breeding toward logical-qubit generation, while the picosecond temporal mode establishes a platform compatible with high-rate, scalable time-multiplexed photonic architectures.

02.
arXiv (CS.CL) 2026-06-17

Securing Multi-Agent GIS Systems: Risk Evaluation and Prompt Hardening Optimization

Agentic systems are increasingly integrated with geographic information systems (GIS), where multi-agent coordination enables complex conversational and spatial analysis but introduces security risks. This work presents a security-oriented framework for risk identification, evaluation, and mitigation in a multi-agent GIS system while maintaining adaptability to broader agentic architectures. We test the agentic system of a commercial geospatial partner while developing a modular state-machine-based orchestration framework that abstracts agent behavior into reusable components. We evaluate robustness using a red-teaming framework with an adaptive attacker LLM and a deterministic judge that produces binary outcomes with supporting rationales across multi-turn attacks. We further improve resilience with a prompt optimization framework that treats prompts as structured signatures and injects adversarial demonstrations, enabling systematic security improvements without degrading task performance.

03.
arXiv (CS.CL) 2026-06-24

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

We introduce NatureBench, a cross-discipline benchmark of 90 tasks distilled from peer-reviewed Nature-family publications, designed to evaluate whether AI coding agents can move beyond reproduction toward discovery on real scientific problems. NatureBench is built on NatureGym, an automated pipeline that constructs a standardized, per-task containerized environment from a source paper, addressing the environment-fragmentation problem that has limited the credibility of prior agent-on-research benchmarks. Evaluating ten frontier agent configurations under a strict web-search-disabled protocol, we find that the strongest model surpasses SOTA on only 17.8% of tasks under the g>0.1 criterion. Analysis of method pathways reveals that agents succeed primarily through methodological translation, converting scientific tasks into familiar supervised prediction problems, rather than through genuine scientific invention. Failures are dominated by wrong method choice and insufficient compute budget, not by task misunderstanding. We release the benchmark, the NatureGym pipeline, and a public leaderboard with maintainer-side reproduction. Code: https://github.com/FrontisAI/NatureBench

04.
arXiv (CS.LG) 2026-06-16

Characterizing Admissible Objective Functions for Hierarchical Clustering

arXiv:2604.23628v2 Announce Type: replace-cross Abstract: Hierarchical clustering is a fundamental task in data analysis, but classical methods have long lacked a principled objective function. Dasgupta [STOC~2016] took an important step toward addressing this gap by proposing a well-motivated objective function for cluster trees. Cohen-Addad et al. [J. ACM 2019] subsequently introduced the notion of admissibility: an objective function is admissible if, whenever the input similarity matrix admits generating trees, its minimizers are precisely those generating trees.They also gave a necessary and sufficient condition for admissibility within a family of objective functions based on aggregate intercluster similarity. We refer to this family as sum-type objective functions. However, apart from Dasgupta's original objective function, no explicit admissible objective functions in this family were provided. In this paper, we study admissible objective functions for hierarchical clustering in two directions. For sum-type objective functions, we give a complete characterization when the scaling function is a symmetric polynomial of degree at most two, and we derive sufficient conditions for degree-three polynomials. We also show that the recursive sparsest cut algorithm achieves an O$(\phi)$-approximation ratio for the admissible objective functions covered by our characterization, where $\phi$ is the approximation factor of the sparsest cut subroutine. We then introduce max-type objective functions, where cluster interaction is measured by maximum, rather than aggregate, intercluster similarity. For this class, we characterize which objective functions are admissible for arbitrary symmetric scaling functions and give a complete characterization when the scaling function is a symmetric polynomial of degree at most two.

05.
arXiv (CS.CV) 2026-06-25

Latent Space Analysis for Interpretable Uncertainty in Melanoma Classification

Melanoma is a highly aggressive skin cancer, making early and accurate diagnosis critical. While deep learning excels in skin lesion classification, standard ``black-box" models struggle to explain diagnostic uncertainty, limiting clinical trust. This work introduces a hybrid framework combining a class-aware adversarial Variational Autoencoder and an XGBoost classifier, transcending simple binary classification by leveraging a generative latent space for interpretable decision support. Guided by adversarial training, the model learns the visual characteristics of skin lesions and projects them into a continuous latent space, ensuring that similar images are grouped closely together. Trained on this latent space, the XGBoost classifier achieves a robust AUC of 0.868, competing closely with state-of-the-art models. For borderline cases, the framework enables clinicians to leverage the latent topology through Content-Based Image Retrieval. This provides a dual benefit: it allows the clinician to visually compare an ambiguous lesion against biopsy-confirmed precedents and acts as an early warning sign since a borderline classification can indicate that a lesion shares features of both nevi and melanomas, potentially requiring close monitoring. Our approach translates algorithmic hesitation into transparent, evidence-based visual support, bridging the gap between predictive performance and clinical trust.

06.
arXiv (math.PR) 2026-06-11

Capital Asset Pricing Model with Size Factor and Normalizing by Volatility Index

arXiv:2411.19444v5 Announce Type: replace-cross Abstract: The Capital Asset Pricing Model (CAPM) relates a well-diversified stock portfolio to a benchmark portfolio. We insert size effect in CAPM, capturing the observation that small stocks have higher risk and return than large stocks, on average. For some size-based stock portfolios, dividing their returns by the Volatility Index makes them closer to independent and normal. In this article, we combine these ideas to create a new discrete-time model, which includes volatility, relative size, and CAPM. We fit this model using real-world data, prove the long-term stability, and connect this research to Stochastic Portfolio Theory. We fill important gaps in our previous article on CAPM with the size factor.

07.
arXiv (quant-ph) 2026-06-25

Exact Leg-Cut Influence Functional and Emergence of Gaussian Entanglement Theory in a Statistical-Dressing Ladder Model

arXiv:2606.25669v1 Announce Type: new Abstract: The emergence of Gaussian effective field theories in low-dimensional quantum systems is traditionally understood through top-down frameworks such as bosonization and Luttinger-liquid theory. However, these approaches typically focus on the long-wavelength degrees of freedom in ways that do not directly track how non-Gaussian lattice-scale correlations are progressively discarded under coarse-graining. In this work, we present an exact lattice formulation from which this phenomenon emerges analytically. We analyze a two-leg hard-core ladder under a leg bipartition, where non-local statistical strings cross the entanglement cut. We construct an exact lattice influence-functional representation showing that the reduced state factorizes strictly into a product-state amplitude and a full-counting-statistics functional. By introducing a commuting linked-cluster superoperator hierarchy that bypasses Baker-Campbell-Hausdorff ordering ambiguities, we prove that the first mixedness-generating sector is strictly density-density in character. Under a specific systematic coarse-graining procedure, we analytically derive the suppression of higher-order corrections, providing a controlled, closed-form framework showing how highly non-Gaussian lattice states evolve toward their quadratic continuum form under coarse-graining. We corroborate these analytical predictions through finite-size exact diagonalization and entanglement-spectrum diagnostics.

08.
arXiv (math.PR) 2026-06-19

Critical parameters of germ-monotone families of branching random walks

arXiv:2602.21062v2 Announce Type: replace Abstract: We introduce a broad class of families of branching random walks on a countable set $X$, which we refer to as germ-monotone branching random walks (GMBRWs). The processes in each family are parametrized by a positive parameter $\lambda>0$, which controls the overall reproductive speed, and they are monotonically increasing in $\lambda$ with respect to the germ order, a notion that extends classical stochastic domination. This framework encompasses a wide range of models, including classical continuous-time branching random walks, as well as discrete-time counterparts of certain non-Markovian processes such as ageing branching random walks. We define a general notion of critical parameter $\lambda(A)$ associated with each subset $A \subseteq X$, which serves as a threshold separating almost sure extinction in $A$ from positive probability of survival in $A$. This unifies and extends the classical global and local critical parameters $\lambda_w$ and $\lambda_s$, which can be recovered as special cases. We then investigate how modifications of the reproduction laws, either on a finite set or on a more general subset of $X$, affect these critical parameters. Our results extend earlier contributions in the literature.

09.
arXiv (CS.AI) 2026-06-24

FLUX3D: High-Fidelity 3D Gaussian Generation with Diffusion-Aligned Sparse Representation

arXiv:2606.24874v1 Announce Type: cross Abstract: Sparse voxel representation has emerged as a scalable foundation for image-to-3D Gaussian Splatting (3DGS) generation, yet current methods struggle to preserve high-frequency visual details of input images due to two structural bottlenecks. First, they adopt discriminative 2D features optimized for semantic abstraction to construct sparse voxel latents, which suppress reconstructive cues and induce a representation bottleneck. Second, in the generation stage, standard diffusion transformers lack effective mechanisms to align dense 2D image tokens with sparse 3D voxel latents, resulting in a cross-modal correspondence bottleneck. To address these issues, we propose FLUX3D, a scalable image-to-3DGS framework that boosts both representation learning and cross-modal alignment during generation. We first revisit 2D feature selection for sparse-voxel-based 3D representation learning, propose Diffusion-Aligned Structured Latents (DA-SLAT) and couple it with a decoder-only architecture to improve 3DGS reconstruction fidelity. We also design a sparse-structure-aware diffusion framework, which integrates the Sparse-structure Multimodal Diffusion Transformer (SMDiT) and Modal-Aware Rotary Positional Embedding (MARoPE) to achieve geometry-agnostic 2D-3D alignment. Extensive benchmark experiments demonstrate that FLUX3D yields substantial improvements in appearance fidelity and significantly outperforms all state-of-the-art (SOTA) methods in generating high-quality 3DGS assets.

10.
arXiv (CS.LG) 2026-06-12

Limits of spectral learning under noise

arXiv:2606.13067v1 Announce Type: new Abstract: Learning functional relationships from noisy data is a central problem in scientific inference. Spectral methods approximate unknown functions by expanding them in a basis and estimating the corresponding coefficients from data, but the stability of these coefficients under noise remains poorly understood. Here we study supervised regression with additive label noise using sparse spectral representations across multiple bases and dimensions. We show that noise induces a predictable drift in the learned coefficient vector whose magnitude depends on the effective number of active spectral modes. After whitening the empirical feature geometry, we derive a closed-form expression for the overlap between noisy and noiseless coefficient vectors, revealing a universal degradation curve governed by a single intrinsic noise scale. Numerical experiments across Fourier, Legendre, Bessel, and Haar bases confirm the theoretical prediction. The results demonstrate that spectral learning exhibits a fundamental noise threshold beyond which coefficient estimates become unstable, placing intrinsic limits on recovering functional structure from noisy data.

11.
arXiv (CS.AI) 2026-06-12

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility

arXiv:2606.13608v1 Announce Type: new Abstract: Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production mismatch, and limit fair comparison across diverse agent designs. The root problem is the lack of an open, agent-agnostic assessment interface. We advocate Agentified Agent Assessment (AAA), where evaluation is performed by judge agents and all participants interact through standardized protocols: A2A for task management and MCP for tool access. Conventional benchmarking defines two separate interfaces, one for the benchmark and one for the agent, while AAA only needs one; this yields a generic, unified framework that separates assessment logic from agent implementation and enables reproducible, interoperable, and multi-agent evaluation. We further introduce AgentBeats as a concrete realization of AAA: we identify five practical operation modes that make standardized assessment compatible with real-world constraints on openness, privacy, and reproducibility. To evaluate our design at scale, we conduct two studies: a five-month open competition that drew 298 judge agents across 12 categories together with 467 subject agents from independent participants, showing that AAA applies across a heterogeneous range of benchmarks; and a case study on coding agents that confirms agentified evaluation preserves fidelity with the public record while surfacing previously missing head-to-head results, yielding research insights about agent design. Combining a community-scale field study and a controlled coding case study, we verify that AAA delivers coverage, practicality, and fidelity across heterogeneous scenarios at scale. Together, AAA and AgentBeats offer a clear path toward open, standardized, and reproducible agent assessment.

12.
arXiv (CS.AI) 2026-06-19

SoftSkill: Behavioral Compression for Contextual Adaptation

arXiv:2606.20333v1 Announce Type: new Abstract: Agent skills are commonly deployed as natural-language Markdown files that encode answer policies, evidence-use habits, and task procedures. These files are readable and portable, but they are consumed indirectly: for each task instance, a frozen language model must translate a long textual artifact into generation-time behavior. This paper asks whether a natural-language skill can instead initialize a compact continuous context object, refined by a trainable soft delta while the base model remains frozen. We propose SoftSkill, a frozen-backbone method that tunes such soft skills with next-token prediction and deploys them as latent behavioral priors at inference time. In our main single-round setting, a length-32 SoftSkill prefix on Qwen3.5-4B improves over no-skill prompting by 8.3 points on SearchQA, 42.1 points on LiveMath, and 1.3 points on DocVQA. Relative to SkillOpt, SoftSkill improves accuracy by 5.2 points on SearchQA and 12.5 points on LiveMath, while replacing hundreds to thousands of Markdown skill tokens with a few virtual tokens. We further study agentic execution as a harder boundary case, where sparse trajectory imitation provides useful signal but does not yet robustly compress long-horizon procedural behavior. More broadly, the results suggest that some task skills are better treated not as additional Markdown to be reinterpreted at inference time, but as compact latent controls over how a frozen model enters the task.

13.
arXiv (CS.CV) 2026-06-16

Prompt Disentanglement via Language Guidance and Representation Alignment for Domain Generalization

Domain Generalization (DG) seeks to develop a versatile model capable of performing effectively on unseen target domains. Notably, recent advances in pre-trained Visual Foundation Models (VFMs), such as CLIP, have demonstrated considerable potential in enhancing the generalization capabilities of deep learning models. Despite the increasing attention toward VFM-based domain prompt tuning within DG, the effective design of prompts capable of disentangling invariant features across diverse domains remains a critical challenge. In this paper, we propose addressing this challenge by leveraging the controllable and flexible language prompt of the VFM. Noting that the text modality of VFMs is naturally easier to disentangle, we introduce a novel framework for text feature-guided visual prompt tuning. This framework first automatically disentangles the text prompt using a large language model (LLM) and then learns domain-invariant visual representation guided by the disentangled text feature. However, relying solely on language to guide visual feature disentanglement has limitations, as visual features can sometimes be too complex or nuanced to be fully captured by descriptive text. To address this, we introduce Worst Explicit Representation Alignment (WERA), which extends text-guided visual prompts by incorporating an additional set of abstract prompts. These prompts enhance source domain diversity through stylized image augmentations, while alignment constraints ensure that visual representations remain consistent across both the original and augmented distributions. Experiments conducted on major DG datasets, including PACS, VLCS, OfficeHome, DomainNet, and TerraInc, demonstrate that our proposed method outperforms state-of-the-art DG methods.

14.
arXiv (CS.AI) 2026-06-25

PVF:Understanding AI Vulnerability Against SDCs

arXiv:2405.01741v4 Announce Type: replace-cross Abstract: Reliability of AI systems is a fundamental concern for the successful deployment and widespread adoption of AI technologies. Unfortunately, the escalating complexity and heterogeneity of AI hardware systems make them increasingly susceptible to hardware faults, e.g., silent data corruptions (SDC), that can potentially corrupt model parameters. When this occurs during AI inference/servicing, it can potentially lead to incorrect or degraded model output for users, ultimately affecting the quality and reliability of AI services. In light of the escalating threat, it is crucial to address key questions: How vulnerable are AI models to parameter corruptions, and how do different components (such as modules, layers) of the models exhibit varying vulnerabilities to parameter corruptions? To systematically address this question, we propose a novel quantitative metric, Parameter Vulnerability Factor (PVF), inspired by architectural vulnerability factor (AVF) in computer architecture community, aiming to standardize the quantification of AI model vulnerability against parameter corruptions. We define a model parameter's PVF as the probability that a corruption in that particular model parameter will result in an incorrect output. In this paper, we present several use cases on applying PVF to three types of tasks/models during inference – recommendation (DLRM), vision classification (CNN), and text classification (BERT), while presenting an in-depth vulnerability analysis on DLRM. PVF has been a critical metric used for making key error management design decisions in productionizing Meta's in-house AI chip - MTIA.

15.
medRxiv (Medicine) 2026-06-12

Genome-wide association and multi-omics functional screens reveal the genetic architecture of foveal development

Foveal hypoplasia causes visual impairment across congenital eye disorders, yet the genetic programmes governing foveal development remain poorly characterised and no tractable model exists for foveal disease. In the first genome-wide association study of foveal hypoplasia, we identified 42 sentinel variants mapping to 54 effector genes supported by >= 2 criteria from a variant-to-gene framework incorporating developmental multi-omics. Disruption of six effector genes using mutant lines and CRISPR knockouts in the zebrafish high acuity zone recapitulates structural, functional, and ultrastructural hallmarks of foveal hypoplasia, establishing the first vertebrate disease model. Integration with human foetal single-cell and spatial transcriptomics reveals two temporal waves of effector gene expression and identifies Muller glia as critical mediators of foveal patterning. Phenome-wide analyses reveal foveal variants are pleiotropic with refractive, lenticular, and metabolic traits, connecting foveal development to anterior segment and systemic disease biology. These findings should inform mechanistic studies of macular disease.

16.
arXiv (CS.CV) 2026-06-15

Catching magnetic resonance imaging outliers in artificial intelligence-supported radiotherapy workflows: unsupervised detection and localization of image anomalies using deep learning

Artificial intelligence is increasingly integrated into radiotherapy workflows, yet such pipelines remain vulnerable to out-of-distribution image data that may introduce unexpected behavior in clinical tasks. Deep learning-based anomaly detection for pelvic magnetic resonance imaging (MRI) remains largely unexplored, and transparent evaluation of its feasibility for full automation is limited. We developed and evaluated a fully automated, unsupervised anomaly-detection framework for pelvic and brain MRI. A two-stage framework was trained on reference images from public datasets: LUND-PROBE for pelvic MRI, and IXI, fastMRI, and fastMRI+ for brain MRI. In the first stage, MRI slices were compressed into discrete tokens; in the second, the distribution of normal tokens was modeled. Anomaly evidence was estimated by combining perceptual image differences with token-surprisal scores based on negative log-likelihood. Automated detection was evaluated on pelvic MRI with synthetic global and real clinical anomalies, and on brain MRI with clinically annotated fastMRI+ abnormalities. Sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and false-positive behavior in held-out normal cases were assessed. The framework achieved robust detection across hidden evaluation cohorts, with AUCs of 0.97 (95% CI, 0.95-0.98) and 0.81 (95% CI, 0.74-0.87) for pelvic and brain MRI, respectively. Heatmap analysis showed strong spatial agreement between detected anomalies and ground-truth locations, supporting localization accuracy and interpretability. These results support the potential of unsupervised anomaly detection as an automated MRI quality-control layer for radiotherapy workflows, with transparent visualization of image regions likely to compromise downstream AI-based tasks.

17.
arXiv (quant-ph) 2026-06-16

Physically Motivated Ansatz for Open Fermionic Systems on Quantum Computer

arXiv:2606.16823v1 Announce Type: new Abstract: Determining non-equilibrium steady states (NESS) of open fermionic systems is a fundamental problem akin to finding ground states of closed systems. To address this, variational quantum algorithms can be used to solve the Lindblad master equation, much like the Schrödinger equation, yet ansatz design for NESS remains challenging. Existing approaches rely mostly on hardware-efficient ansätze (HEA), which suffer from the barren plateau problem. Here, we introduce a physically motivated ansatz named NE-UCC. Numerical simulations demonstrate that NE-UCC reliably converges to the steady state even in strongly correlated regimes far from equilibrium, reducing the infidelity by up to ten orders of magnitude compared to HEA. Furthermore, NE-UCC facilitates the exploration of excited eigenmodes with specific symmetries.

18.
arXiv (CS.LG) 2026-06-17

Learning Survival Models with Right-Censored Reporting Delays

arXiv:2510.04421v3 Announce Type: replace-cross Abstract: Survival analysis provides statistical methods to model the time until an event occurs. Reporting delays arise when event times are not observed at their occurrence but are only revealed upon reporting. This issue is particularly critical for timely risk evaluation when the observation window is short due to administrative censoring. In this study, we incorporate right-censored reporting delays by jointly modeling parametric hazards for the event and reporting processes. We then construct a consistent estimator for the model parameters and develop a Monte Carlo expectation-maximization algorithm to compute it. To address the challenges posed by administrative censoring, we leverage these findings and propose a transfer-learning procedure. Experimental results demonstrate that our method improves the accuracy of timely risk evaluation under administrative censoring.

19.
arXiv (CS.CV) 2026-06-18

DREAM: Extending Vision-Language Models with Dual-Objective Encoding for Cross-Modal Retrieval

In today's media-driven world, the exponential growth of video content across domains such as surveillance, education, and entertainment has made retrieving semantically relevant videos via natural language queries increasingly critical. Early video retrieval systems relied on handcrafted features or shallow cross-modal mappings, limiting their ability to capture complex semantics and temporal dynamics. While large-scale vision-language models have improved cross-modal alignment, challenges remain in modeling fine-grained temporal dependencies and nuanced linguistic structures. In this paper, we introduce DREAM: Dual-path Representation Enhancement and Alignment Model, a novel multimodal framework that addresses these limitations through enhanced visual and textual encoding. DREAM incorporates a hybrid language modeling strategy that combines masked and permuted language modeling objectives to capture both local and global linguistic semantics. On the visual side, we design a hierarchical vision encoder with cascaded group attention, which integrates spatial and temporal information through multi-stage token interaction and coarse-to-fine attention refinement. We validate DREAM through comprehensive evaluations on the widely-used MSRVTT, MSVD and LSMDC benchmark datasets, where it achieves new state-of-the-art R1 scores of 49.4%, 49.7% and 27.3%, respectively. Qualitative analyses further show the model's ability to maintain coherent attention across frames and align complex queries with dynamic video content. These findings underscore the effectiveness of hierarchical attention and dual-objective textual modeling in enabling robust, context-aware video retrieval, and pave the way for future research in advancing cross-modal representation learning.

20.
arXiv (CS.LG) 2026-06-25

Uncovering Insights of Compound Flooding with Data-Driven AI

arXiv:2506.04281v3 Announce Type: replace Abstract: Compound flooding, driven by nonlinear interactions between multiple hydrometeorological factors, poses a significant challenge to hazard prevention. Existing forecasting approaches, whether physics-based or data-driven, often emphasize temporal patterns while underexploring how multiple interacting factors jointly shape flood dynamics. To address this problem, we conduct a large-scale data-driven analysis of compound flooding in South Florida, a typical area for compound flooding, by integrating tidal conditions, rainfall, groundwater stage, and human water management activities. Our analysis reveals three key findings: (i) models that capture temporal dynamics alone fail to represent multi-factor interactions during compound events; (ii) subsurface saturation, as reflected by groundwater levels, emerges as a dominant predictor of flood severity, often outweighing immediate rainfall intensity in this porous coastal region; and (iii) the spatial state of surrounding monitoring stations within a finite effective radius provides critical causal context for flooding, while extending temporal history yields diminishing returns during extreme events. These findings suggest that compound flooding is governed more by spatially coupled system states than by long-term temporal dependencies, challenging rain-centric and sequence-dominated forecasting paradigms. By framing data-driven models as tools for scientific inquiry rather than prediction alone, this study offers new insights into the mechanisms of compound flooding and informs the design of more physically grounded early-warning systems for coastal environments. Our dataset and code are publicly available at https://github.com/AslanDing/SFBench.

21.
arXiv (CS.CV) 2026-06-25

VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction

Feed-forward 3D Gaussian Splatting (3DGS) has emerged as a highly effective solution for novel view synthesis. Existing methods predominantly rely on a pixel-aligned Gaussian prediction paradigm, where each 2D pixel is mapped to a 3D Gaussian. We rethink this widely adopted formulation and identify several inherent limitations: it renders the reconstructed 3D models heavily dependent on the number of input views, leads to view-biased density distributions, and introduces alignment errors, particularly when source views contain occlusions or low texture. To address these challenges, we introduce VolSplat, a new multi-view feed-forward paradigm that replaces pixel alignment with voxel-aligned Gaussians. By directly predicting Gaussians from a predicted 3D voxel grid, it overcomes pixel alignment's reliance on error-prone 2D feature matching, ensuring robust multi-view consistency. Furthermore, it enables adaptive control over density based on 3D scene complexity, yielding more faithful Gaussians, improved geometric consistency, and enhanced novel-view rendering quality. Experiments on widely used benchmarks demonstrate that VolSplat achieves state-of-the-art performance, while producing more plausible and view-consistent results. The video results, code and trained models are available on our project page: https://lhmd.top/volsplat.

22.
arXiv (CS.CV) 2026-06-25

Same Evidence, Different Answer: Auditing Order Sensitivity in Multimodal Large Language Models

Standard benchmarks for multimodal large language models (MLLMs) score each item on one canonical ordering and miss whether order-irrelevant shuffling changes the answer, a baseline reliability property called for by emerging AI evaluation guidelines. We introduce Facet-Probe, a five-facet audit (option, evidence-chunk, document-rank, image-set, and mixed-modality ordering) of 18 frontier and open-weight MLLMs. A Bayesian item-response model separates ordering noise from per-facet bias, and a same-ordering control estimates the decoder-stochastic floor for observed flips. We find that none of the 18 MLLMs we audit are order-invariant: screened per-facet panel-mean flip rates span 24-50%. A Gemini same-ordering control at temperature 0 estimates a substantial ordering excess over a same-input decoder-noise floor in verified cells. Capability predicts but does not eliminate flips; the best model still flips on 13.4% of trials. In our Gemini mitigation tests, training-free prompt changes are modality-conditional and do not transfer from text to visual reasoning. These results suggest that prompt-level mitigation alone is unlikely to provide general order robustness, motivating future work on training-time and architectural approaches. We propose cross-ordering flip rate as a standard reporting axis for MLLMs.

23.
PLOS Computational Biology 2026-06-22

Towards modeling phage therapy

by Rob J. de Boer, Robert Schooley, Alan S. Perelson Patients infected with life-threatening multi-drug resistant (MDR) bacteria have been treated with cocktails of bacteriophages. This is a complicated form of personalized medicine as the phages given to a patient have to be selected beforehand on the basis of their lytic capacity of the infecting bacteria. Because bacteria rapidly become resistant, the evolution of resistance to a diverse cocktail of phages is a complicated dynamical process, during which competing bacterial strains replace one another by accumulating several resistance mechanisms, each of which may involve a fitness cost. As a consequence, it is typically not known why a particular phage therapy succeeded or failed, and how one can optimize the composition of the cocktails to maximize the rate of success. To improve upon this, we extend an existing in vivo-calibrated mouse model into a novel mathematical model for the human situation, and include multiple phages infecting multiple bacterial strains, differing in their resistance to each of the phages. We adjust several parameter estimates of the bacterial model to the human situation, and use the model to describe a successful case of phage therapy involving several cocktails, each containing several phages. In the model, treatment success crucially depended on pretreatment resistance levels, and on the diversity and the timing of the cocktails. Once an appropriate cocktail is found, it is less important to further optimize the infection rates of the phages. Resistant bacterial strains expand rapidly when sensitive strains decline, and the higher the infectivity of the phages, the faster resistant strains expand. Because resistance evolves rapidly, it is best to provide a diverse set of phages right from the start of therapy, i.e., to hit hard and early, and create a high genetic barrier to bacterial resistance.

24.
arXiv (CS.AI) 2026-06-25

AI-Assisted Computational Reproducibility on the FABRIC Testbed

arXiv:2606.25879v1 Announce Type: cross Abstract: Computational reproducibility remains difficult despite being central to scientific research. In this paper, we show how the international FABRIC testbed, combined with large language model (LLM) coding assistants through LoomAI, can simplify reproducing published experiments across multiple domains. We reproduced three case studies on FABRIC, covering BBR-family congestion-control evaluations, LAMMPS molecular dynamics scaling benchmarks on a CPU-only MPI cluster, and stress protein homeostasis genomics pipelines. Rather than focusing only on matching numerical outputs, we evaluate whether the reproduced experiments support the same scientific conclusions as the original studies. The AI assistant was effective in setting up the environment, adapting code, and debugging, but struggled with the analysis stages that lacked clearly defined workflows, which required human guidance to establish execution order and data dependencies. Across the case studies, the AI-assisted workflow reduced reproduction effort by roughly 4–6 times. We conclude with practical recommendations for improving AI-assisted reproducibility on research testbeds.

25.
arXiv (math.PR) 2026-06-17

A note on the $\mathcal{W}_2$-convergence rate of the empirical measure of an ergodic $\mathbb{R}^d$-valued diffusion

arXiv:2502.07704v2 Announce Type: replace Abstract: In this note, we consider a Stochastic Differential Equation under a strong confluence and Lipschitz continuity assumption of the coefficients. For the unique stationary solution, we study the rate of convergence of its empirical measure toward the invariant probability measure. We provide rate for the Wasserstein distance in the mean quadratic and almost sure sense.