论文广场 - AcademicHub

01.

arXiv (CS.LG) 2026-06-19 DOI: arXiv:2606.20415

Pseudo-Feature Padding: A Lightweight Defense Against False Data Injection in Power Grids

作者:

Farhin Farhad Riya ↗Shahinul Hoque ↗Yingyuan Yang ↗Jinyuan Sun ↗Kevin Tomsovic ↗

arXiv:2606.20415v1 Announce Type: new Abstract: Deep Neural Networks DNNs have achieved remarkable accuracy in various tasks including their application in CyberPhysical Systems CPS for detecting False Data Injection Attacks FDIA during critical operations However the unique infrastructure of CPS makes DNNs vulnerable to exploitation by attackers aiming to evade detection Additionally the distinct nature of CPS presents challenges for conventional defense mechanisms against FDIA This paper proposes an innovative defense framework that strengthens DNNs against such attacks by introducing an additional input layer that performs padding in the input samples using pseudofeature values derived from the inputs statistical distribution This padding increases the input dimensionality in a randomized and dataaware manner making adversarial attacks computationally infeasible due to the nontransferable nature of crafted perturbations and the unpredictability of the padded structure Our method is lightweight modelagnostic and requires no modifications to the core architecture making it highly deployable in realworld CPS settings We evaluated our framework on critical power grid applications such as state estimation using the IEEE 14bus 30bus 118bus and 300bus systems Experiments under adversarial settings demonstrate that our padding strategy significantly improves model robustness with negligible impact on performance and effectively mitigates attacks that would otherwise bypass conventional defenses

阅读与讨论 → 访问原文 →

02.

arXiv (CS.LG) 2026-06-19 DOI: arXiv:2606.19799

The Hidden Environmental Cost of Poor Coding Practices in TensorFlow and Keras Applications: A Study on Resource Leaks and Carbon Emissions

作者:

Bashar Abdallah ↗Gustavo Santos ↗Rola Al Bataineh ↗Alain Abran ↗Mohammad Hamdaqa ↗

arXiv:2606.19799v1 Announce Type: cross Abstract: Efficiency and sustainability are critical considerations in the development and deployment of machine learning (ML) applications. Among the factors influencing sustainability, resource leaks in ML code can introduce hidden inefficiencies that elevate energy consumption and CO2 emissions. Despite this, empirical evidence quantifying their environmental impact remains limited. This emerging results paper presents an initial empirical investigation of two common resource-leak smells, namely Improper Model Reuse (IMR) and Unreleased Tensor References (UTR), and their impact on energy consumption and CO2 emissions in TensorFlow and Keras workloads. Controlled experiments were conducted for each smell by executing identical training tasks while comparing against a smell-free baseline. Our preliminary results show that both smells consistently increase estimated electricity usage and carbon emissions. IMR and UTR increased electricity consumption by approximately 32% and 46%, respectively, with proportional increases in CO2 emissions. Paired statistical tests indicate that these differences are systematic and statistically significant, providing initial empirical evidence that resource-leak smells may degrade ML energy efficiency and environmental sustainability. These findings suggest that resource-leak smells pose measurable risks to both software quality and sustainability, emphasizing the importance of integrating resource-lifecycle management and energy-efficiency considerations into ML development.

阅读与讨论 → 访问原文 →

03.

arXiv (CS.CV) 2026-06-16 DOI: arXiv:2603.03447

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

作者:

Weicai Yan ↗Yuhong Dai ↗Qi Ran ↗Haodong Li ↗Wang Lin ↗Tao Jin ↗Xing Xie ↗Hao Liao ↗Jianxun Lian ↗

Proactive and real-time interactive experiences are essential for human-like AI companions, yet face three key challenges: (1) achieving low-latency inference under continuous streaming inputs, (2) autonomously deciding when to respond, and (3) controlling both quality and quantity of generated content to meet real-time constraints. In this work, we instantiate AI companions through two gaming scenarios, commentator and guide, selected for their suitability for automatic evaluation. We introduce the Live Gaming Benchmark, a large-scale dataset with three representative scenarios: solo commentary, co-commentary, and user guidance, and present Proact-VL, a general framework that shapes multimodal language models into proactive, real-time interactive agents capable of human-like environment perception and interaction. Extensive experiments show Proact-VL achieves superior response latency and quality while maintaining strong video understanding capabilities, demonstrating its practicality for real-time interactive applications.

阅读与讨论 → 访问原文 →

04.

arXiv (quant-ph) 2026-06-24 DOI: arXiv:2603.27676

Resource theory of interactive quantum instruments

作者:

Chung-Yun Hsieh ↗Armin Tavakoli ↗Huan-Yu Ku ↗Paul Skrzypczyk ↗

arXiv:2603.27676v2 Announce Type: replace Abstract: Quantum instruments describe both the classical outcome and the updated quantum state in a measurement process. To do this in a non-trivial way, instruments must have the capability to interact coherently with the state that they measure. Here, we develop a resource theory for instruments. We consider a relevant quantifier of the separation between interactive and non-interactive instruments and show that it admits three distinct operational interpretations in terms of quantum information tasks. These concern (i) the preservation of maximally entangled states after a local measurement, (ii) the average ability to preserve random states after measurement, and (iii) the ability to recover the classical information generated from measuring half of a maximally entangled state. We also introduce a natural set of allowed operations and show that the third task fully characterises the resource content of instruments. Our general framework reproduces as special cases established resource theories for channels and measurements.

阅读与讨论 → 访问原文 →

05.

arXiv (CS.LG) 2026-06-11 DOI: arXiv:2603.15158

Point-Identification of a Robust Predictor Under Latent Shift with Imperfect Proxies

作者:

Zahra Rahiminasab ↗Reza Soumi ↗Arto Klami ↗Samuel Kaski ↗

arXiv:2603.15158v2 Announce Type: replace Abstract: Addressing the domain adaptation problem becomes more challenging when distribution shifts across domains stem from latent confounders that affect both covariates and outcomes. Existing proxy-based approaches that address latent shift rely on a strong completeness assumption to uniquely determine (point-identify) a robust predictor. Completeness requires that proxies have sufficient information about variations in latent confounders. For imperfect proxies the mapping from confounders to the space of proxy distributions is non-injective, and multiple latent confounder values can generate the same proxy distribution. This breaks the completeness assumption and observed data are consistent with multiple potential predictors (set-identified). To address this, we introduce latent equivalent classes (LECs). LECs are defined as groups of latent confounders that induce the same conditional proxy distribution. We show that point-identification for the robust predictor remains achievable as long as multiple domains differ sufficiently in how they mix proxy-induced LECs to form the robust predictor. This domain diversity condition is formalized as a cross-domain rank condition on the mixture weights, which is substantially weaker assumption than completeness. We introduce the Proximal Quasi-Bayesian Active learning (PQAL) framework, which actively queries a small, targeted set of diverse domains that satisfy this rank condition. PQAL can recover the point-identified predictor, demonstrates robustness to varying degrees of shift and outperforms previous methods on synthetic data and semi-synthetic dSprites, IHDP, ACS Folktables datasets.

阅读与讨论 → 访问原文 →

06.

arXiv (CS.CV) 2026-06-24 DOI: arXiv:2606.24156

Accelerating Multimodal Large Language Models with Prior-Corrected Token Reduction

作者:

Zengjie Chen ↗Yuxiang Cai ↗Jingcai Guo ↗Taotao Cai ↗Jianwei Yin ↗Zhi Chen ↗

Visual token reduction has emerged as an effective strategy for accelerating Multimodal Large Language Models (MLLMs). Many existing methods prune tokens by ranking text-visual attention scores. However, we show that attention is often dominated by a model-induced prior: even without textual instruction, MLLMs tend to focus on certain task-agnostic regions. Consequently, the attention scores of instruction-conditioned tokens are suppressed, increasing the risk that these tokens are discarded during pruning. To address this issue, we propose Prior-Corrected Token Reduction (PriorTR), a training-free token reduction method that explicitly separates task-conditioned attention from the model-induced prior. PriorTR estimates the attention map of the prior, and contrasts it with the task-conditioned attention distribution to measure the additional usable information contributed by each visual token. Importantly, PriorTR computes both the model-induced prior and the task-conditioned posterior within a single forward pass by introducing a null token that serves as an instruction-agnostic probe in the attention block. This design avoids duplicated propagation. Extensive experiments across multiple multimodal benchmarks and MLLMs demonstrate that PriorTR consistently improves the trade-off between accuracy and efficiency over strong training-free baselines, particularly under aggressive token budgets.

阅读与讨论 → 访问原文 →

07.

arXiv (CS.AI) 2026-06-15 DOI: arXiv:2606.13704

Position: AI Must Become Planet-Centered, Not Just Human-Centered

作者:

Maria Perez-Ortiz ↗

arXiv:2606.13704v1 Announce Type: cross Abstract: This position paper argues that contemporary AI paradigms are insufficient for supporting complex global goals and introduces Planet-Centered AI (PCAI) as a design philosophy and research agenda that reorients AI toward planetary-scale socio-ecological systems and their long-term trajectories. A planet-centered approach is grounded in systems thinking, treating Earth as an interconnected whole of which humans are part. We diagnose recurring limitations across AI frameworks, many of which remain human-centered, and show why these become especially consequential under current planetary conditions characterized by systemic risk, non-stationarity, and deep uncertainty. We then articulate how PCAI reshapes the AI lifecycle, from problem formulation and model design to evaluation and deployment, by emphasizing alignment with global agendas, developing system-aware AI foundations, trajectory-oriented evaluation, and monitorability. Finally, we advance a falsifiable claim: AI systems optimized without explicit consideration of systemic consequences are more likely to exacerbate systemic instability than to mitigate it.

阅读与讨论 → 访问原文 →

08.

arXiv (CS.AI) 2026-06-11 DOI: arXiv:2606.11634

Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning

作者:

Kai Liu ↗Peijie Dong ↗Xinchen Xie ↗Jianfei Gao ↗Qipeng Guo ↗Xiaowen Chu ↗Shaoting Zhang ↗Kai Chen ↗

arXiv:2606.11634v1 Announce Type: new Abstract: The rapid progress of reasoning and agentic large language models (LLMs) has increased the demand for long-context inference, but self-attention (SA) scales quadratically with context length. To address this, we study SWARR (Sliding-Window Attention with Reinforced Adaptation for Math Reasoning), a practical recipe for adapting SWA models to mathematical reasoning. SWARR has two stages: (1) efficient conversion from a pretrained SA model to SWA with supervised fine-tuning (SFT), which avoids pretraining a new base model, and (2) policy adaptation with reinforcement learning (RL). We find that SWA still underperforms SA after SFT, and we hypothesize that this gap is caused in part by a data-architecture mismatch: most SFT data are prepared for SA models and may contain long-range dependencies that are difficult for SWA to model. Because on-policy RL optimizes self-generated trajectories under the SWA constraint, it can adapt trajectories to better match SWA. Experiments on mathematical reasoning benchmarks show that this recipe substantially narrows the gap between SWA and SA, recovering much of the accuracy lost during SWA conversion while preserving the efficiency benefits of linear-complexity attention. Our central contribution is the empirical finding that RL changes the conclusion one would draw from conversion and SFT alone about SWA's viability for math reasoning.

阅读与讨论 → 访问原文 →

09.

arXiv (quant-ph) 2026-06-15 DOI: arXiv:2606.13779

Computational regimes in matrix-product-state-based quantum trajectory simulations

作者:

Aaron Sander ↗Simon Cichy ↗Martin Eigel ↗Jens Eisert ↗Maximilian Fr\"ohlich ↗Tom Peham ↗Robert Wille ↗

arXiv:2606.13779v1 Announce Type: new Abstract: Efficient simulation of open quantum systems is central to modeling noisy quantum hardware and many-body dynamics. In trajectory-based tensor network methods, cost is often associated with trajectory-level quantities such as entanglement growth or bond dimension. However, the total cost of a fixed-accuracy simulation also depends on statistical sampling, and the interplay between per-trajectory complexity and sampling effort remains poorly understood. Here we introduce a cost-resolved framework for matrix product state (MPS)-based quantum trajectory simulations that decomposes total cost into memory per trajectory, runtime per trajectory, and sampling effort. We show that physically equivalent stochastic unravelings of the same Lindblad dynamics do not necessarily reduce total cost, but instead redistribute cost between trajectory complexity and statistical convergence. This trade-off is quantified by two dimensionless inflation factors: a bond dimension inflation $\alpha$ and a sampling inflation $\kappa$, which together determine the preferred unraveling under hardware-dependent memory and parallelism constraints. We provide a practical protocol for extracting $(\alpha,\kappa)$ from modest pilot simulations and demonstrate it using benchmarks across multiple noise channels. The resulting decision maps show that the computationally favorable unraveling can change with noise strength, time-step resolution, system size, and available parallelism. These results establish unraveling choice as a hardware-aware simulation design problem rather than an intrinsic optimization of trajectory entanglement alone.

阅读与讨论 → 访问原文 →

10.

arXiv (CS.CL) 2026-06-11 DOI: arXiv:2602.10908

SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora

作者:

Masataka Yoneda ↗Yusuke Matsushita ↗Go Kamoda ↗Kohei Suenaga ↗Takuya Akiba ↗Masaki Waga ↗Sho Yokoi ↗

We present SoftMatcha 2, an ultra-fast and flexible search algorithm that enables search over trillion-scale natural language corpora in under 0.3 seconds while allowing semantic variations in the form of substitution, insertion, and deletion. Our approach employs string matching based on suffix arrays that scales well with corpus size, and represents words as vectors, which underpin its semantic flexibility. To mitigate the combinatorial explosion induced by the semantic relaxation of queries, our method is built on two key algorithmic ideas: dynamic corpus-aware pruning and fast exact lookup enabled by a disk-aware design. We theoretically analyze the efficiency of the proposed method, indicating that it can mitigate exponential growth in the search space. Empirically, on FineWeb-Edu (Lozhkov et al., 2024) (1.4T tokens), it attains substantially lower search latency than existing methods: infini-gram (Liu et al., 2024), infini-gram mini (Xu et al., 2025), and SoftMatcha (Deguchi et al., 2025). As a practical application, our method uncovers benchmark contamination in training corpora that existing approaches miss, and it also benefits information retrieval and paraphrase detection. We also provide an online demo of fast, soft search across corpora in seven languages.

阅读与讨论 → 访问原文 →

11.

arXiv (CS.CL) 2026-06-17 DOI: arXiv:2606.17861

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

作者:

Tongxu Luo ↗Rongsheng Wang ↗Jiaxi Bi ↗Chenming Xu ↗Zhengyang Tang ↗Jianlong Chen ↗Juhao Liang ↗Ke Ji ↗Shuqi Guo ↗Yuhao Du ↗Fan Bu ↗Wenyu Du ↗…

Game generation is an emerging application of coding agents, requiring models to transform natural-language specifications into playable interactive systems. Unlike traditional coding tasks, game generation takes place within a game engine, where scripts, scenes, assets, rendering, and runtime interactions must jointly produce coherent gameplay. We formalize end-to-end game generation as the problem of producing a complete game artifact that realizes a specification through observable player-game interaction in a target environment. We argue that evaluating this setting requires three desiderata: Engine Grounding, Artifact Completeness, and Interactive Verification. We propose an interaction-grounded evaluation framework that assesses executable gameplay through replayed demonstrations and rubric-guided multimodal judging. We instantiate this framework as GameCraft-Bench, a benchmark comprising 140 Godot tasks across 15 game families. Evaluations of frontier coding agents show that end-to-end game generation remains highly challenging: the strongest agent achieves only 41.46%, and most agents score below 40%. Further analysis reveals that while agents often implement recognizable mechanics, they struggle to deliver complete games with sufficient content, functional visual feedback, and coherent presentation. See https://tongxuluo.github.io/gamecraft-bench-website for demos, code, and data.

阅读与讨论 → 访问原文 →

12.

arXiv (CS.CV) 2026-06-16 DOI: arXiv:2606.17020

FusionRS: A Large-Scale RGB-Infrared Remote Sensing Dataset for Dual-Modal Vision-Language Foundation Models

作者:

Jiaju Han ↗Ben Zhang ↗Xuemeng Sun ↗Qike Zhang ↗Yuxian Dong ↗Chengyin Hu ↗Fengyu Zhang ↗Yiwei Wei ↗Jiujiang Guo ↗

Remote sensing vision-language models have advanced Earth observation understanding, but most existing work remains centered on RGB imagery, leaving the complementary information in infrared data underexplored. Infrared images provide distinctive cues, including thermal intensity structures, object boundaries, and illumination-invariant scene features, which can enrich visual-language learning beyond conventional RGB observations. However, a large-scale RGB-infrared-text dataset for remote sensing vision-language modeling is still absent. To address this gap, we introduce FusionRS, the first large-scale RGB-infrared-text dataset designed for dual-modal vision-language learning in remote sensing. FusionRS is constructed by translating diverse public RGB remote sensing images into infrared-style counterparts, forming aligned RGB-IR image pairs. Each pair is associated with conventional scene captions and IR-aware captions that explicitly describe infrared-specific visual properties while preserving semantic content. Based on FusionRS, we train dual-modal vision-language foundation models for RGB-IR joint understanding. We first train CLIP-style models for RGB-IR-text alignment, and then fine-tune generative VLMs for dual-modal RGB-IR captioning. Experiments show that FusionRS improves RGB-IR alignment, infrared-to-text retrieval, and dual-modal captioning over RGB-only and non-IR-aware training settings. Ablation studies further verify that IR-aware captions are crucial for strengthening infrared-language alignment, highlighting the importance of modality-specific textual supervision for more scalable RGB-infrared remote sensing vision-language representation learning.

阅读与讨论 → 访问原文 →

13.

Nature Medicine 2026-06-17 DOI: HASH:ccdda145cb3f3cb69cff4cc13eb57bb3

Why large-scale randomized trials of live-attenuated shingles vaccination for dementia prevention are urgently needed

作者:

Pascal Geldsetzer ↗

In my view, we have never had as robust a body of evidence from observational data on an intervention for dementia as we do for live-attenuated shingles vaccination. Both a recent US National Institutes of Health expert workshop and an international expert consensus on Alzheimer’s disease drug repurposing identified large-scale randomized trials of shingles vaccination for dementia prevention as the crucial next step for the field.

阅读与讨论 → 访问原文 →

14.

arXiv (CS.AI) 2026-06-15 DOI: arXiv:2606.13692

An Agentic Retrieval Framework for Autonomous Context-Aware Data Quality Assessment

作者:

Hadi Fadlallah ↗Ibrahim Dhaini ↗Fatima Mubarak ↗Rima Kilany ↗

arXiv:2606.13692v1 Announce Type: cross Abstract: Data quality assessment is a critical prerequisite for effective data analytics and data-driven decision-making, yet it remains a challenging task due to the inherently context-dependent nature of data quality. Existing approaches often rely on static rules or manual assessment strategies, limiting their adaptability to diverse usage scenarios and constraining automation at scale. Recent advances in artificial intelligence, particularly large language models, offer new opportunities for automating data quality assessment, but raise concerns related to reliability, grounding, and execution safety. In this paper, we propose a unified agentic-retrieval framework for autonomous context-aware data quality assessment. The framework interprets natural-language descriptions of intended data usage, derives context-aware assessment strategies, and generates executable validation logic through a multi-agent workflow. To ensure operational reliability, the framework introduces a feasibility validation stage that evaluates the realism and executability of generated assessment specifications before execution, enabling iterative refinement when necessary. Accepted validation logic is executed deterministically to guarantee reproducible and auditable results. We implement the proposed framework as an end-to-end prototype and evaluate it across multiple usage scenarios applied to the same dataset. The results demonstrate that assessment outcomes adapt meaningfully to different intended uses, while feasibility-gated execution reduces unrealistic or non-executable rule generation. The proposed approach provides a practical foundation for deploying autonomous yet controlled data quality assessment in modern data-driven environments.

阅读与讨论 → 访问原文 →

15.

arXiv (CS.AI) 2026-06-24 DOI: arXiv:2606.24472

G$^3$VLA: Geometric inductive bias for Vision-Language-Action Models

作者:

Yue Peng ↗Yongzhe Zhao ↗Artur Habuda ↗Khuyen Pham ↗Yanheng Zhu ↗Tran Nguyen Le ↗Fares Abu-Dakka ↗Li Guo ↗

arXiv:2606.24472v1 Announce Type: cross Abstract: Vision-language-action (VLA) models have made rapid progress in generalist robot manipulation by harnessing semantic knowledge from pretrained vision-language backbones, but their visual tokens remain grounded in 2D image coordinates rather than the calibrated geometry of the robot's cameras – a mismatch especially pronounced in multi-camera setups, where views are coupled by known intrinsics and extrinsics yet processed as independent images. We propose G$^3$VLA, a camera-aware geometric module that injects calibrated structure into the visual-token stream of a pretrained VLA without altering its action space or imitation objective, combining intrinsic-conditioned ray embeddings, projective positional encoding (PRoPE), and bidirectional cross-view fusion. Geometric supervision is provided either from ground-truth point maps when available, or from confidence-gated $\pi^3$X teacher predictions, requiring no depth sensors or manual annotations. Instantiated on $\pi_0$, G$^3$VLA yields consistent gains across the LIBERO suites, RoboCasa24, RoboTwin2.0, and real-robot settings, with the largest improvements on spatially and object-sensitive tasks. We further validate on $\pi_{0.5}$ and GR00T 1.5, with results suggesting that geometric transfer is most effective when geometry-aware tokens have direct access to the action generation pathway. Our project page is at https://sites.google.com/view/g3vla

阅读与讨论 → 访问原文 →

16.

Nature (Science) 2026-06-23 DOI: HASH:7b1d363bf17c5120a614be115ea9c231

The halo effect: how academic hierarchy undermines peer review and enables fraud

作者:

Wenfeng Wang ↗

Letter to the Editor

阅读与讨论 → 访问原文 →

17.

arXiv (CS.LG) 2026-06-15 DOI: arXiv:2412.03716

A Water Efficiency Dataset for African Data Centers

作者:

Noah Shumba ↗Opelo Tshekiso ↗Pengfei Li ↗Giulia Fanti ↗Shaolei Ren ↗

arXiv:2412.03716v3 Announce Type: replace Abstract: Artificial intelligence (AI) computing and data centers consume large amounts of freshwater, both directly for cooling and indirectly for electricity generation. While most attention has been paid to developed countries such as the U.S., this paper presents the first-of-its-kind dataset that combines nation-level weather and electricity generation data to estimate water usage effectiveness for data centers in 41 African countries across five different climate regions. We also use our dataset to evaluate and estimate the water consumption of inference on two large language models (i.e., Llama-3-70B and GPT-4) in 11 selected African countries. Our estimates suggest that writing a 10-page report using Llama-3-70B could consume as much as {0.66 liters} of water, while the water consumption by GPT-4 for the same task may go up to about {59 liters}. For writing a medium-length email of 120-200 words, Llama-3-70B and GPT-4 could consume about {0.13 liters} and {2.9 liters} of water, respectively. All the numbers for generative model inference tasks are based on public information available in 2024, when we initially prepared the analysis. Since then, AI inference systems have improved substantially. For example, recent disclosures suggest that energy efficiency improved by more than 30x between May 2024 and May 2025. Accordingly, our 2024 estimates should be interpreted as historical reference values rather than as representative of current performance. Interestingly, given the same AI model, 9 of the 11 selected African countries consume less water than the global average, mainly because of lower water intensities for electricity generation.

阅读与讨论 → 访问原文 →

18.

arXiv (math.PR) 2026-06-24 DOI: arXiv:2606.24703

Scheduling jobs with unknown size distribution in a M/G/1 queue: the shifted empirical Gittins

作者:

Nicolas Gast ↗Bruno Gaujal ↗Adrien Obrecht ↗

arXiv:2606.24703v1 Announce Type: new Abstract: In this paper we consider a M/G/1 queue for which we want to minimize the expected response time. We show how to compute indices from $n$ samples of the job size distribution such that the corresponding index policy is asymptotically optimal as $n$ grows. This construction is based on a discretization of the bounded support of the job size distribution and a shift of the samples to their nearest discrete point to the right. We show that the Gittins index of the empirical distribution of these shifted samples is close to the Gittins index of the original distribution. This translates to the asymptotic optimality of the corresponding index policy for minimizing the expected response time. Numerical comparison with other approaches further confirm the efficiency of our approach.

阅读与讨论 → 访问原文 →

19.

Nature (Science) 2026-06-10 DOI: HASH:aac7517d04bfd73b8fd4bf97a3207e83

Mitochondria tethered to the nucleus secure its energy supply

作者:

Philippa Clarke ↗

Direct interactions between the cell’s powerhouses and nuclear pores might channel energy straight into the nucleus, fuelling cell division and differentiation. Direct interactions between the cell’s powerhouses and nuclear pores might channel energy straight into the nucleus, fuelling cell division and differentiation.

阅读与讨论 → 访问原文 →

20.

arXiv (CS.CV) 2026-06-16 DOI: arXiv:2606.16203

DynFS-MoE: Dynamic Functional-Structural Mixture-of-Experts for Post-Traumatic Epilepsy Diagnosis

作者:

Jun-En Ding ↗Spencer Chen ↗Henry Noren ↗Daniel Valdivia ↗Christine Yohn ↗Suhina Patel ↗Taylor Zink ↗Hai Sun ↗Feng Liu ↗

Post-traumatic epilepsy (PTE) is a severe complication of traumatic brain injury (TBI), yet early identification remains challenging due to the complex structural and functional alterations it induces in the brain. To address this, we propose a dynamic multimodal Mixture-of-Experts (MoE) framework that integrates functional and structural MRI through time-aware functional-structural encoding and class-conditioned expert routing. Within this framework, modality-specific and cross-modal experts learn complementary representations, while a Modality-Class MoE (MCoE) module dynamically dispatches expert weights according to each classification objective. Experimental results across three binary classification tasks demonstrate that the framework consistently outperforms static fusion baselines, and high-interpretability analyses further reveal meaningful region-of-interest (ROI) interactions. This dynamic multimodal expert framework effectively captures class-dependent brain interaction patterns and provides an interpretable approach for PTE diagnosis and risk stratification.

阅读与讨论 → 访问原文 →

21.

arXiv (CS.LG) 2026-06-12 DOI: arXiv:2606.13295

Simultaneous Latent Budget Trees for Stratified Classification

作者:

Simultaneous Latent Budget Trees for Stratified Classification Cristian Buoncompagni ↗Stefano Pellegrino ↗Giulia Vannucci ↗Roberta Siciliano ↗

arXiv:2606.13295v1 Announce Type: cross Abstract: In the era of Explainable Artificial Intelligence, there is a renewed focus on single trees for their ease of interpretation. This paper introduces Simultaneous Latent Budget Trees, a probabilistic machine learning framework for classification trees in the presence of a stratification factor such as a temporal, spatial, or demographic variable, acting as a control variable or potential confounder. Standard tree growth procedures are not designed to optimize a conditional split rule. A model-based split rule is proposed in which child nodes are interpreted as latent components of a simultaneous mixture model, such as the Simultaneous Latent Budget Model and its constrained versions, fitted to the parent node. Mixing parameters drive the observations, differently for each group, to the child nodes whereas latent budgets parameters update the response classes profile of each level of the control variable. Parameters are estimated by least squares considering a neural network perspective of the model. An informative tree structure can be interactively visualized with interpretation aids on the node and the paths, including visual pruning and decision tree selection procedure. Suitable measures are proposed to handle an unbalanced response class distribution. The proposed methodology is applied to investigate gender-related differences in disease progression of Amyotrophic Lateral Sclerosis. The SLBT library with the various tree-based algorithms is available in the linked GitHub repository.

阅读与讨论 → 访问原文 →

22.

arXiv (CS.AI) 2026-06-16 DOI: arXiv:2606.15376

CoAgent: Concurrency Control for Multi-Agent Systems

作者:

Hongtao Lyu ↗Dingyan Zhang ↗Mingyu Wu ↗Xingda Wei ↗Haibo Chen ↗

arXiv:2606.15376v1 Announce Type: cross Abstract: Multi-agent LLM systems – coding agents, devops agents, document agents – now routinely run several agents in parallel against the same git tree, Kubernetes cluster, or document. As soon as two of them mutate shared state, they enter the regime classical concurrency control has studied for decades, but classical mechanisms fit LLM agents poorly. A single agent transaction spans minutes of inference, read sets are broad and opaque rather than statically inferable, and the live state agents act on admits neither fork nor buffer, so writes take effect the moment they execute. Locks block long inference intervals; OCC abort-and-retry discards minutes of work on every conflict. This paper builds concurrency control on a capability classical transactions lack: the LLM inside each agent can judge whether a conflicting write invalidates its plan, and can repair exactly the operations that depended on it. Control therefore turns advisory: the runtime informs, the agent repairs. Our protocol, MTPO (Monotonic Trajectory Pre-Order), fixes a serialization order at launch, serves each read the order-filtered value, and applies writes speculatively in place; a one-way notification asks an affected reader to re-judge and patch its plan, while the framework mechanically undoes and reorders misplaced writes through the saga-style inverse each tool registers in advance. At quiescence the run is serializable in the pre-decided order. We realize MTPO as CoAgent, toolcall middleware whose privileged ToolSmith grows footprint-declared, undoable tools online. On ten contended workloads, CoAgent stays within 5\% of serial correctness at a $1.4\times$ speedup and near-serial token cost, where 2PL and OCC surrender nearly all concurrency gains; on a bash-only target system, it grows a 25-tool library online and lifts the task pass rate from 45/71 to 63/71 at $0.80\times$ the time and $0.86\times$ the cost.

阅读与讨论 → 访问原文 →

23.

arXiv (CS.AI) 2026-06-16 DOI: arXiv:2606.14935

PrologMCP: A Standardized Prolog Tool Interface for LLM Agents

作者:

Agnieszka Mensfelt ↗Adarsh Prabhakaran ↗Adrian Haret ↗Vince Trencsenyi ↗Kostas Stathis ↗

arXiv:2606.14935v1 Announce Type: new Abstract: Frontier reasoning-tuned language models still fail on deductive tasks at depth, and the cost of improved performance through extended internal reasoning scales poorly. Symbolic delegation offers a complementary route: a language model translates the problem, while a solver performs the inference. However, current autoformalization pipelines for logic programming are typically bespoke integrations tied to particular tasks or agents. We introduce PrologMCP, a task-agnostic, open-source server that exposes Prolog as a stateful tool through the Model Context Protocol (MCP). Its compact tool interface, structured error reporting, and per-session isolation make the translate-run-inspect-repair loop a reusable primitive for MCP-capable agents. We evaluate a formalizer agent enhanced with PrologMCP against standard and reasoning LLMs (Claude Sonnet 4.6, GPT-4.1, and o4-mini) on two subsets of PARARULE-Plus: a general-purpose sample and a more challenging one targeting a specific failure mode of natural-language reasoning. On the general sample, the formalizer matches or exceeds reasoning LLMs (accuracy 1.00 vs.\ 1.00 / 0.998), with the largest gains over standard models (0.762 for GPT-4.1). On the challenging subset, the formalizer remains near-perfect (1.00 / 0.99) while reasoning LLMs drop to 0.95 / 0.94. These results suggest that delegating inference to Prolog via MCP is a robust and inspectable alternative to extended natural-language reasoning.

阅读与讨论 → 访问原文 →

24.

arXiv (CS.CV) 2026-06-12 DOI: arXiv:2606.13376

MoVerse: Real-Time Video World Modeling with Panoramic Gaussian Scaffold

作者:

Yang Zhou ↗Ziheng Wang ↗Yuqin Lu ↗Haofeng Liu ↗Jun Liang ↗Shengfeng He ↗Jing Li ↗

We present MoVerse, a real-time video world model that creates an interactively navigable scene from a single narrow-field-of-view image. This setting is challenging because the input observes only a small fraction of the environment, while interactive roaming requires a complete surrounding world, persistent geometry, controllable camera motion, and temporally coherent high-fidelity observations. MoVerse addresses this problem by separating world construction from observation rendering. It first expands the input into a gravity-aligned 360$^\circ$ panorama with topology-aware diffusion, closing the missing field of view before 3D reasoning. It then lifts the panorama into a persistent 3D Gaussian scaffold using panoramic geometry-aware residual prediction, yielding a dense and directly renderable spatial memory. Finally, a Gaussian-conditioned video renderer translates scaffold renderings along user-specified camera trajectories into photorealistic video. To make this renderer practical for interaction, we train a bidirectional diffusion teacher for high-quality conditional rendering and distill it into a causal autoregressive student for bounded-latency streaming. This design combines the controllability and long-range consistency of explicit 3D representations with the perceptual quality of generative video models. MoVerse supports real-time scene roaming at 8~FPS on a single NVIDIA RTX~4090 GPU, demonstrating a practical path toward single-image world creation with interactive video output.

阅读与讨论 → 访问原文 →

25.

medRxiv (Medicine) 2026-06-10 DOI: HASH:b4f09633d02da1de5a3f984adb611950

Towards the Virtual Amyotrophic Lateral Sclerosis Patient: Inferring Cortical Excitability through Whole-Brain Dynamical Modeling

作者:

Angiolelli ↗Demuru ↗Lopez ↗E. T ↗Hashemi ↗Ziaeemeh ↗Rabuffo ↗Trojsi ↗Granata ↗Tafuri ↗De Luca ↗Gallo ↗…

Amyotrophic lateral sclerosis (ALS) is increasingly recognized as a multisystem neurodegenerative disorder in which motor-neuron degeneration is accompanied by widespread alterations in cortical dynamics. Among its most reproducible neurophysiological signatures is cortical hyperexcitability, yet how this local excitability imbalance shapes distributed whole-brain activity remains poorly understood. Here, we combined source-reconstructed resting-state MEG data, tractography-informed whole-brain modeling, and simulation-based inference to investigate whether ALS-related alterations in large-scale brain dynamics can be mechanistically explained by changes in cortical excitability. First, we characterized empirical brain dynamics using complementary features spanning regional activity amplitude and variability, functional connectivity, and avalanche-based metrics. These analyses revealed significant alterations in ALS patients relative to healthy controls, as well as associations with clinical impairment and disease staging. To mechanistically interpret these changes, we employed a reduced Wong-Wang whole-brain model in which local recurrent excitation modulates emergent large-scale neural dynamics. Simulations showed that increasing excitability systematically reproduced the empirical dynamical signatures observed in ALS. We then applied a simulation-based inference framework to estimate latent excitability parameters directly from empirical observations. Whole-brain model inversion revealed increased excitability in ALS patients compared with controls. The recovered excitability parameter was associated with disease staging, supporting its clinical relevance as a model-derived descriptor of ALS progression. Finally, by extending the model to estimate frontal and non-frontal excitability separately, we found that ALS-related alterations were predominantly associated with increased frontal excitability, whereas non-frontal regions appeared comparatively less affected. The recovered parameters related to disease staging. Together, these findings provide a mechanistic framework linking altered large-scale brain dynamics in ALS to selective cortical hyperexcitability, explaining how local excitability changes can give rise to global network reorganization. More broadly, they show how computational model inversion can recover latent multiscale pathophysiological processes from empirical neural recordings, offering a non-perturbative alternative to complex experimental paradigms typically required to causally probe local-to-global mechanisms.

阅读与讨论 → 访问原文 →

探索全球前沿学术脉络