Academic Intelligence · Curated Daily

Explore the Frontiers of Global Research

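AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefings.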

01.
arXiv (CS.AI) 2026-06-17

FlowRAG: Synergizing Explicit Reasoning via Frequency-Aware Multi-Granularity Graph Flow

arXiv:2606.17856v1 Announce Type: new Abstract: Graph-based retrieval-augmented generation (GraphRAG) is effective for knowledge-intensive and multi-hop query tasks; however, many existing methods primarily seed entity-based graphs and rely on implicit semantic relevance propagation. This often (i) under-retrieves when user queries are abstract and semantically sparse at the entity level, and (ii) suffers from brittle multi-hop reasoning, where noisy activations can derail entity-to-entity transitions and corrupt the inferred relation chain, yielding unreliable conclusions. To this end, we propose FlowRAG, a semantic-aware retrieval framework that improves both semantic recall and explicit reasoning. Specifically, FlowRAG constructs a quad-level heterogeneous graph over passages, summaries, sentences, and entities, where summary nodes serve as a coarse semantic hub. At retrieval time, a dual-granularity activation module combines summary–query alignment with sentence-level matching to robustly activate relevant entities under paraphrase and abstraction. We then introduce a frequency-aware weighted flow module that routes relevance through entity–passage links weighted by within-passage term frequency, pruning noisy connections and extracting high-confidence reasoning paths as an explicit logic skeleton for generation. Extensive experiments show that FlowRAG obtains state-of-the-art performance on complex reasoning benchmarks.
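The frequency-aware flow step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `passage_scores`, the `min_weight` pruning threshold, and the toy passages are all assumptions made for illustration. It only shows the general idea of routing entity activations to passages through links weighted by within-passage term frequency and dropping weak links.

```python
from collections import Counter


def passage_scores(entity_activation, passages, min_weight=0.05):
    """Route entity relevance to passages over term-frequency-weighted links.

    entity_activation: dict mapping entity string -> activation score.
    passages: dict mapping passage id -> list of tokens.
    min_weight: hypothetical pruning threshold for weak entity-passage links.
    """
    scores = {}
    for pid, tokens in passages.items():
        counts = Counter(tokens)
        total = len(tokens) or 1
        score = 0.0
        for entity, activation in entity_activation.items():
            weight = counts[entity] / total  # within-passage term frequency
            if weight >= min_weight:         # prune weak (likely noisy) links
                score += activation * weight
        scores[pid] = score
    return scores


# Toy data, invented for this example only.
passages = {
    "p1": ["curie", "won", "nobel", "prize", "curie"],
    "p2": ["nobel", "prize", "ceremony", "stockholm"],
}
activation = {"curie": 1.0, "nobel": 0.5}
scores = passage_scores(activation, passages)
```

In this toy case the passage that mentions the strongly activated entity twice receives the higher score, which is the behaviour the abstract attributes to frequency weighting.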

02.
arXiv (CS.CV) 2026-06-11

Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks

With the development of video understanding, there is a proliferation of tasks for clip-level temporal video analysis, including temporal action detection (TAD), temporal action segmentation (TAS), and generic event boundary detection (GEBD). While task-specific video understanding models have exhibited outstanding performance in each task, a unified framework capable of addressing multiple tasks simultaneously is still lacking, even though it is a promising direction for the next generation of AI. To this end, we propose a single unified framework, named Temporal2Seq, that formulates the output of these temporal video understanding tasks as a sequence of discrete tokens. With this unified token representation, Temporal2Seq can train a generalist model within a single architecture on different video understanding tasks. In the absence of multi-task learning (MTL) benchmarks, we compile a comprehensive co-training dataset by borrowing the datasets from TAD, TAS, and GEBD tasks. We evaluate our Temporal2Seq generalist model on the corresponding test sets of the three tasks, demonstrating that Temporal2Seq can produce reasonable results on various tasks and achieve advantages compared with single-task training on this framework. We also investigate the generalization performance of our generalist model on new datasets from different tasks, where it outperforms the task-specific models.
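As a rough illustration of what "a sequence of discrete tokens" can mean for temporal outputs, the sketch below quantizes segment boundaries into time bins and appends a class token. The vocabulary layout (time bins first, class tokens after), the bin count, and the function names are assumptions for this example and are not taken from the paper.

```python
def segments_to_tokens(segments, duration, num_bins=100):
    """Encode (start, end, label) segments as a flat list of integer tokens.

    Time tokens occupy ids [0, num_bins); class tokens start at num_bins.
    This layout is a hypothetical choice for the sketch.
    """
    tokens = []
    for start, end, label in segments:
        for t in (start, end):
            # Quantize the timestamp into one of num_bins uniform bins.
            bin_index = min(int(t * num_bins / duration), num_bins - 1)
            tokens.append(bin_index)
        tokens.append(num_bins + label)
    return tokens


def tokens_to_segments(tokens, duration, num_bins=100):
    """Decode tokens back to (start, end, label), up to quantization error."""
    segments = []
    for i in range(0, len(tokens), 3):
        start_bin, end_bin, class_token = tokens[i:i + 3]
        segments.append((
            start_bin * duration / num_bins,
            end_bin * duration / num_bins,
            class_token - num_bins,
        ))
    return segments


tokens = segments_to_tokens([(2.0, 5.5, 3)], duration=10.0)
decoded = tokens_to_segments(tokens, duration=10.0)
```

A detection, a segmentation, and a boundary list can all be written in this triplet form, which is what makes a single sequence model applicable to several tasks.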

03.
medRxiv (Medicine) 2026-06-15

Toward a National Registry for Inborn Errors of Immunity in Peru: A Qualitative Implementation Study

Background: Peru lacks an integrated information system for patients with Inborn Errors of Immunity (IEI). Although disease registries are essential tools for data management and health planning, their success depends on implementation science approaches that account for local contextual factors. This study reports Phase I of a three-phase mixed-methods implementation project to design and develop a national IEI registry. Methods: Phase I consisted of a phenomenological qualitative study exploring stakeholder perspectives. Semi-structured focus groups and in-depth interviews were conducted with 29 key stakeholders across four groups: policy-makers, clinical experts, end-users (immunologists, residents, allied health personnel), and patient organization representatives. Interviews followed a guide structured around four a priori domains (structure, navigation, feasibility, and perception of existing systems). Discussions were conducted in Spanish, audio-recorded, transcribed verbatim, and coded using ATLAS.ti. A hybrid thematic analysis combining deductive and inductive coding was performed. Data elements proposed for the registry were triangulated with qualitative findings. Results: Thirty-six initial codes were consolidated into 15 categories, which were further integrated into four overarching themes conceptualized as pathways toward intention to use: (1) Environment, where governance, regulatory backing, and sustainable financing were identified as key enablers, while limited interoperability emerged as a structural barrier; (2) Technical Dimension, emphasizing usability, alignment with clinical workflow, and a hierarchical data architecture (demographic, clinical, therapeutic); (3) Users, highlighting clinical leadership, protected time, digital readiness, and perceived usefulness as stronger motivators than financial incentives; and (4) Patients, underscoring data protection, transparency, trust, and advocacy as essential for legitimacy and sustainability. 
Conclusions: A national IEI registry in Peru is perceived as necessary and feasible if implemented with strong regulatory foundations, interoperable design, robust data security, and user-centered architecture. These findings informed the development of an initial functional prototype and the operational plan for Phase II, focused on usability evaluation.

04.
arXiv (CS.CV) 2026-06-17

SegTME-UNI2: A Foundation Model-Based Framework for Generalisable Multiclass Cell Segmentation and LLM-Driven Tumour Microenvironment Characterisation in Histopathology

Characterising the tumour microenvironment (TME) from routine H&E-stained histology images requires simultaneous cell segmentation, feature extraction, and interpretable clinical reporting. We present SegTME-UNI2, a unified framework addressing these requirements. Its core is UNI2-UPERHOVER, a dual-head segmentation model pairing the UNI2-H pathology foundation model (ViT-Giant, pretrained on >100M tiles from 100K slides) with two parallel UperNet decoders: one for six-class semantic segmentation and one for horizontal-vertical gradient regression enabling watershed-based nuclear instance separation. To address the lack of pixel-level annotations in large real-world repositories, UNI2-UPERHOVER undergoes a three-stage progressive pseudo-label curriculum. Each stage trains a fresh model without weight transfer, driving improvement entirely via increased pseudo-label quality. Stage 1 uses human-annotated PanNuke (7,901 images, 189,744 nuclei, 0.25 µm/pixel). Stage 2 uses entropy-filtered pseudo-labels from the Stage 1 model on 271,711 TCGA-UT scale-0 patches (0.5 µm/pixel). Stage 3 uses pseudo-labels from the Stage 2 model on all 1,608,060 TCGA-UT patches across six resolution scales (0.5-1.0 µm/pixel). Segmentation outputs feed a structured TME feature extraction pipeline computing 20+ per-patch compositional, morphological, spatial entropy, and intercellular distance metrics. These are encoded as JSON and passed to a fine-tuned NVIDIA BioNeMo GPT model to generate clinically interpretable TME narratives. Preliminary validation on held-out PanNuke and TCGA-UT partitions demonstrates framework feasibility and internal consistency. The pseudo-labelled TCGA-UT dataset and UNI2-UPERHOVER checkpoint are publicly released to support large-scale TME profiling and spatial biology research.

05.
arXiv (CS.AI) 2026-06-18

MIDS: Detecting Stealthy Masquerade and Tampering Attacks on CAN Bus via Bidirectional Mamba

arXiv:2606.18599v1 Announce Type: cross Abstract: The Controller Area Network (CAN) protocol is the primary communication standard for Electronic Control Units (ECUs) in modern vehicles, but its lack of encryption and authentication exposes it to a range of security threats. Existing intrusion detection systems are largely tuned to fabrication-style attacks (DoS, fuzzing, ID spoofing realised by frame injection), in which detection signals such as per-ID inter-arrival statistics are readily available. We instead address the harder masquerade setting, in which an internal adversary substitutes a legitimate frame in-situ at its original transmission slot, preserving traffic periodicity and rendering traffic-statistic defences ineffective. We propose the Mamba Intrusion Detection System (MIDS), an innovative dual-stream framework that processes CAN identifiers and payloads in parallel and reconstructs their joint temporal semantics through bidirectional selective state-space modelling. To evaluate MIDS, we collected over 100 million CAN frames from a physical Tesla Model 3 across three driving regimes and synthesised 54 masquerade attack variants spanning ID-only, data-only, and combined modifications. MIDS attains an F1 of 96.94% on this dataset, exceeding the strongest reproducible baseline by more than 8 percentage points, while sustaining a 1.147 ms single-window inference latency – ample headroom for real-time onboard deployment. To verify generalisation, we further evaluate MIDS on four public benchmarks (ROAD, CrySyS, OTIDS, CT&T) covering both masquerade and injection scenarios; MIDS attains F1 from 93.70% to 99.61%, outperforming the strongest of eight reproduced baselines by up to 13.94 percentage points under a unified 5-fold protocol.

06.
arXiv (CS.CL) 2026-06-11

ChartFI: Benchmarking Faithfulness and Insightfulness of Chart Descriptions from Multimodal Large Language Models

Chart descriptions are essential for accessibility, cross-modal retrieval, and assisting readers in extracting insights from complex visualizations. As multimodal large language models (MLLMs) are increasingly adopted for automated chart description generation, a critical question arises: how faithfully and insightfully do these models actually describe charts? Current benchmarks fall short on two fronts: existing datasets consist of simple, homogeneous charts paired with shallow, fact-enumerating descriptions; and prevailing metrics fail to capture the multi-faceted nature of description quality. To address these gaps, we present the Chart Faithfulness and Insightfulness Benchmark (ChartFI-Bench). We first summarize four dimensions that characterize high-quality chart descriptions: factual accuracy, salient feature emphasis, domain-informed guidance, and chart-text complementarity. Guided by these dimensions, we construct a high-quality benchmark comprising 896 chart-description pairs, which feature visually complex charts and semantically rich descriptions. Furthermore, we design four aligned evaluation metrics – Faithfulness, Coverage, Informativeness, and Acuity – to systematically assess the quality of descriptions across these dimensions. Experiments conducted on mainstream MLLMs demonstrate the effectiveness of the proposed framework and reveal common weaknesses among existing models.

07.
arXiv (CS.LG) 2026-06-19

Stabilizing Bandits using Regularization: Precise Regret and A Quantitative Central Limit Theorem

arXiv:2603.10184v2 Announce Type: replace-cross Abstract: Statistical inference with bandit data presents fundamental challenges owing to adaptive sampling, which violates the independence assumptions underlying classical asymptotic theory. Recent work has identified stability (Lai and Wei, 1982) as a sufficient condition for valid inference under adaptivity. This paper first provides a refined stability condition, stated in terms of the iterates of an online algorithm, and shows that a large class of regularized stochastic-mirror-descent-style algorithms satisfy it. This refined condition allows us to strengthen the asymptotic results of Lai and Wei (1982) in several ways. First, we derive a non-asymptotic Berry–Esseen bound for the empirical reward estimates under adaptive sampling. Second, we derive matching non-asymptotic upper and lower bounds on the regret of the proposed algorithm, yielding a precise characterization of its regret. Third, we show that these regularized algorithms preserve asymptotic normality and valid inference under a prescribed level of adversarial corruption. Finally, we show that regularization is necessary rather than incidental: Lai–Wei stability is incompatible with the optimal $O(\sqrt{T})$ regret rate – the rate attained by unregularized algorithms such as EXP3 – so that a controlled, polylogarithmic inflation in regret is the price of valid inference.

08.
arXiv (CS.LG) 2026-06-24

Natural Identifiers for Privacy and Data Audits in Large Language Models

arXiv:2606.24408v1 Announce Type: new Abstract: Assessing the privacy of large language models (LLMs) presents significant challenges. In particular, most existing methods for auditing differential privacy require the insertion of specially crafted canary data during training, making them impractical for auditing already-trained models without costly retraining. Additionally, dataset inference, which audits whether a suspect dataset was used to train a model, is infeasible without access to a private non-member held-out dataset. Yet, such held-out datasets are often unavailable or difficult to construct for real-world cases since they have to be from the same distribution (IID) as the suspect data. These limitations severely hinder the ability to conduct scalable, post-hoc audits. To enable such audits, this work introduces natural identifiers (NIDs) as a novel solution to the above-mentioned challenges. NIDs are structured random strings, such as cryptographic hashes and shortened URLs, naturally occurring in common LLM training datasets. Their format enables the generation of unlimited additional random strings from the same distribution, which can act as alternative canaries for audits and as same-distribution held-out data for dataset inference. Our evaluation highlights that indeed, using NIDs, we can facilitate post-hoc differential privacy auditing without any retraining and enable dataset inference for any suspect dataset containing NIDs without the need for a private non-member held-out dataset.
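The key property claimed for NIDs, that their format allows fresh strings to be drawn from the same distribution, can be sketched for one simple case. The example below assumes the identifiers of interest are 40-character lowercase hexadecimal strings (the shape of a SHA-1 digest); the regular expression, the function names, and the sample text are illustrative assumptions, not the paper's tooling.

```python
import re
import secrets

# Assumed NID format for this sketch: 40 lowercase hex characters.
HEX40 = re.compile(r"\b[0-9a-f]{40}\b")


def find_hex_ids(text):
    """Return all 40-character hex identifiers that occur in the text."""
    return HEX40.findall(text)


def sample_hex_ids(n, length=40):
    """Draw n fresh uniformly random hex strings with the same format.

    These were never seen in training, so they could serve as held-out
    non-members or as stand-in canaries in an audit.
    """
    return [secrets.token_hex(length // 2) for _ in range(n)]


member_id = "a1b2c3d4" * 5  # invented identifier, 40 hex characters
document = f"commit {member_id} fixed the parser bug"
members = find_hex_ids(document)
non_members = sample_hex_ids(3)
```

An audit would then compare a model's behaviour on `members` against its behaviour on `non_members`; that comparison step is outside the scope of this sketch.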

09.
arXiv (CS.CL) 2026-06-16

Agentic Retrieval and Reinforcement Learned Equation Chains: A Controlled Generation Framework for Complex and Novel Physics Word Problems

Generating high-quality Physics Word Problems (PWPs) that are novel, complex, and solvable remains a challenging and underexplored problem in educational content generation. Existing approaches, many adapted from Math Word Problem (MWP) generation, often produce ambiguous, unsolvable, or structurally simple questions with limited linguistic diversity. We introduce ARVRE (Agentic Retrieval Value Reinforced Equation-chain), a two-stage framework for generating diverse and mathematically valid PWPs. In the first stage, a form of offline temporal-difference learning is used to construct valid chains of physics equations, while an agentic retrieval-augmented generation (RAG) framework dynamically selects topic-specific concepts and vocabulary. This design enables explicit control over problem structure and difficulty. In the second stage, a Large Language Model (LLM) converts the equation chain and retrieved concepts into a natural-language physics question. By grounding generation in valid equation chains, our method preserves mathematical correctness while promoting linguistic diversity and contextual richness. Human and automated evaluations demonstrate that ARVRE generates PWPs that are more complex, novel, and solvable than those produced by existing approaches. These results highlight the potential of combining reinforcement learning, retrieval, and LLMs for reliable generation of educational physics content.

10.
arXiv (CS.AI) 2026-06-15

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

arXiv:2606.14693v1 Announce Type: cross Abstract: Cooperative multi-objective multi-agent reinforcement learning (MOMARL) models team decision making under multiple, potentially conflicting objectives. In this setting, conflicts arise not only across objectives but also across agents with different observations, roles, and contributions. We propose Preference Coordinated Multi-agent Policy Optimization (PCMA), which learns coordinated agent-specific preferences to enable complementary trade-offs among agents. Theoretically, we formulate cooperative MOMARL as a team-optimal game and show that, under suitable conditions, preference diversity can induce team improvement through a first-order improvement decomposition. Experiments on multiple cooperative MOMA environments and a practical traffic-control scenario show that PCMA improves both performance and trade-off coordination.

11.
arXiv (quant-ph) 2026-06-15

Quantum Horizon: An evaluation of quantum computing as a threat to Bitcoin and Ethereum

arXiv:2606.14484v1 Announce Type: new Abstract: Quantum computing poses a real, broad-based, but bounded and substantially mitigable threat to Bitcoin and Ethereum. We separate the two quantum algorithms that public discussion routinely conflates: Shor's algorithm breaks the elliptic-curve signatures (ECDSA over secp256k1, BLS over BLS12-381) that authorize spending, whereas Grover's algorithm does not meaningfully threaten proof-of-work mining, which is protected by a merely quadratic speedup, fault-tolerant per-operation costs, a square-root parallelization wall, and difficulty adjustment. Folding hardware scaling, the falling resource requirement, a fault-tolerance readiness lag, and expert surveys into a single Monte-Carlo forecast yields a wide, bimodal arrival distribution for a cryptographically relevant quantum computer: about a one-in-six chance by 2035, near 30% by 2040, and about 60% by 2050. Exposure is concentrated and mostly migratable: of Bitcoin's roughly six million quantum-exposed coins only about 2.3 million are irreducibly at risk, while 50 to 65% of Ether sits at key-revealed accounts that can adopt post-quantum signatures. A timely migration beats even an optimistic 2035 machine, so the binding constraint is governance, not technology. A survey of the top twenty cryptocurrencies finds none fully post-quantum. Reproducible models accompany every quantitative claim.
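The "merely quadratic speedup" attributed to Grover's algorithm can be made concrete with a small calculation. The sketch assumes the idealized textbook setting of one marked item in an unstructured search space of size N = 2^bits, where about (π/4)·√N Grover iterations are needed versus an expected N/2 classical guesses. It ignores the fault-tolerance overheads, parallelization limits, and difficulty adjustment the abstract lists, and the function names are hypothetical.

```python
import math


def grover_iterations(search_bits):
    """Approximate Grover iterations for one marked item among 2**search_bits."""
    return math.floor(math.pi / 4 * math.sqrt(2 ** search_bits))


def classical_expected_guesses(search_bits):
    """Expected number of guesses for classical brute-force search."""
    return 2 ** search_bits // 2


def speedup_factor(search_bits):
    """Ratio of classical to quantum query counts in this idealized model."""
    return classical_expected_guesses(search_bits) / grover_iterations(search_bits)


factor_64 = speedup_factor(64)
factor_128 = speedup_factor(128)
```

The query count still grows as 2^(bits/2), so doubling the number of bits squares the work even for the quantum searcher; this is the sense in which the advantage is bounded.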

12.
arXiv (quant-ph) 2026-06-12

Achieving Heisenberg limit under noisy conditions with quantum Zeno dynamics and dynamical decoupling

arXiv:2606.13205v1 Announce Type: new Abstract: Quantum Zeno dynamics (QZD) and dynamical decoupling (DD) are useful tools that enable the effective suppression of noise in quantum systems. We consider the problem of when (i) noise can be suppressed and (ii) Heisenberg limit (HL) can be achieved in quantum metrology, and prove necessary and sufficient conditions for when QZD and DD are useful for achieving these two goals. We also show that in the Markovian regime, there are scenarios where preventing errors using QZD/DD may enable HL to be achieved where current QEC methods may not. Finally, we demonstrate that the combination of both techniques can allow individually imperfect QZD and DD strategies to saturate HL.

13.
arXiv (CS.AI) 2026-06-11

APPO: Agentic Procedural Policy Optimization

arXiv:2606.12384v1 Announce Type: cross Abstract: Recent advances in agentic Reinforcement Learning (RL) have substantially improved the multi-turn tool-use capabilities of large language model agents. However, most existing methods assign credit over coarse heuristic units, such as tool-call boundaries or fixed workflows, making it difficult to identify which intermediate decisions influence downstream outcomes. In this work, we study agentic RL from two perspectives: where to branch and how to assign credit after branching. Our pilot analysis shows that influential decision points are broadly distributed throughout the generated sequence rather than concentrated at tool calls, while token entropy alone does not reliably reflect their impact on final outcomes. Motivated by these observations, we propose Agentic Procedural Policy Optimization (APPO), which shifts branching and credit assignment from coarse interaction units to fine-grained decision points in the sequence. APPO selects branching locations using a Branching Score that combines token uncertainty with policy-induced likelihood gains of subsequent continuations, enabling more targeted exploration while filtering out spurious high-entropy positions. It further introduces procedure-level advantage scaling to better distribute credit across branched rollouts. Experiments on 13 benchmarks show that APPO consistently improves strong agentic RL baselines by nearly 4 points, while keeping efficient tool-calls and maintaining behavior interpretability.

14.
PLOS Medicine 2026-05-27

Sequential chemo-immunotherapy followed by standard versus reduced thoracic radiotherapy for older and/or frail stage III non-small-cell lung cancer: A randomized open-label cohort trial

Authors:

Wei-Xiang Qi, Shuyan Li, Mengdi Wang, Huan Li, Feifei Xu, Lei Yao, Biao Yu, Linlin Chen, Gang Cai, Cheng Xu, Xianwen Sun, Zhiyao Bao, Jiayi Chen, Yi Xiang, Shengguang Zhao

Background: The appropriateness of concurrent chemoradiotherapy (cCRT) for older or clinically vulnerable stage III unresectable non-small-cell lung cancer (NSCLC) patients remains contentious. Furthermore, the survival implications of de-escalating thoracic radiotherapy (RT) intensity in this population have not been conclusively elucidated. Methods and findings: We conducted a phase II randomized, open-label, two-cohort (non-comparative) trial at a tertiary hospital in China (NCT05557552). Between September 30, 2022 and April 30, 2024, we enrolled 56 older and/or frail patients with stage III NSCLC who were ineligible for cCRT. The primary endpoint was the 1-year progression-free survival (PFS) rate estimated using the Kaplan–Meier method. Secondary endpoints included objective response rate (ORR), overall survival (OS), and safety. In the intention-to-treat (ITT) set, which included all 56 randomized patients who received at least one dose of study treatment, the 1-year PFS was 84.3% (95% confidence interval [CI] [70.3%, 98.3%]) in the standard RT group and 70.7% (95% CI [54.3%, 87.1%]) in the reduced RT group. In the per-protocol set (53 patients), the 1-year PFS was 82.9% (95% CI [68.9%, 98.8%]) in the standard RT group and 73.4% (95% CI [58.3%, 92.4%]) in the reduced RT group, with a median follow-up of 24 months. Among 56 patients in the safety analysis set, 71.4% of patients experienced grade 3/4 adverse events (AEs) in the standard RT group and 53.6% in the reduced RT group. One patient (3.6%) in the reduced RT group and three patients (10.7%) in the standard RT group experienced grade 5 AEs. The main limitations are the non-comparative design, small sample size, and lack of power to establish non-inferiority or superiority.
Conclusion: The current study suggests that reduced RT combined with sequential chemo-immunotherapy might be feasible for older/frail patients intolerant to cCRT, showing numerically similar survival outcomes. These exploratory findings warrant confirmation in larger, adequately powered randomized trials. Trial registration: The trial was registered on ClinicalTrials.gov on Sep 30, 2022 (NCT05557552).
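For readers unfamiliar with the Kaplan–Meier method named as the basis of the primary endpoint, the following is a minimal sketch of the product-limit estimator. The follow-up times and event flags are invented toy values and have no relation to the trial data; the function name is likewise an assumption.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: 1 if progression or death was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)
        if observed:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored patients both leave the risk set
        i += leaving
    return curve


# Toy data in months: five hypothetical patients, two of them censored.
curve = kaplan_meier([3, 5, 5, 8, 12], [1, 0, 1, 1, 0])
```

A 1-year PFS rate corresponds to reading the curve's value at 12 months; censored patients contribute to the risk set until they leave it, which is why the estimate differs from a simple proportion.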

15.
arXiv (CS.CV) 2026-06-16

Conditional Multi-Event Temporal Grounding in Long-Form Video

Multimodal large language models have made rapid progress in video temporal grounding, yet real-world applications routinely require localizing every event that satisfies compositional temporal and spatial conditions. Existing benchmarks fall short: they localize only a single moment per query, count without temporal conditions, or treat grounding and counting as disjoint tasks. We introduce CoMET-Bench for Conditional Multi-Event Temporal Grounding in long-form video, comprising 2789 queries over 600 videos averaging 33.8 minutes across five real-world domains, with each query composed from 4 temporal conditions, 3 spatial conditions, and a dedicated negative-query subset. We further propose a unified evaluation protocol jointly measuring counting, grounding, and negative-query recognition, including a new Rejection-F1 metric that prevents trivial gaming by lazy "always-empty" models. Benchmarking a broad suite of MLLMs, agent-based, and grounding-specialized methods reveals that existing approaches remain far from solving this task. Building on these findings, we propose CoMET-Agent, a training-free agentic framework that reformulates the task as structured search-and-aggregate, improving F1@0.5 by 6.1% over GPT-5 purely through structural reasoning. Failure analysis further surfaces three open directions: fine-grained entity tracking, position-uniform retrieval, and causal event pairing.
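To make the F1@0.5 figure concrete, here is a sketch of a common way to score multi-event temporal grounding: a predicted interval counts as correct when its temporal IoU with a not-yet-matched ground-truth interval is at least 0.5. The greedy matching rule and the convention for empty inputs are assumptions of this example; the benchmark's exact protocol, including its Rejection-F1 metric, is not reproduced here.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(predicted, ground_truth, threshold=0.5):
    """F1 with greedy one-to-one matching at a temporal IoU threshold."""
    if not predicted and not ground_truth:
        return 1.0  # assumed convention: an empty answer to an empty query is correct
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        best = max(unmatched, key=lambda gt: temporal_iou(pred, gt), default=None)
        if best is not None and temporal_iou(pred, best) >= threshold:
            unmatched.remove(best)  # each ground-truth event is matched at most once
            true_positives += 1
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


# Toy intervals in seconds, invented for illustration.
score = f1_at_iou([(12, 21), (70, 80)], [(10, 20), (40, 50)])
```

In the toy case one of two predictions matches one of two ground-truth events, so precision and recall are both 0.5.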

16.
medRxiv (Medicine) 2026-06-15

An epidemiological scenario for Mass Events During the World Cup

This brief work discusses potential superspreading events that may occur during the World Cup in Mexico. The study is particularly focused on the city of Guadalajara due to a large recent outbreak in January and February and insufficient vaccine coverage prior to 2026. Keywords: Superspreading; measles outbreak; branching process; individual reproduction number; World Cup
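The keywords point to a branching process with an individual reproduction number. A standard way to model superspreading in that framework draws each case's reproduction number from a gamma distribution and the resulting number of secondary cases from a Poisson distribution. The sketch below follows that general recipe; the parameter values (`r0=0.8`, `k=0.2`) and function names are purely hypothetical and are not taken from the study.

```python
import math
import random


def poisson(rng, lam):
    """Sample a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def outbreak_size(rng, r0, k, max_cases=10_000):
    """Total cases from one index case in a gamma-Poisson branching process.

    r0: mean number of secondary cases per case (hypothetical value).
    k: dispersion parameter; smaller k means more superspreading.
    """
    active = 1
    total = 1
    while active and total < max_cases:
        new_cases = 0
        for _ in range(active):
            nu = rng.gammavariate(k, r0 / k)  # individual reproduction number
            new_cases += poisson(rng, nu)
        total += new_cases
        active = new_cases
    return total


rng = random.Random(0)
sizes = [outbreak_size(rng, r0=0.8, k=0.2) for _ in range(200)]
share_fizzled = sum(1 for s in sizes if s == 1) / len(sizes)
```

With a small dispersion parameter most introductions produce no onward transmission while a few produce large clusters, which is the pattern of concern for mass gatherings.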

17.
arXiv (CS.LG) 2026-06-17

Damage Adaptation in Seconds for Architected Materials

arXiv:2606.17394v1 Announce Type: cross Abstract: Adaptation to damage and in-situ physical repairs is essential for long-term robot autonomy, yet challenging outside of narrowly defined and well-anticipated bounds. In this work we proprioceptively adapt to catastrophic damage in soft-actuated systems in under one minute. Architected materials are well equipped for adaptation: actuator failure occurs gradually rather than acutely, and damage can be described in a low-dimensional, discrete coordinate space. Surprisingly, latent damage representations plus a simple yet robust ensemble method are sufficient for adapting to unseen damage in real time. Moreover, we identify conditions under which exponential sample complexity collapses to linear sample complexity for learned representations of architected materials, a concrete advantage over rigid components or continuum soft mechanisms. We demonstrate LEAP, our method for adaptive proprioception, via a tracing task for a 6DoF soft wrist based on Handed Shearing Auxetic (HSA) actuators. Our algorithm is able to adapt to cuts, burns, and actuator repairs, enabling simulation-free real-time adaptation that is critical for realizing the promise of soft robots outside the lab. Videos and more information are available at https://murpheylab.github.io/leap.

18.
arXiv (math.PR) 2026-06-17

Critical spectral behavior and large deviations for geometric $\alpha$-stable processes

arXiv:2606.17501v1 Announce Type: new Abstract: In this paper, we study the Schrödinger-type operator associated with geometric stable processes on $\mathbb{R}^{d}$, especially the differentiability of spectral function. Let $\mathcal{H}$ be the generator of the geometric stable process and $\mu$ a smooth measure on $\mathbb{R}^{d}$. Then the spectral function $C(\theta)$ is defined as $C(\theta) = -\inf \sigma(-\mathcal{H} - \theta \mu)$, where $\sigma(\mathcal{A})$ denotes the spectrum of $\mathcal{A}$ and $\theta$ is a real parameter. Since the geometric stable process exhibits severe local singularities in its Lévy measure, its transition semigroup lacks ultracontractivity, which invalidates classical methods for proving the differentiability. To overcome this obstacle, we use the compact embedding of the extended Dirichlet space into $L^2(\mu)$. As a primary application of this differentiability, we establish a large deviation principle for a positive continuous additive functional associated with the smooth measure $\mu$.

19.
arXiv (CS.LG) 2026-06-12

QoS Improvement in Multi User Cellular-Symbiotic Radio Network Assisted by Active-STAR-RIS

arXiv:2401.08301v2 Announce Type: replace-cross Abstract: In this article, we employ active simultaneously transmitting and reflecting reconfigurable intelligent surfaces (ASRIS) to enhance the quality of 6G cellular network services. The network integrates commensal symbiotic radio (CSR) subsystems to facilitate communication between passive Internet of Things (IoT) users and active users, referred to as symbiotic backscatter devices (SBDs) and symbiotic user equipments (SUEs), respectively. Since the SBDs are passive, transmitting information to the SUEs poses significant challenges. To overcome this challenge, we harness the capabilities of massive multiple input multiple output (MIMO) antennas within the base station (BS) to relay the information transmitted by SBDs with greater power. This scheme uses the non-orthogonal multiple access (NOMA) technique for multiple access among all users, and potential interferences are eliminated using successive interference cancellation (SIC). The primary objective is to maximize the throughput between SBDs and SUEs. To achieve this, we formulate an optimization problem involving variables such as active beamforming coefficients at the BS and ASRIS, phase adjustments of ASRIS, and scheduling parameters between CSR and cellular networks. To solve this optimization problem, we used three deep reinforcement learning (DRL) methods: proximal policy optimization (PPO), twin delayed deep deterministic policy gradient (TD3), and asynchronous advantage actor critic (A3C). These methods were simulated, and the results demonstrate that A3C, TD3, and PPO have the best convergence speeds and achieve the highest increases in network throughput, respectively. Finally, the proposed scheme was evaluated using passive simultaneously transmitting and reflecting RIS (STAR-RIS), which demonstrated poorer performance compared to ASRIS.

20.
arXiv (CS.AI) 2026-06-17

S4oP: Operator-level Pruning of Structured State Space Models for Resource-Constrained Devices

arXiv:2606.18096v1 Announce Type: cross Abstract: Structured State Space Models (SSMs), including the S4 and S4D architectures, have recently emerged as powerful alternatives to attention-based models for capturing long-range dependencies in sequential data. Despite their strong empirical performance, deploying these models in time- and resource-constrained settings remains challenging due to their computational and memory demands. In this paper, we propose a novel incremental, operator-level pruning approach for S4- and S4D-based models that significantly reduces inference cost while preserving predictive performance. To the best of our knowledge, this is the first work to systematically investigate structured operator pruning for SSMs. Our method progressively prunes model operators by interleaving structured masking with fine-tuning, while jointly monitoring accuracy and inference latency. We implement this approach within a unified training and evaluation framework that enables systematic exploration of efficiency-accuracy trade-offs. Experiments across multiple benchmark datasets show that pruning up to 70% of the model operators preserves the performance of the original models in most cases, while substantially reducing inference latency. These results demonstrate that structured operator pruning is an effective and previously unexplored strategy for improving the efficiency of SSMs and facilitating their deployment in practical, resource-constrained scenarios.

21.
arXiv (CS.LG) 2026-06-12

Physics-Informed Neural Networks and Radial Basis Functions for PDEs with Dirac Delta Sources

arXiv:2606.12735v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) are a machine learning method for solving forward and inverse Partial Differential Equations (PDEs). When applied to PDEs with Dirac delta functions in the forcing terms, boundary conditions, or initial conditions, PINNs require approximating them with smooth surrogate functions, a practice that can introduce significant modeling errors. In this work, we exploit the interpretation of PINNs as Residual Least Squares (RLS) methods and show that this perspective enables direct treatment of Dirac delta terms by integrating the weak-form equation. Among RLS formulations other than PINN, we focus on the Radial Basis Function (RBF) expansion (also known as a single-layer RBF Network). We show that while integrating out the Dirac delta in PINNs causes residuals to fail to converge to zero, RBF-RLS consistently provides good forward and inverse solutions to transport problems. We explain this finding using the Neural Tangent Kernel (NTK) theory. We test both approaches on linear PDEs that represent groundwater flow and transport in porous media and rivers. We solve inverse problems to fit synthetic data, noisy synthetic data, and real-world measurements.

22.
arXiv (CS.CV) 2026-06-24

Hybrid Event Frame Sensors: Modeling, Calibration, and Simulation

Hybrid event-frame sensors integrate an Event Vision Sensor (EVS) and an Active Pixel Sensor (APS) within a single chip, combining the high dynamic range and low latency of the EVS with the rich spatial intensity information from the APS. While this tight integration offers compact and temporally precise imaging, the complex circuit architecture introduces nontrivial noise patterns that remain poorly understood and unmodeled. In this work, we present the first unified statistics-based imaging noise model that jointly describes the noise behavior of APS and EVS pixels. Our formulation explicitly incorporates photon shot noise, dark current noise, fixed-pattern noise, and quantization noise, and links EVS noise to illumination level and dark current. Based on this formulation, we further develop a calibration pipeline to estimate noise parameters from real data and provide a detailed analysis of both APS and EVS noise behaviors. Finally, we propose H-ESIM, a statistically grounded simulator that generates RAW frames and events under realistic jointly calibrated noise statistics. Experiments on two hybrid sensors validate our model across multiple imaging tasks, including video frame interpolation and deblurring, demonstrating strong transfer from simulation to real data.

23.
arXiv (CS.LG) 2026-06-24

Subtyping patients with chronic disease using longitudinal BMI patterns

arXiv:2111.05385v3 Announce Type: replace Abstract: Obesity is a major health problem, increasing the risk of various major chronic diseases, such as diabetes, cancer, and stroke. While the role of obesity identified by cross-sectional BMI recordings has been heavily studied, the role of BMI trajectories is much less explored. In this study, we use a machine-learning approach to subtype individuals' risk of developing 18 major chronic diseases by using their BMI trajectories extracted from a large and geographically diverse EHR dataset capturing the health status of around two million individuals for a period of six years. We define nine new interpretable and evidence-based variables based on the BMI trajectories to cluster the patients into subgroups using the k-means clustering method. We thoroughly review each cluster's characteristics in terms of demographic, socioeconomic, and physiological measurement variables to specify the distinct properties of the patients in the clusters. In our experiments, the direct relationship of obesity with diabetes, hypertension, Alzheimer's, and dementia has been re-established and distinct clusters with specific characteristics for several of the chronic diseases have been found to be conforming or complementary to the existing body of knowledge.
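
The clustering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the abstract does not list its nine trajectory variables, so the three features here (mean, least-squares slope, range) are hypothetical stand-ins, the BMI series are synthetic, and k-means is written out as plain Lloyd iterations in standard-library Python.

```python
import random

def trajectory_features(bmi):
    """Three hypothetical summary variables for one BMI series
    (visits assumed equally spaced): mean, least-squares slope, range."""
    n = len(bmi)
    mean = sum(bmi) / n
    t_mean = (n - 1) / 2.0
    num = sum((t - t_mean) * (b - mean) for t, b in enumerate(bmi))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return (mean, num / den, max(bmi) - min(bmi))

def nearest(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, ctr)) for ctr in centers]
    return dists.index(min(dists))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations; initial centers are k distinct sampled points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(pt, centers) for pt in points]
        new_centers = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Synthetic illustration only: four roughly stable and four steadily rising series.
stable = [[22.0 + 0.3 * i + 0.1 * ((i + j) % 3) for j in range(6)] for i in range(4)]
rising = [[28.0 + 0.2 * i + 1.5 * j for j in range(6)] for i in range(4)]
features = [trajectory_features(series) for series in stable + rising]
labels, centers = kmeans(features, k=2)
```

In practice the features would usually be standardized before clustering, since k-means is sensitive to differences in scale between variables.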

24.
arXiv (CS.CV) 2026-06-15

Compressing Image Style Training into a Single Model Forward

Diffusion-based style transfer must balance inference efficiency with stylization fidelity. Adapter-based methods are efficient, but they inject style as an external condition and can either weaken reference-specific appearance or copy reference semantics into the generated image. Optimization-based personalization methods such as LoRA internalize style more effectively, but require a separate training process for every new style. We introduce i2L (image-to-LoRA), a framework that amortizes style LoRA training into a single forward pass. Given one or more reference images, i2L predicts LoRA weights for a text-to-image model, enabling immediate style instantiation without per-style optimization. The architecture combines an image encoder, learnable LoRA queries, and compressed decoding heads that generate adapted matrices. Training on semantically diverse style pairs encourages the predictor to preserve appearance cues while suppressing reference-content copying. Experiments on Z-Image, FLUX.2, and Hidream-O1 show that i2L improves style fidelity, prompt alignment, and perceptual quality over existing baselines. Because i2L produces explicit LoRA weights, it also supports asymmetric classifier-free guidance, multi-reference style fusion, and composition with controllable-generation modules.

25.
arXiv (CS.AI) 2026-06-24

Breaking Shortcut Learning for Cross-Trial EEG-Guided Target Speech Extraction via Two-Stage Training

arXiv:2606.24164v1 Announce Type: cross Abstract: Recent end-to-end models for EEG-guided target speech extraction report impressive results, underscoring potential for neuro-steered hearing technologies. However, our analysis reveals that high within-trial performance can be driven by trial-specific EEG structure that acts as shortcuts for target selection, leading to poor generalization on unseen trials. To overcome this gap, we propose TRUST-TSE, a two-stage framework to mitigate shortcut learning. By introducing contrastive pretraining with attended-speaker negative sampling, we encourage the EEG encoder to capture fine-grained EEG–speech alignment while suppressing trial-identity cues. We also employ a confidence-weighted extraction objective based on EEG–source similarity to guide extraction using the learned representations. Experiments on KUL and DTU datasets show that TRUST-TSE outperforms end-to-end baselines under strict cross-trial protocols, addressing a key reliability bottleneck of existing approaches.
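
The contrastive pretraining idea mentioned above can be illustrated with a standard InfoNCE-style loss. This is a hedged sketch, not the TRUST-TSE objective: the abstract does not specify the loss or how negatives are sampled, so the cosine similarity, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(eeg, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax of the positive pair's similarity
    among the positive and all negative candidates."""
    logits = [cosine(eeg, positive) / temperature]
    logits += [cosine(eeg, neg) / temperature for neg in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]

# Toy embeddings (hypothetical): the EEG vector aligns with the attended speech.
eeg_embedding = [1.0, 0.0]
attended_speech = [1.0, 0.0]
other_speech = [0.0, 1.0]

aligned_loss = info_nce(eeg_embedding, attended_speech, [other_speech])
swapped_loss = info_nce(eeg_embedding, other_speech, [attended_speech])
```

The loss is small when the EEG embedding is closest to the attended speech and large when a negative candidate is closer, which is the pressure a contrastive stage places on the encoder.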