Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-16

Scaling Limits of Bivariate Nearly-Unstable Hawkes Processes and Applications to Rough Volatility

arXiv:2605.03703v3 Announce Type: replace Abstract: We study a pair of nearly-unstable Hawkes processes coupled through a one-directional, or triangular, cross-excitation: the first component evolves autonomously and excites the second, but not conversely. Each component is self-exciting through a heavy-tailed memory kernel, and the two kernels are allowed to have different tail indices, so that the limiting components exhibit genuinely different degrees of roughness. As the system approaches criticality, we prove that the suitably rescaled intensity vector converges weakly to the unique solution of a coupled system of stochastic Volterra equations of rough-volatility type. The first limiting component is autonomous, while the second is driven both by its own noise and by an inherited noise transmitted from the first component through an effective cross-kernel. This cross-kernel is the convolution of the two limiting Mittag-Leffler kernels and therefore combines the two memory structures. As a consequence, we obtain a short-time cross-decorrelation law: although the two components are coupled, their functional correlation vanishes at small time scales at an explicit polynomial rate. This time-dependent correlation distinguishes the limit from independent rough processes and from classical bivariate rough models with constant Brownian correlation.

02.
arXiv (CS.LG) 2026-06-12

GEMSS: A Variational Bayesian Method for Discovering Multiple Sparse Solutions in Classification and Regression Problems

arXiv:2602.08913v2 Announce Type: replace Abstract: High-dimensional, underdetermined and highly correlated systems are common in data science practice, especially when analyzing physical measurements. In such settings, feature selection poses a fundamental challenge because multiple distinct sparse subsets may explain the response equally well. Their identification is crucial not only for predictive modeling but also for generating domain-specific insights into the underlying mechanisms. Yet, conventional methods typically isolate a single solution, obscuring the full spectrum of plausible explanations. This work introduces GEMSS (Gaussian Ensemble for Multiple Sparse Solutions), a variational algorithm designed to simultaneously discover multiple, diverse sparse feature combinations. The method employs a structured spike-and-slab prior for sparsity, a mixture of Gaussians to approximate the intractable multimodal posterior, and a Jaccard-based penalty to further control solution diversity. A single objective function is optimized via stochastic gradient descent. The method is tested on 128 comprehensive experiments by a novel benchmarking framework designed to generate artificial problems with multiple sparse solutions of equal predictive properties. This allows us to measure the retrieval of ground truth features rather than only evaluating predictive performance – characteristics more fitting to our practical needs. A comparative analysis shows that GEMSS consistently outperforms five prominent feature selection methods adapted through the ALFESE framework. Finally, we demonstrate practical usability through 3 challenging real-world datasets from metabolomics and physical chemistry: GEMSS successfully isolates multiple distinct yet quality solutions. GEMSS is available as a PyPI package 'gemss'. The corresponding repository github.com/kat-er-ina/gemss/ includes the full codebase and a free, no-code application GEMSS Explorer.

03.
arXiv (CS.AI) 2026-06-17

Agentic Discovery of Non-Canonical Antimicrobial Peptides with AMPGAN v3

arXiv:2606.17127v1 Announce Type: cross Abstract: Antimicrobial resistance causes to over a million deaths annually. Antimicrobial peptides (AMPs) are a promising solution, but generative AMP models are not yet ready to design peptides with non-natural amino acids and/or chemical modifications, which are essential for real-world peptide drugs. We present AMPGAN v3, a multi-objective conditional GAN that expands the generative vocabulary to D-amino acids and N/C-terminus modifications such as amidation. By separating adversarial and activity-aware supervision across two specialized discriminators, AMPGAN v3 substantially improves training stability and outperforms prior generative AMP models on external classifiers. We validated five candidates spanning three structural classes in vitro; two showed activity against Gram-positive strains, with the best candidate reaching MIC 8 {\mu}g/mL against B. subtilis. To support downstream curation, we further present PepCraft, a multi-agent framework for end-to-end AMP discovery in which a Planning Agent orchestrates specialized executors for generation, filtering, and verification. Its prioritization recommendations align with our in vitro outcomes. Together, these contributions let us examine, on a small but real scale, how generative and agentic AI compose in therapeutic peptide discovery. Code: https://github.com/marszzibros/AMPGANv3

04.
arXiv (quant-ph) 2026-06-17

Probing PbTe-Pb nanowire devices with radio-frequency reflectometry

arXiv:2606.04544v2 Announce Type: replace-cross Abstract: We report the implementation of radio-frequency (rf) reflectometry on selective-area-grown PbTe-Pb nanowire devices on a CdTe substrate. These nanowires are predicted to host Majorana zero modes. We demonstrate the compatibility of the rf technique, including both resistive and capacitive sensing, with these nanowires. The effect of dielectric loss from the CdTe substrate is quantitatively characterized. Furthermore, the feasibility of rf reflectometry is verified under finite magnetic fields where zero-energy modes can emerge. Our results establish the fast control of PbTe quantum devices, paving the way for their applications in topological quantum computation.

05.
arXiv (CS.LG) 2026-06-11

Machine-learning-based multipoint optimization of fluidic injection parameters for improving nozzle performance

arXiv:2409.12707v2 Announce Type: replace-cross Abstract: Fluidic injection offers a promising solution to improve the performance of the overexpanded single expansion ramp nozzles (SERNs) during vehicle acceleration. However, determining the injection parameters that yield the best overall performance across multiple nozzle operating conditions remains a challenge. The gradient-based optimization method requires gradients of injection parameters at each design point, which can lead to high computational costs when using computational fluid dynamics (CFD) simulations. This paper uses a pretrained neural network to replace CFD during optimization, enabling quick calculation of the nozzle flow field at multiple design points. Considering the physical characteristics of the nozzle flow field, a prior-based prediction strategy is adopted to enhance the model's accuracy. In addition, the neural network's back-propagation algorithm computes gradients quickly by running the computation only once, thereby greatly reducing gradient computation time compared to the finite difference method. As a test case, the average nozzle thrust coefficient of an SERN at seven design points is optimized, resulting in a 1.14\% improvement. The time cost is greatly reduced compared with traditional optimization methods, even when the time required to establish the training database is included.

06.
arXiv (CS.CL) 2026-06-17

Translating the Untranslatable: An Operationalizable Ontology for Untranslatability

Untranslatability, cases where meaning cannot be directly preserved across languages, is well-studied in linguistics but underexplored in NLP. As machine translation (MT) systems improve on standard benchmarks, their limitations increasingly concentrate in such cases, where translation cannot be reduced to one-to-one equivalence. We introduce a structured ontology of untranslatability along with a taxonomy of compensation strategies, which are specific techniques to convey meaning under these untranslatable circumstances. We operationalize this framework into a multilingual dataset of untranslatable sentences paired with strategy-based translations, enabling controlled analysis of translation behavior. Initial human preference studies suggest that translation quality depends on the strategy used, with consistent preferences for outputs that include explanatory context, known as the Annotation compensation strategy. Our framework and dataset provide a foundation for studying and modeling strategy-informed machine translation.

07.
arXiv (quant-ph) 2026-06-19

Recursive perturbation approach to time-convolutionless master equations: Explicit construction of generalized Lindblad generators for arbitrary open systems

arXiv:2506.04095v2 Announce Type: replace Abstract: We develop a recursive perturbative expansion for the time-convolutionless (TCL) generator of an open quantum system in a generalized Lindblad form. This formulation provides a systematic approach to derive the generator at arbitrary order while preserving a Lindblad-like structure, without imposing assumptions on the system or environment beyond an initially uncorrelated state. The generator is written, at all orders, in a canonical form, which also corresponds to the minimal dissipation condition, which uniquely specifies the decomposition of the generator into Hamiltonian and dissipative contributions. To validate the method and show its effectiveness in addressing non-Markovian dynamics and strong-coupling effects, we compute the generator explicitly up to fourth order.

08.
arXiv (CS.AI) 2026-06-19

Hybrid ANN-SNN Pipeline with Local Plasticity

arXiv:2606.20151v1 Announce Type: cross Abstract: This work proposes a hybrid ANN-SNN pipeline that effectively leverages the rich embeddings of pretrained artificial neural networks (ANNs) to enable high-performance spiking neural networks (SNNs). The architecture couples a pretrained EfficientNet encoder with a CoLaNET spiking classifier. We convert the encoder's activations into spike trains via rate-coding and train the subsequent SNN classifier using local, biologically inspired learning rules, bypassing end-to-end gradient propagation. This approach achieves 99.09% accuracy on a 64-class ImageNet benchmark, demonstrating performance on par with conventional deep networks. The work presents a biologically plausible and efficient framework for adapting powerful pretrained encoders to downstream spiking neural network tasks.

09.
arXiv (CS.CL) 2026-06-12

X-MADAM-RAG: Diagnosing and Handling Chinese-English Evidence Conflict in Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) systems may receive evidence that is not merely noisy but mutually contradictory. This issue becomes particularly salient in multilingual settings, where retrieved Chinese and English evidence may support incompatible answer candidates. We study this problem through X-RAMDocs-ZHEN, a controlled Chinese-English benchmark derived from RAMDocs for diagnosing evidence conflict in RAG. The benchmark contains 300 examples across six balanced conditions, including monolingual support, bilingual agreement, reversed conflict directions, and conflict with optional noise. We further examine X-MADAM-RAG, an interpretable pipeline that decomposes evidence handling into per-document candidate extraction, visible-evidence repair, deterministic candidate grouping, and conflict-aware aggregation. On the original controlled benchmark with Qwen2.5-7B-Instruct, X-MADAM-RAG achieves 0.9667 strict accuracy and 0.9767 conflict-aware success, outperforming an evidence-normalized single-call baseline. However, a zero-call rule-only extractor reaches 1.0000 on the same benchmark, revealing strong template regularity. To probe this limitation, we construct a deterministic naturalized stress test that removes explicit answer templates while preserving candidate strings. On its 100-sample subset, rule-only extraction falls to 0.0000, but X-MADAM-RAG also drops to 0.3000 strict accuracy, below both naive and evidence-normalized baselines. A privileged oracle remains perfect, indicating that document-level extraction is the main bottleneck. These findings position X-RAMDocs-ZHEN and X-MADAM-RAG as diagnostic tools for controlled evidence conflict rather than as evidence of general hallucination detection or robustness to natural retrieval.

10.
arXiv (CS.CL) 2026-06-16

Modeling Sarcastic Speech: Semantic and Prosodic Cues in a Speech Synthesis Framework

Sarcasm is a pragmatic phenomenon in which speakers convey meanings that diverge from literal content, relying on an interaction between semantics and prosodic expression. However, how these cues jointly contribute to the recognition of sarcasm remains poorly understood. We propose a computational framework that models sarcasm as the integration of semantic interpretation and prosodic realization. Semantic cues are derived from an LLaMA 3 model fine-tuned to capture discourse-level markers of sarcastic intent, while prosodic cues are extracted through semantically aligned utterances drawn from a database of sarcastic speech, providing prosodic exemplars of sarcastic delivery. Using a speech synthesis testbed, perceptual evaluations show that semantic and prosodic cues enhance perceived sarcasm, with the combined system achieving the best downstream F1 while maintaining high subjective sarcasm ratings. These findings highlight the complementary roles of semantics and prosody in pragmatic interpretation and illustrate how modeling can shed light on the mechanisms underlying sarcastic communication.

11.
arXiv (CS.LG) 2026-06-17

ConTex: Reformulating Counterfactual Generation For Time Series Forecasting

arXiv:2606.18049v1 Announce Type: new Abstract: Decision-making with deep learning-based time series forecasting requires not only accurate predictions but also actionable insights. However, current architectures do not inherently provide such information. Specifically, guidance is needed on how current conditions must be modified to shift from a predicted outcome to a desired future scenario. Counterfactual explanations provide a natural framework for this task, as they represent minimal input changes that alter the model's prediction, indicating when and how intervention is required. Existing approaches rely on instance-wise optimization, leading to inconsistency across instances, high computational costs, and limited applicability in real-time settings. To address these limitations, we reformulate counterfactual generation for time series forecasting as the problem of learning a globally consistent intervention strategy, allowing counterfactuals to be generated through a single shared function. We propose Counterfactual Time Series Explanations (ConTex), a model-agnostic, decomposed architecture comprising a temporal context encoder and a conditional encoder, followed by two heads that capture interventions in terms of temporal relevance and modification strength. This structure overcomes the instability and inconsistency of instance-based approaches by producing targeted, interpretable interventions across time and feature dimensions in a single forward pass, making it suitable for real-time applications. Across multiple forecasting architectures and benchmark datasets, ConTex achieves state-of-the-art validity while generating sparse counterfactuals that minimize the number of necessary interventions. Additionally, our approach reduces computational cost by at least 12-36x compared to instance-wise generation and supports real-time inference at approximately 0.007 seconds.

12.
arXiv (math.PR) 2026-06-19

Hermite trace polynomials and chaos decompositions for the Hermitian Brownian motion

arXiv:2207.13180v4 Announce Type: replace Abstract: For a non-zero parameter $q$, we define Hermite trace polynomials, which are multivariate polynomials indexed by permutations. We prove several combinatorial properties for them, such as expansions and product formulas. The linear functional determined by these trace polynomials is a state for $q = \frac{1}{N}$ for $N$ a non-zero integer. For such $q$, Hermite trace polynomials of different degrees are orthogonal. The product formulas extend to the closure with respect to the state. The state can be identified with the expectation induced by the $N \times N$ Hermitian Brownian motion. Hermite trace polynomials are martingales for this Brownian motion, while the elements in the closure can be interpreted as stochastic integrals with respect to it. Using the grading on the algebra, we prove several chaos decompositions for such integrals, as well as analyze corresponding creation and annihilation operators. In the univariate, pure trace polynomial case, trace Hermite polynomials can be identified with the Hermite polynomials of matrix argument.

13.
arXiv (CS.CL) 2026-06-17

Compositional Skill Routing for LLM Agents: Decompose, Retrieve, and Compose

Authors:

LLM agents increasingly rely on external skills – reusable tool specifications – but real-world tasks often require composing multiple skills, not just selecting one. We formalize this as the Compositional Skill Routing problem: given a complex user query and a large skill library, decompose the query into atomic sub-tasks, retrieve the appropriate skill for each sub-task, and compose an executable plan. We present SkillWeaver, a decompose-retrieve-compose framework combining an LLM task decomposer, a bi-encoder skill retriever with FAISS indexing, and a dependency-aware DAG planner. To support evaluation, we introduce CompSkillBench, a benchmark of 300 compositional queries over 2,209 real MCP server skills spanning 24 functional categories, sourced from the public MCP ecosystem. Our experiments reveal that task decomposition quality is the primary bottleneck: standard LLM decomposition reaches only 34.2% category recall at the step level. To address this, we propose Iterative Skill-Aware Decomposition (SAD), a retrieval-augmented feedback loop that iteratively aligns decomposition with available skills. SAD improves decomposition accuracy from 51.0% to 67.7% (+32.7%, Wilcoxon p < 10^-6) in a single iteration; DA-conditioned analysis confirms that correct granularity is the prerequisite for effective retrieval (CatR@1 rises from 34% to 41% when DA=1). SkillWeaver reduces context window consumption by over 99%, and transfer experiments confirm generalization (+35.6% relative DA gain even when target categories are absent from the retrieval pool).

14.
arXiv (CS.LG) 2026-06-16

Drivers, Receivers, and Dynamic Linkages: The Directed Structure of SDG Interdependence, 2000–2024

arXiv:2601.20875v2 Announce Type: replace-cross Abstract: Governments with limited fiscal and administrative capacity need to know which Sustainable Development Goals (SDGs) propagate progress through the goal system and how quickly. We map the directed interdependence structure of all seventeen goals using a balanced panel of 114 countries observed annually from 2000 to 2024. The goal series are persistent, trending, and cross-sectionally dependent, so we apply two estimators matched to this regime: a Dumitrescu-Hurlin panel Granger non-causality test, run on first-differenced series, to recover the directed interaction network, and panel local projections with Driscoll-Kraay standard errors to measure the dynamic magnitude of 31 theory-derived indicator linkages. Of 272 directed goal pairs, 84 linkages survive false-discovery control (40 synergies, 44 trade-offs; network density 0.31). Synergies and trade-offs occur at comparable strength, so no single goal behaves as a universal accelerator, and the goal-level hierarchy itself is fragile. Driver-receiver rankings correlate weakly across lag orders and centrality metrics, and under a country bootstrap only two roles are distinguishable from zero: peace and strong institutions as the clearest net receiver, and poverty reduction as the most probable effect-size-weighted driver. The supported linkages are dynamic, accruing over four to five years: sanitation and poverty improvements are the strongest predictors of lower child mortality, and the education-child-health association is corroborated in independent World Development Indicators data across 183 countries. These results caution against rankings-based accelerator policy and support adaptive portfolios built on supported, time-lagged linkages monitored through constituent indicators.

15.
arXiv (quant-ph) 2026-06-12

Accidental Symmetry in the Tavis-Cummings Model via the Schwinger Boson Representation

arXiv:2606.12813v1 Announce Type: new Abstract: The Jaynes-Cummings (JC) Hamiltonian is a paradigmatic model of light-matter interaction and, more generally, qubit-boson interactions, widely used across atomic, optical, and superconducting qubit platforms. In the multi-qubit setting, where n qubits are identically coupled to a single boson mode, this interaction is known as the Tavis-Cummings (TC) Hamiltonian. The structure of the TC model is usually understood in terms of two standard symmetries: permutation invariance of the qubits and a U(1) symmetry associated with conservation of the total excitation number. Here we identify an additional, independent "accidental" symmetry of the TC Hamiltonian and construct the corresponding conserved observable. We show that, for n>2 qubits, this symmetry imposes strong constraints on the realizable unitary transformations. These constraints persist in the presence of the global $J_z$ Hamiltonian, but are removed by adding $J_z^2$, even though $J_z^2$ preserves both permutation invariance and the U(1) symmetry. Finally, we explain the origin of this previously unnoticed symmetry using Schwinger's boson representation of angular momentum. These restrictions have important implications for controllability of the TC system and for its applications to quantum computing, which are investigated further in a companion paper.

16.
arXiv (CS.CV) 2026-06-24

MapReason-OSM: Can Vision-Language Models Make Graph-Verifiable Mobility Decisions from Street Maps ?

Vision-language models (VLMs) are increasingly used to read maps for logistics, delivery, and accessible navigation, where the output is an actionable decision (a route, a pin, a parking choice) that must respect the road network. Yet most map benchmarks grade free text or multiple-choice answers that cannot be verified against the underlying graph. We present MapReason-OSM, a benchmark and evaluation harness for graph-verifiable mobility decisions on self-rendered OpenStreetMap panels. We render fixed-style maps for ten U.S. downtowns at two aligned zoom scales, overlay a consistent marker grammar, and pair each panel with a hidden street graph and exact oracles, yielding 6,000 instances (12,000 panels across the two zooms) over 12 routing, facility-location, and visual disambiguation tasks. Models return structured decisions that we snap back to the graph and score for validity, legality, optimality, and constraint satisfaction, plus cross-zoom consistency. Across seven VLMs, models read maps and route simply but fail at graph cost reasoning (single-facility pin placement is near chance even for frontier reasoning models), and are frequently scale-inconsistent. We release the benchmark, harness, and deterministic generator. Code and data: https://github.com/Vi-Sri/mapreason-osm

17.
medRxiv (Medicine) 2026-06-17

Identifying anaphylaxis using weakly-supervised prediction models and natural language processing

Objectives Scalable computable phenotyping algorithms are critical for conducting high-throughput disease-outcome research in large, distributed-data electronic health record (EHR) and claims data settings. We developed and evaluated a claims- and EHR-based computable phenotyping algorithm for anaphylaxis, a rare acute condition that is challenging to accurately identify using claims data alone. Materials and Methods Potential anaphylaxis events came from two healthcare systems (Kaiser Permanente Washington [KPWA] and Vanderbilt University Medical Center [VUMC]). We engineered features from clinical text using automated natural language processing (NLP) methods. We then developed a phenotyping algorithm using four NLP- and diagnosis code-based silver labels (proxies for the gold-standard labels). Gold-standard abstracted outcomes were used to evaluate algorithm performance. Results The largest area under the receiver operating characteristic curve (AUC) was 0.931 for an NLP-based silver-label model at KPWA. Depending on the model and healthcare system site, positive predictive value (PPV) and sensitivity at the threshold of predicted probability that maximized F1 score ranged from 0.52 to 0.77 (PPV) and 0.78 to 1 (sensitivity). Discussion NLP-based silver-label models had large AUC at KPWA but not at VUMC. This may be because clinical text at KPWA is only available for outpatient encounters and secure messaging. High sensitivity for identifying anaphylaxis can be obtained using our best-performing models. Conclusion The best-performing models had better PPV and sensitivity tradeoffs than prior bespoke anaphylaxis models with costly, manually curated features. The simplicity of the approach compared to traditional phenotyping methods allows it to be deployed easily at multiple health care systems.

18.
arXiv (CS.CL) 2026-06-24

Aspect-Based Sentiment Evolution and its Correlation with Review Rounds in Multi-Round Peer Reviews: A Deep Learning Approach

Mining sentiment information from the textual content of peer review comments offers valuable insights into the scientific evaluation process. However, previous studies are often constrained by coarse-grained analysis and the lack of differentiation across review rounds. Notably, the dynamic shifts in reviewers' focus and sentiment tendencies throughout multiple review stages remain underexplored. To address this gap, the present study investigates the distribution and evolution of aspect-level sentiments and examines their correlation with the number of review rounds. We begin by segmenting the multi-round review comments of 11,063 accepted papers from Nature Communications and identifying fine-grained review aspect clusters. A manually annotated corpus of approximately 5,000 review sentences is then constructed. Using this dataset, we train a series of deep learning-based aspect sentiment classification models. Among them, the LCF-BERT-CDM model achieves the best performance, with a Macro-F1 score of 82.65%. Subsequent statistical analysis reveals a consistent trend: as the number of review rounds increases, the proportion of positive sentiments rises, while negative sentiments decline. Correlation analysis further indicates that aspect sentiment scores are negatively associated with the total number of review rounds. Key aspects exhibiting stronger correlations include "experiments", "research significance" and "result analysis".

19.
arXiv (CS.AI) 2026-06-16

Do we have the knowledge we need? Rethinking human-AI decision-making in corporations

arXiv:2606.15575v1 Announce Type: new Abstract: Organizational knowledge is fragmented across a variety of software systems, tacit expertise, and manual documents that have traditionally been designed for human consumption. As AI systems are increasingly deployed and granted decision-making roles, they require access to this knowledge. This raises two questions: how should organizations store and maintain knowledge so that it remains accessible to both humans and future AI systems, and how should agency be allocated between humans and AI across tasks with different risks and levels of uncertainty? In this position paper, we describe how organizational knowledge evolves and contribute a framework that maps task attributes and knowledge availability to recommended agency allocations and control mechanisms. We illustrate the applicability of the framework on two different manufacturing tasks: a routine operation (visual quality inspection) and a one-off strategic decision (factory location), and conclude with opportunities for future research.

20.
arXiv (quant-ph) 2026-06-24

Quantum Adaptive Self-Attention for Quantum Transformer Models

arXiv:2504.05336v4 Announce Type: replace Abstract: A recurring weakness in quantum machine learning (QML) is that reported ``quantum advantages'' are seldom tested against a capacity-matched classical control, leaving it unclear whether a gain comes from the quantum substrate or from the architectural change that accompanies it. Our primary contribution is methodological: a protocol for attributing such gains honestly – a capacity-matched classical bottleneck of identical parameter budget, transparent reporting of where quantum does not help, and validation on real quantum hardware – which we develop and apply through a concrete case study. That case study is Quantum Adaptive Self-Attention (QASA), a hybrid Transformer that replaces the value projection of a single encoder layer with a 36-parameter parameterized quantum circuit (PQC), keeping all other layers classical. Across nine synthetic benchmarks and the real-world ETTh1 dataset, QASA improves on a full-capacity classical Transformer for chaotic and trend-dominated signals. To ask whether this is a genuinely quantum effect, we introduce a control rarely applied in quantum machine learning – a capacity-matched classical bottleneck with the same parameter budget – and find that it matches the PQC on the error metrics. The gain is therefore attributable to the low-rank value-projection bottleneck (an architectural parsimony principle), not to quantumness; adding further quantum layers only degrades performance and trainability. We accordingly position the quantum layer not as a source of accuracy advantage but as a competitive instantiation of this principle: its low-rank compression onto the signal's intrinsic dimensionality is matched by a classical bottleneck, so the gain is architectural rather than quantum.

21.
arXiv (CS.AI) 2026-06-17

SSIL: Self-Supervised Imitation Learning for End-to-End Driving

arXiv:2308.14329v4 Announce Type: replace-cross Abstract: In autonomous driving, the end-to-end (E2E) driving approach that predicts vehicle control signals directly from sensor data is rapidly gaining attention. To learn a safe E2E driving system, one needs an extensive amount of driving data and human intervention. Vehicle control data is constructed by many hours of human driving, and it is challenging to construct large vehicle control datasets. Often, publicly available driving datasets are collected with limited driving scenes, and collecting vehicle control data is only available by vehicle manufacturers. To address these challenges, this paper proposes the first self-supervised learning framework, Self-Supervised Imitation Learning (SSIL), for E2E driving. The proposed SSIL framework can learn vision-based E2E driving networks without using driving command data or a pre-trained model. To construct pseudo steering angle data, proposed SSIL predicts a pseudo target from the vehicle's poses at the current and previous time points that are estimated with light detection and ranging sensors. In addition, we propose a new cross-attention-based conditioning approach (CACA) for a vision encoder in E2E driving, where a high-level instruction serves as the conditioning signal for visual information. Our numerical experiments with three different benchmark datasets demonstrate that the proposed SSIL framework achieves very comparable E2E driving accuracy with the supervised learning counterpart. Furthermore, the proposed pseudo-label predictor outperformed an existing one using proportional integral derivative controller, and proposed CACA achieved superior performance over existing conditioning approaches.

22.
medRxiv (Medicine) 2026-06-10

Exploratory Assessment of Pulsed-Wave Doppler Representations of Lung Sounds Using Deep Learning: An In-Vitro Phantom Study

The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.

23.
arXiv (CS.LG) 2026-06-17

Searching Neural Architectures for Sensor Nodes on IoT Gateways

arXiv:2505.23939v2 Announce Type: replace Abstract: This paper presents an automatic method for the design of Neural Networks (NNs) at the edge, enabling Machine Learning (ML) access even in privacy-sensitive Internet of Things (IoT) applications. The proposed method runs on IoT gateways and designs NNs for connected sensor nodes without sharing the collected data outside the local network, keeping the data in the site of collection. This approach has the potential to enable ML for Healthcare Internet of Things (HIoT) and Industrial Internet of Things (IIoT), designing hardware-friendly and custom NNs at the edge for personalized healthcare and advanced industrial services such as quality control, predictive maintenance, or fault diagnosis. By preventing data from being disclosed to cloud services, this method safeguards sensitive information, including industrial secrets and personal data. The outcomes of a thorough experimental session confirm that – on the Visual Wake Words dataset – the proposed approach can achieve state-of-the-art results by exploiting a search procedure that runs in less than 10 hours on the Raspberry Pi Zero 2.

24.
arXiv (CS.AI) 2026-06-16

PO-PDDL: Learning Symbolic POMDPs from Visual Demonstrations for Robot Planning Under Uncertainty

arXiv:2606.15654v1 Announce Type: cross Abstract: Real-world robot task planning must operate under both stochastic action execution and partial observability, yet constructing Partially Observable Markov Decision Process (POMDP) models for real robotics domains remains difficult and labor-intensive. We introduce PO-PDDL, a symbolic formulation of POMDPs that preserves the relational structure and LLM-friendly syntax of the Planning Domain Definition Language (PDDL), while explicitly modeling partial observability, stochasticity, and beliefs. Building on this formulation, we propose a demonstration-driven pipeline for learning PO-PDDL models. The proposed method reconstructs latent symbolic state trajectories from real-robot execution videos, identifies partial observability via inconsistencies between inferred states and visual observations, and learns stochastic transition and observation models accordingly. The resulting PO-PDDL domains are reusable across tasks and enable online belief-space planning under both perception and execution uncertainty. Experiments on real-world long-horizon manipulation tasks show that our method consistently outperforms existing PDDL and POMDP model-learning approaches, achieving robust task planning under uncertainty with significantly lower planning cost.

25.
arXiv (CS.LG) 2026-06-24

Parallel Manifold Steering: Efficient Adaptation of Large Associative Memories via Residual Energy Shaping

arXiv:2606.24396v1 Announce Type: new Abstract: Large Transformer models function as Dense Associative Memories (DAMs), retrieving knowledge via high-dimensional attractor dynamics driven by the self-attention mechanism \citep{ramsauer2020hopfield, wu2024attention}. However, adapting these frozen memory systems to new tasks presents a fundamental ``Plasticity-Stability'' dilemma. Current methods either risk catastrophic interference by modifying synaptic weights directly (e.g., LoRA) \citep{hu2021lora} or degrade associative capacity by clogging the retrieval buffer with static prompt tokens (e.g., VPT) \citep{jia2022vpt}. In this work, we propose H-Res (Hierarchical Residual Steering), a mechanism that modulates the effective energy landscape of the Transformer without altering its global equilibrium or expanding its sequence length. By formulating adaptation as a control problem on the activation manifold \citep{chen2018neuralode}, H-Res learns a state-dependent vector field that steers token trajectories into task-specific basins of attraction. We formally prove that H-Res preserves the attention entropy of the foundation model and facilitates Neural Collapse \citep{papyan2020prevalence}. Empirically, Manifold Steering outperforms global weight modification by 26\% on associative retrieval tasks and eliminates the computational overhead of prompt-based methods, scaling effectively to structured domains \citep{zha2023vtab}.