Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-16

Information Leakage Detection through Approximate Bayes-optimal Prediction

arXiv:2401.14283v4 Announce Type: replace-cross Abstract: In today's data-driven world, the proliferation of publicly available information raises security concerns due to the information leakage (IL) problem. IL involves unintentionally exposing sensitive information to unauthorized parties via observable system information. Conventional statistical approaches rely on estimating mutual information (MI) between observable and secret information for detecting ILs, face challenges of the curse of dimensionality, convergence, computational complexity, and MI misestimation. Though effective, emerging supervised machine learning based approaches to detect ILs are limited to binary system sensitive information and lack a comprehensive framework. To address these limitations, we establish a theoretical framework using statistical learning theory and information theory to quantify and detect IL accurately. Using automated machine learning, we demonstrate that MI can be accurately estimated by approximating the typically unknown Bayes predictor's log-loss and accuracy. Based on this, we show how MI can effectively be estimated to detect ILs. Our method performs superior to state-of-the-art baselines in an empirical study considering synthetic and real-world OpenSSL TLS server datasets.

02.
arXiv (CS.CL) 2026-06-11

A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

How much meaning does a text carry? Shannon's theory measures uncertainty over symbols and is intentionally indifferent to meaning, while pairwise metrics such as BERTScore compare two texts rather than characterizing one. We develop a geometric framework that measures semantic content from the structure of a text's sentence embeddings. The framework has three parts. First, within a fixed embedding and baseline, six natural axioms uniquely determine a scalar measure up to scale, a frame-conditional uniqueness theorem. The resulting scalar is empirically too coarse, motivating a richer representation. Second, we propose a three-coordinate semantic profile capturing novelty (displacement from generic discourse), breadth (diversity of distinct ideas), and integration (connectedness among them), together with a discrete minimal unit (the semantic quantum) whose resolution is fixed by a clustering threshold $\tau$. Third, we prove a no-go theorem: no scalar summary of the profile can simultaneously satisfy analytic stability under paraphrase and concatenation, ordinal robustness across text scales, and cross-representation comparability. We exhibit two practical scalars, $S_{\mathrm{minmax}}$ and $S_{\mathrm{rank}}$, each occupying a distinct corner of this trade-off triangle. Validation across 23 synthetic categories, 5 Project Gutenberg novels, and 3 embedding models confirms the trade-off. The recommended rank-normalized configuration passes 25 of 28 ordinal checks as point estimates (21 of 28 after Benjamini-Hochberg correction), outperforming seven baselines including unigram entropy and a BERTScore-based novelty signal. A separate variational result connects the breadth coordinate to the log-determinant of a determinantal point process (Spearman $\rho = 0.985$ over 507 Gutenberg chapters), giving an optimization-theoretic foundation for breadth.

03.
arXiv (math.PR) 2026-06-16

Exponential Convengence of DLRA for SDEs

arXiv:2606.15843v1 Announce Type: new Abstract: We study dynamical orthogonal (DO) approximations of stochastic differential equations and investigate their long-time behaviour. The DO formulation represents the solution by a low-rank decomposition and leads to a coupled system consisting of an evolution equation on the Stiefel manifold and a reduced stochastic process. We establish the well-posedness of the strong DO system and derive quantitative error estimates between the original stochastic differential equation and its low-rank approximation in the Wasserstein distance. Our main contribution is the analysis of invariant probability measures for the DO dynamics. Under suitable dissipativity, Lipschitz continuity, and non-degeneracy assumptions on the coefficients, we prove the existence of an invariant probability measure for the strong DO system. The proof combines uniform moment estimates, a Krylov–Bogoliubov argument for an associated frozen system, and a Kakutani-Fan-Glicksberg fixed-point theorem to recover the self-consistent dynamics. We further show that the induced low-rank process admits an invariant probability measure and discuss the structure of invariant measures through several illustrative examples. These results provide a rigorous foundation for the use of dynamical low-rank approximations in the approximation of long-time statistical properties of stochastic dynamical systems.

04.
arXiv (quant-ph) 2026-06-12

Testing the problem of time with cold atoms

arXiv:2509.07745v3 Announce Type: replace-cross Abstract: We realize a cold-atom system to quantitatively test relational constructions of time. A well-isolated atomic Bose-Einstein condensate evolves in a conservative trap that is partitioned by a thin optical barrier into an observed and unobserved sector, with negligible dissipation on the experimental timescale. Motivated by relational-time approaches discussed in the Wheeler-DeWitt framework, we ask whether the dynamics of the observed sector can be ordered using only internal degrees of freedom. To this end, we construct an entropic time from an experimentally defined coarse-grained entropy, and demonstrate that it can robustly order the events in the observed sector across repeated cycles of expansion and recollapse. We finally derive an effective Schroedinger equation parameterized by this internal time and show that it is able to reproduce the measured evolution. These results establish a controlled experimental setting in which relational-time constructions can be quantitatively tested.

05.
arXiv (CS.AI) 2026-06-19

Leveraging systems' non-linearity to tackle the scarcity of data in the design of Intelligent Fault Diagnosis Systems

arXiv:2606.20323v1 Announce Type: new Abstract: Deep Transfer Learning (DTL) allows for the efficient building of Intelligent Fault Diagnosis Systems (IFDS). On the other hand, DTL methods still heavily rely on large amounts of labelled data. Obtaining such an amount of data can be challenging when dealing with machines or structures faults. This document proposes a novel approach to the design of vibration-based IFDS using DTL in condition of strong data scarcity. A periodic multi-excitation level procedure leveraging intrinsic non-linearities of real-world systems is used to produce images that can be conveniently analysed by pre-trained Convolutional Neural Networks (CNNs) to diagnose faults. A new data visualization method and its augmentation technique are proposed in this paper to tackle the typical lack of data encountered during the design of IFDS. Experimental validation on a railway pantograph structure provides effective support for the proposed method.

06.
arXiv (CS.AI) 2026-06-16

Computational Safety for Generative AI: A Hypothesis Testing Perspective

作者:

arXiv:2502.12445v2 Announce Type: replace Abstract: AI safety is a rapidly growing area of research that seeks to prevent the harm and misuse of frontier AI technology, particularly with respect to generative AI (GenAI) tools that are capable of creating realistic and high-quality content through text prompts. Examples of such tools include large language models (LLMs) and text-to-image (T2I) diffusion models. As the performance of various leading GenAI models approaches saturation due to similar training data sources and neural network architecture designs, the development of reliable safety guardrails has become a key differentiator for responsibility and sustainability. This paper presents a formalization of the concept of computational safety, which is a mathematical framework that enables the quantitative assessment, formulation, and study of safety challenges in GenAI through the lens of signal processing theory and methods. In particular, we explore two exemplary categories of computational safety challenges in GenAI that can be formulated as hypothesis testing problems. For the safety of model input, we show how sensitivity analysis and loss landscape analysis can be used to detect malicious prompts with jailbreak attempts. For the safety of model output, we elucidate how statistical signal processing can be used to detect AI-generated content. Finally, we discuss key open research challenges, opportunities, and the essential role of signal processing in computational AI safety.

07.
bioRxiv (Bioinfo) 2026-06-14

Somatic variant detection in normal tissues from single-cell sequencing data

A crucial advantage of single-cell sequencing (SCS) is its ability to identify somatic variants in individual cells, enabling phylogenetic analysis of cellular populations within bulk tissues. While identifying somatic variants in tumor tissues via SCS has become a common practice, doing so in normal tissues remains challenging due to the rarity of somatic variants in normal cells. To evaluate the feasibility of somatic variant calling from widely available single-nucleus RNA-seq (snRNA-seq) and single-nucleus ATAC-seq (snATAC-seq) data, we profiled a Cell-line mix of six HapMap samples prepared by the SMaHT consortium using 10x Genomics 5' snRNA-seq (12k cells with 36k mean reads per cell) and snATAC-seq (11k cells with 14k median high-quality fragments per cell) for variant calling. PacBio long-read whole genome sequencing (WGS) data (109x) generated from individual cell lines were used as ground truth. Two computational tools, Monopogen and SComatic, were used for somatic variant calling from the SCS data. Monopogen achieved single nucleotide variant (SNV) detection accuracies of 93.30% in the snRNA-seq and 99.64% in the snATAC-seq data, both of which outperformed SComatic (74.35% and 94.29%, respectively). Monopogen also consistently detected somatic SNVs at cellular fractions as low as 0.5% (2.54% in snRNA and 0.81% in snATAC) in individual samples. Notably, snATAC-seq exhibited higher genomic coverage breadth and larger number of variants detected than snRNA-seq. While the SCS data have lower overall genome coverage than that of the bulk WGS, the single-cell level variant resolution allows Monopogen to assign variants to their cells of origin with over 80% accuracy in both RNA and ATAC modalities, thereby facilitating studies of clonal evolution and cell-type-specific mutagenesis. Other benchmarking methods were also evaluated (DeepVariant, Cellsnp-lite and Mutect2) for comparison. In conclusion, our study demonstrated the feasibility of performing reliable single-cell somatic mutation calling in a cell-line mixture and discussed the strengths and limitations of current computational methods when applied to normal tissues.

08.
medRxiv (Medicine) 2026-06-18

Multicluster measles outbreak with a substantial proportion of modified cases in Tokyo, Japan, January-May 2026

Tokyo experienced a measles outbreak (260 cases) in early 2026 despite elimination status. Adults aged 20-39 years were most affected, and 38% of cases were modified measles, increasing with prior vaccination. Although incidence rose until April, the effective reproduction number; R(t) fell below 1, consistent with outbreak control. Multiple clusters were identified, but many cases lacked epidemiological links, suggesting that modified measles is less likely to be considered in differential diagnosis. Intensive contact tracing and surveillance contributed to limiting transmission.

09.
arXiv (CS.LG) 2026-06-24

SkyJEPA: Learning Long-Horizon World Models for Zero-Shot Sim-to-Real Control of Quadrotors

arXiv:2606.23444v2 Announce Type: replace-cross Abstract: Accurate dynamics models are critical for informed decision-making in robotic systems, particularly for agile aerial vehicles operating under uncertainty. Neural network dynamics models are attractive for capturing complex nonlinear effects, but existing predictive approaches struggle with long-horizon forecasting because their autoregressive rollout mechanism amplifies errors over time. Joint Embedding Predictive Architectures (JEPAs) offer a compelling alternative by modeling dynamics in latent space, yet prior JEPA-style methods for robot navigation have been studied primarily for kinematic-level planning, with limited investigation in high-frequency control. In this work, we introduce the JEPA-style model for real-time quadrotor control. The proposed approach combines a latent dynamics model with a novel physics-inspired prober that maps frozen latents to interpretable state, enabling physically grounded long-horizon prediction. Additionally, we combine the learned model with a sampling-based optimal control solution to take advantage of its predictive capabilities for real-time control on embedded hardware. Finally, to reduce the dependence on expensive and unsafe real-world data collection, we develop a structured pipeline for automated dataset generation. Extensive open-loop and outdoor closed-loop experiments demonstrate accurate prediction, robust zero-shot sim-to-real transfer, and strong generalization across diverse operating conditions.

10.
arXiv (quant-ph) 2026-06-12

Entanglement Detection by Approximate Entanglement Witnesses

arXiv:2402.14755v2 Announce Type: replace Abstract: The problem of determining whether a given quantum state is separable is known to be computationally difficult. We develop an approach to this problem based on approximations of convex polytopes in high dimensions. By showing that a convex polytope constructed from a finite number of hyperplanes approximates the Euclidean ball arbitrarily well in high dimensions, we find evidence that a finite set of approximate entanglement witnesses is potentially sufficient to determine the entanglement of a state with high probability.

11.
arXiv (CS.AI) 2026-06-15

The Insurability Frontier of AI Risk: Mapping Threats to Affirmative Coverage, Silent Exposures, and Exclusions

arXiv:2605.18784v2 Announce Type: replace-cross Abstract: The rapid diffusion of agentic AI has created a new coverage problem for commercial insurance: some AI-mediated losses are now affirmatively insured, some create silent-AI exposure under legacy cyber, technology errors-and-omissions (E&O), directors-and-officers (D&O), employment practices liability (EPLI), crime, and media policies, and others are being actively excluded. This paper maps that emerging boundary by coding 55 AI threat classes against 26 insurance products, endorsements, and exclusion regimes using public carrier materials and OWASP/MITRE threat catalogs. We identify a four-tier insurability frontier: affirmatively insured perils, silent-AI exposures, actively excluded perils, and perils outside conventional private insurance structures. Our coding measures publicly claimed positioning rather than executed contract wording; the headline statistics describe what carriers publicly state about coverage, not what would be paid in any specific claim. Three patterns emerge. First, affirmative AI coverage is beginning to differentiate by primary risk emphasis: public materials often position Munich Re around model performance and drift, Armilla and parts of the Lloyd's market around hallucination and broader AI liability, Tokio Marine Kiln and CFC around IP and technology E&O concerns, Apollo ibott around emerging autonomous system liability, and Coalition around deepfake and AI-enabled cyber response. Second, legacy lines retain silent-AI exposure where AI is an instrumentality rather than the legal cause of loss. Third, foundation model concentration is the clearest genuinely novel insurability frontier because upstream model failure can correlate losses across many cedents at once; the relevant market design question is which insurability constraint each candidate structure relaxes, not merely which systemic risk template exists.

12.
arXiv (CS.CV) 2026-06-16

Towards Next-Generation Healthcare: A Survey of Medical Embodied AI for Perception, Decision-Making, and Action

Foundation models have demonstrated impressive performance in enhancing healthcare efficiency across a wide range of medical applications. Nevertheless, their limited ability to perceive, understand, and interact with the physical world significantly constrains their effectiveness in real-world clinical workflows, where safety-critical decision-making and physical execution are tightly coupled. Recently, embodied artificial intelligence (AI) has emerged as a promising physical-interactive paradigm for intelligent healthcare, enabling agents to operate in complex medical environments. As research in this area rapidly expands, understanding how intelligent agents function as integrated, end-to-end systems in clinical environments becomes increasingly critical. However, existing surveys on medical embodied AI largely emphasize individual aspects or functional components, lacking a unified system-level organization of the field. To support and consolidate recent advances, we systematically survey the core components of medical embodied AI, with a particular emphasis on the coordinated integration of perception, decision-making, and action. We further review representative medical applications and relevant datasets, and we analyze the major challenges encountered in real-world clinical practice. Finally, we discuss key directions for future research in this rapidly evolving field. The associated project can be found at https://github.com/VMVLab/Medical_Embodied_AI_Paper_List.

13.
arXiv (CS.CV) 2026-06-16

Gaussian Spatial Priors for Anatomy-Aware Object Detection in Surgical Videos

Detecting anatomical structures in surgical video is essential for intraoperative safety frameworks such as the Critical View of Myopectineal Orifice (CVMPO) in inguinal hernia repair. While prominent structures like the Cooper's Ligament and Triangle of Doom are reliably detected by standard methods, smaller structures such as the epigastric vessels remain challenging due to their visual ambiguity and intermittent visibility. We observe that the spatial relationship between structures is anatomically constrained, and propose a Gaussian Spatial Prior (GSP) module that encodes this relationship as a compact, parametric bias injected into the self-attention of a DAB-DETR decoder. The prior is computed offline from training annotations as a small set of frozen Gaussian parameters and recomputed at each decoder layer using the iteratively refined reference points. On a dataset of inguinal hernia repair videos with 5-fold cross-validation, GSP improves dependent class detection by $+33.5\%$ ($AP_{50}$) over DAB-DETR and $+53.9\%$ over YOLOv26, while also improving anchor detection by $+6.0\%$. These gains are statistically significant across all folds ($p=0.012$, paired $t-$test).

14.
arXiv (quant-ph) 2026-06-12

Toward Entanglement Bootstrap for Conformal Field Theory in Any Dimension

arXiv:2606.12540v1 Announce Type: cross Abstract: Given a quantum critical wavefunction in any dimension, we propose a reconstructed Hamiltonian, analogous to the ones previously found for 1+1d CFT and for 2+1d bosonic liquid topologically-ordered states. We test numerically that, for known regularized approximate CFT groundstates (on the icosahedron and the fuzzy sphere), (1) they are close to the groundstate of their reconstructed Hamiltonian, and (2) the spectrum of their reconstructed Hamiltonian on the unit sphere has CFT properties (integer spacing of descendants) and matches known low-lying energies. We show that this provides an automated method to improve the finite-size effects in a fixed Hilbert space.

15.
arXiv (CS.LG) 2026-06-11

Quantum Occam Learning: Sample-Supported Expressibility for Circuit-Based Quantum Learning

arXiv:2606.12211v1 Announce Type: cross Abstract: A central principle in quantum machine learning is that an ansatz should be expressive enough to represent the quantum data of interest. Yet, the expressibility is statistically meaningful only insofar as it can be learned from finitely many copies of an unknown quantum state. In this work, we develop an information-theoretic Occam theory for quantum data generated by finite-size quantum circuits. For the class $S_{n,G}$ of $n$-qubit pure states preparable with at most $G$ two-qubit gates, a metric-entropy argument gives the realizable sample law $\widetilde{\Theta}(G/\epsilon^2)$ in the circuit-limited regime. For an arbitrary source $\hat{\rho}$, we introduce the best $G$-gate approximation error $d_G(\hat{\rho})$ and the approximate circuit complexity $C_\eta(\hat{\rho})$. We prove an agnostic quantum Occam theorem: with $M$ copies, one can learn up to the best $G$-gate approximation error plus a statistical penalty $\widetilde{O}(\sqrt{G/M})$. We then remove the need to know $G$ in advance through an adaptive model-selection theorem whose oracle inequality selects the circuit complexity justified by the data. Matching lower bounds yield a sample-supported expressibility law: at trace-distance accuracy $\epsilon$, $M$ samples can support only $G_supported \simeq M\epsilon^2$ gates, up to logarithmic factors and tomography saturation at $2^n$. Thus, the circuit complexity becomes an adaptive statistical resource rather than a static promise. Our framework turns bounded circuit complexity into a model-selection principle for quantum machine learning.

16.
arXiv (CS.CL) 2026-06-18

Beyond Tokenization: Direct Timestep Embedding and Contrastive Alignment for Time-Series Question Answering

Recent advances in large language models (LLMs) have given rise to time-series question answering (TSQA), which formulates time-series analysis as natural-language question answering. However, directly feeding raw numerical series into LLMs suffers from a tokenization bottleneck: Byte Pair Encoding fragments continuous values into unstable tokens whose embeddings lack meaningful metric structure, resulting in the loss of magnitude, scale, and trend information. Prior methods use patch-based encoders that split the series into fixed windows, locking in one granularity that breaks patterns and hides exact timesteps, through a separate module that rarely transfers across datasets with different lengths or sampling rates. To address this challenge, we propose CADE (Contrastive Alignment with Direct Embedding), a novel framework for TSQA built upon two key components: direct timestep embedding and semantic alignment. The proposed framework maps each timestep directly into the LLM embedding space through a point-wise linear encoder and MLP projector, preserving exact index-level access while eliminating the need for patching and padding. To further bridge the semantic gap between time-series and language representations, we introduce a novel one-directional supervised contrastive loss that aligns time-series embeddings with frozen class-name text anchors. Experimental results on the public Time-MQA benchmark demonstrate that our framework consistently improves performance across six TSQA tasks, outperforming both open-source and proprietary LLM baselines.

17.
arXiv (CS.LG) 2026-06-12

Scalable anomaly detection via a univariate Christoffel function

arXiv:2606.12483v1 Announce Type: new Abstract: Anomaly detection plays a critical role in identifying unusual patterns across domains such as fraud detection, network intrusion, and system fault diagnosis. Recently, Christoffel function-based methods, rooted in polynomial optimization, have emerged as promising alternatives to deep learning due to their strong mathematical foundations and computational frugality. However, their practical applicability is hindered by the need to invert a matrix whose size grows exponentially with the data dimension, rendering the method intractable even for moderate-dimensional datasets. This paper addresses the dimensionality limitations of Christoffel function-based anomaly detection while preserving its key theoretical properties, i.e., the on-off support dichotomy behavior and the accurate support shape capture. We introduce UCF, a univariate Christoffel function which is based on the squared distance between the query point and the support points. Extensive experiments on the ADBench benchmark demonstrate that UCF consistently outperforms 14 state-of-the-art baselines in terms of Average Precision. By resolving the scalability bottleneck of the Christoffel Function, this work expands the toolkit of anomaly detection methods with a robust, theoretically grounded, and universally applicable approach.

18.
arXiv (CS.AI) 2026-06-16

Cognitive Debt: AI as Intellectual Leverage and the Dynamics of Systemic Fragility

作者:

arXiv:2606.15078v1 Announce Type: new Abstract: We develop a formal theory of cognitive debt: the stock of unverified reasoning obligations that accumulates when individuals use AI as a substitute rather than a complement for first-principles cognition. The model features two state variables per agent, cognitive capital and cognitive debt, and a multiplicative production technology in which cognitive capital functions as collateral that determines the return to AI adoption. We establish six propositions. Rational agents incur positive cognitive debt because the costs are deferred, partially external, and masked by short-run productivity gains. Tranquil periods lower subjective risk assessments, raise AI substitution intensity, and compound leverage, generating a cognitive Minsky moment in which subjective risk falls while true systemic fragility rises. Expected crisis losses are convex in aggregate leverage. Post-crisis, output-target pressure can produce a false-correction loop in which agents patch AI failures with more AI. The decentralised equilibrium over-adopts substitutive AI relative to the social optimum because of systemic risk, cognitive public goods, and arms-race externalities. In a two-type heterogeneous-agent economy, high-cognitive-capital agents adopt AI more intensively and may eventually erode their unaided cognitive capital below that of initially lower-skilled agents.

19.
arXiv (quant-ph) 2026-06-15

Efficient and simple Gibbs state preparation of the 2D toric code via duality to classical Ising chains

arXiv:2508.00126v2 Announce Type: replace Abstract: We introduce the notion of polynomial-depth duality transformations, which relates two sets of operator algebras through a conjugation by a poly-depth quantum circuit, and make use of this to construct efficient Gibbs samplers for a variety of interesting quantum Hamiltonians as they are poly-depth dual to classical Hamiltonians. This is for example the case for the 2D toric code, which is demonstrated to be poly-depth dual to two decoupled classical Ising spin chains for any system size, and we give evidence that such dualities hold for a wide class of stabilizer Hamiltonians. Additionally, we extend the above notion of duality to Lindbladians in order to show that mixing times and other quantities such as the spectral gap or the modified logarithmic Sobolev inequality are preserved under duality.

20.
arXiv (CS.LG) 2026-06-18

Generalised Eigenvalue Geometry of Semantic Adversarial Attacks

arXiv:2606.19212v1 Announce Type: cross Abstract: Recent empirical work shows that semantically equivalent paraphrases can fool financial sentiment classifiers: although a paraphrase remains close to the original under a strong reference embedding, it may shift the target model's representation enough to change the predicted class. Existing robustness theory either assumes a single-model threat model or focuses mainly on empirical attack algorithms. We develop a continuous local model of semantic paraphrase perturbations that captures this two-model structure. We show that the worst-case local displacement of the target representation, subject to a proxy-model budget, is governed by the largest generalised eigenvalue of a matrix pencil $(A,B)$ constructed from the Jacobians of the two embedding maps. The resulting attackability index $\lambda^*(x)$ is intrinsic to the local paraphrase geometry and the chosen embedders, yields a closed-form prediction-flip condition for affine readouts, and supports conservative population and finite-sample attackability certificates. For uniform control over classes of affine readouts, we derive a distribution-free VC bound for binary attackability indicators and a scale-sensitive margin bound based on an attackability-adjusted margin that subtracts a local geometric penalty from the standard classifier margin. We also connect the continuous theory to discrete paraphrase search, identify an asymmetry between successful and unsuccessful finite searches, and give a covering condition under which the discrete and continuous settings agree. Finally, we propose an empirical verification framework using soft-token relaxations and generated paraphrase sets to assess the local eigenvalue geometry, prediction-flip condition, and finite-search approximation on a deployed financial-text classifier.

21.
arXiv (CS.CV) 2026-06-16

Teacher-Student Structure for Domain Adaptation in Ensemble Audio-Visual Video Deepfake Detection

The rapid advancement of generative AI models is leading to more realistic deepfake media, encompassing the manipulation of audio, video, or both. This raises severe privacy and societal concerns. Numerous studies in this area have yielded promising intra-domain results; however, these models frequently exhibit decreased efficacy when faced with data from dissimilar domains. Consequently, recent deepfake detection approaches focus on enhancing the generalization ability through multiple techniques that incorporate all input modalities, including audio, images, and their interactions. In this regard, we propose the EAV-DFD method, a generalized deep ensemble audio-visual model (EAV-DFD) combined with a domain adaptation mechanism utilizing a teacher-student framework to enhance the model's ability to perform and generalize effectively across unseen domains. To evaluate the model's performance, we used the FakeAVCeleb dataset as the primary domain and the DFDC, Deepfake_TIMIT, and PolyGlotFake datasets as an unseen domain. Our experimental results demonstrate that the proposed framework is efficient in domain adaptation, improving AUC performance of the model by 4.09%, 17.94%, and 0.5% on three unseen datasets, using only a small portion of them to train the student model. This leads to a novel deepfake detection model capable of adapting to new domains and interpreting which modality has been manipulated, highlighting the potential of our approach for real-world applications.

22.
arXiv (CS.LG) 2026-06-11

MemNovo: Look Back at the Spectrum for Balanced De Novo Peptide Sequencing from Mass Spectrometry

arXiv:2606.11868v1 Announce Type: new Abstract: De novo peptide sequencing from tandem mass spectrometry is pivotal in proteomics, enabling identification of novel peptides without reference databases. While recent Transformer-based encoder-decoder models have achieved remarkable performance, we uncover a critical pathology in their inference dynamics. Through comprehensive feature scaling experiments, we demonstrate that existing auto-regressive peptide decoders tend to over-rely on generated-sequence priors while progressively under-utilizing fine-grained physical evidence from the input mass spectrum. This phenomenon leads to suboptimal results, where generated peptide sequences are biologically plausible yet not faithful to the input spectrum. To rectify this, we propose MemNovo, a training-free and plug-and-play mechanism that re-balances peptide and spectral contributions at inference time. MemNovo alleviates the information bottleneck by establishing a persistent spectral memory bank and injecting retrieved features directly into the final decoding stage via an ultra-conservative residual connection. Theoretical analysis confirms that this mechanism restores the mutual information between the decoder state and the raw spectrum. Extensive experiments on the Nine Species benchmark with two representative baselines, Casanovo and InstaNovo, demonstrate that MemNovo consistently improves both amino acid precision and peptide precision, achieving up to 39.1% relative improvement in peptide precision for Casanovo and up to 3.9% for InstaNovo, with negligible computational overhead.

23.
arXiv (CS.AI) 2026-06-16

Learning in the Recurrent State: Gradient Descent with Linear Recurrent Networks

arXiv:2410.11687v3 Announce Type: replace-cross Abstract: Linear recurrent networks (LRNNs) offer linear-time sequence modeling, but standard recurrent updates do not directly expose the supervised products needed for in-context gradient descent. We propose a sufficient constructive inductive bias for LRNNs: equip a diagonal recurrent state with multiplicative readout and a short sliding-window cross-product self-attention update. The resulting architecture, Gradient-based Recurrent In-context Learner (GRIL), can implement minibatch gradient descent on a task-specific linear predictor during a single forward pass. The same design extends to multi-step updates and cross-entropy classification, with a limited MLP-based extension to non-linear regression. Empirically, trained GRILs recover the behavior and parameters predicted by the construction on synthetic ICL tasks, and the same architectural bias yields useful performance on Long Range Arena and language modelling. These results present windowed cross-product self-attention as a practical, testable inductive bias for LRNNs that learn in context through gradient-descent-like updates.

24.
arXiv (CS.LG) 2026-06-16

Physics-conforming Latent Twins

arXiv:2606.15053v1 Announce Type: new Abstract: Surrogate models are central to scientific machine learning, where they enable fast prediction, simulation, inference, and control for complex physical systems. For time-dependent problems, however, accurate interpolation of training trajectories is not sufficient: reliable surrogates should also respect the conservation laws, invariants, admissibility conditions, and dissipative structures that give those trajectories physical meaning. We introduce Physics-conforming Latent Twins, a framework for learning latent surrogate solution operators whose dynamics satisfy selected physical principles by design. The method builds on the Latent Twin formulation by jointly learning an encoder, a decoder, and a latent flow map between arbitrary time-indexed states, while constraining the latent dynamics to preserve or dissipate prescribed structural quantities. We develop a constraint-transfer viewpoint that connects physical structure in the original state space with compatible constraints in latent space, and prove structure-preservation bounds showing how latent enforcement improves control of physical defects after decoding. We also derive algebraic conditions for latent flow maps that preserve linear and quadratic invariants or enforce dissipative inequalities. Numerical experiments on representative ODE and PDE benchmarks demonstrate improved constraint satisfaction, structural fidelity, and qualitative long-time behavior while maintaining accurate surrogate prediction.

25.
arXiv (quant-ph) 2026-06-24

Clifford Volume and Free Fermion Volume: Complementary Scalable Benchmarks for Quantum Computers

arXiv:2512.19413v2 Announce Type: replace Abstract: As quantum computing advances toward the late-NISQ and early fault-tolerant eras, scalable and platform-independent benchmarks are essential for quantifying computational capacity in a classically verifiable manner. We introduce two volumetric benchmarks, Clifford Volume and Free Fermion Volume, that assess quantum hardware by testing the execution of random Clifford and free fermion operations. These two groups of unitaries possess a combination of properties that make them ideal for benchmarking: (i) each is individually efficient to simulate classically, enabling verification at scale; (ii) together they form a universal gate set; (iii) they serve as essential algorithmic primitives in practical applications (including shadow tomography and quantum chemistry); and (iv) their definitions are formulated abstractly, without explicit reference to hardware-specific features such as qubit connectivity or native gate sets. This framework thus enables scalable and fair cross-platform comparisons and tracks meaningful computational advancement. We demonstrate the practical feasibility of these benchmarks through extensive numerical simulations across realistic noise parameters and through experimental validation on Quantinuum's H2-1 trapped-ion quantum computer, which achieves a Clifford Volume of 34.