Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-11

Second-order PACF asymptotics and discrimination between fractional Gaussian noise and $\operatorname{FARIMA}(0,d,0)$

arXiv:2605.31416v2 Announce Type: replace-cross Abstract: Fractional Gaussian noise and $\operatorname{FARIMA}(0,d,0)$ have the same long-memory pole $|\theta|^{-2d}$ and hence the same leading PACF law $\alpha(n)\sim d/n$. We show that this agreement breaks at the first non-universal order. For $0
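The leading law alpha(n) ~ d/n quoted above can be checked numerically on the FARIMA(0,d,0) side. The sketch below is not taken from the paper. It assumes the classical FARIMA(0,d,0) autocorrelation recursion rho(k) = rho(k-1)(k-1+d)/(k-d) and the classical closed form alpha(n) = d/(n-d) (Hosking, 1981), and recovers the PACF with the Durbin-Levinson recursion. The function names are illustrative.

```python
def farima_acf(d, max_lag):
    """Autocorrelations rho(0..max_lag) of FARIMA(0,d,0), for 0 < d < 1/2."""
    rho = [1.0]
    for k in range(1, max_lag + 1):
        # Classical recursion: rho(k) = rho(k-1) * (k - 1 + d) / (k - d)
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


def pacf_durbin_levinson(rho):
    """Partial autocorrelations alpha(1..max_lag) from autocorrelations."""
    max_lag = len(rho) - 1
    pacf = []
    prev = []  # prev[j] holds phi_{n-1, j+1}
    for n in range(1, max_lag + 1):
        num = rho[n] - sum(prev[j] * rho[n - 1 - j] for j in range(n - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(n - 1))
        phi_nn = num / den
        # Update the remaining prediction coefficients for order n
        prev = [prev[j] - phi_nn * prev[n - 2 - j] for j in range(n - 1)] + [phi_nn]
        pacf.append(phi_nn)
    return pacf


d = 0.3
alpha = pacf_durbin_levinson(farima_acf(d, 50))
# n * alpha(n) should drift toward d as n grows (leading law d/n)
scaled = [n * a for n, a in enumerate(alpha, start=1)]
```

Only the FARIMA side is shown; the fractional Gaussian noise side would need its own autocorrelation function, and the second-order difference the abstract refers to is not reproduced here.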

02.
medRxiv (Medicine) 2026-06-23

Acute Ischemic Stroke Detection on Non-Contrast CT: A Deep Learning Approach

Acute ischemic stroke (AIS) is a leading cause of disability and death, and effective treatment requires quick and accurate diagnosis. Non-contrast CT (NCCT) is widely used in the initial screening of AIS, but stroke detection is challenging because early changes on NCCT are subtle or indistinguishable. Using hyperacute NCCTs as inputs and diffusion-weighted MRI as ground truth, we trained a deep learning algorithm to classify patients with AIS and segment the stroke lesions. We hypothesized that this approach would accurately detect hyperacute tissue density changes on NCCT. For the classification task, our ResNet50 model delivered the best performance (98.5% accuracy, 97.4% precision, and 100% recall on an evaluation set). Classification performance remained strong when restricted to lesions smaller than 5 mL, which constituted the majority of our evaluation cases. For the segmentation task, performed with a range of U-Net architectures, performance was acceptable for large lesions and declined sharply for smaller lesions. Together, these findings demonstrate the feasibility of deep learning for AIS detection and represent a step towards faster triage and treatment for stroke patients.

03.
arXiv (CS.CL) 2026-06-12

Constrained Semantic Decompression in LLMs through Persian Proverb-Conditioned Story Generation

Transforming a dense, abstract proverb into an engaging and morally faithful narrative requires deep cultural understanding and robust semantic grounding. We frame this problem as a constrained semantic decompression task and study proverb-conditioned story generation as a testbed for abstraction-to-realization in large language models (LLMs). Focusing on Persian, we introduce the Proverb Aligned Narrative Dataset (PAND), pairing proverbs with human-written stories and explicit meanings. Using a hybrid evaluation framework that combines a human-calibrated LLM-as-a-Judge with structural metrics, we analyze model behavior across multiple prompting regimes. Our findings reveal a persistent decompression gap: current LLMs often achieve strong surface-level fluency while failing to faithfully instantiate the underlying moral and causal structure encoded in proverbs. We further show that explicit reasoning and iterative refinement can partially mitigate these failures, suggesting that many decompression errors arise from difficulties in translating abstract meaning into narrative form rather than a complete lack of relevant knowledge. Our proposed task naturally extends to other forms of compressed cultural knowledge.

04.
arXiv (CS.CL) 2026-06-16

On Defining Erasure Harms for NLP

The deployment of NLP systems has raised concerns about harms they might produce, including representational harms. Recent literature has begun to conceptualize and measure one such harm, the harm of erasure. Nevertheless, the field lacks a clear and cohesive conceptual foundation for identifying and measuring erasure. Existing conceptualizations of erasure are often broad – making it difficult to identify what is needed to establish and measure erasure – or else specific to particular settings – facilitating measurement for those settings but potentially challenging to adapt to other settings. To address this gap, we develop and propose a structured definition of erasure that clarifies what components are necessary for establishing whether erasure has occurred, which practitioners need to explicitly articulate and operationalize in order to measure erasure.

05.
arXiv (math.PR) 2026-06-24

Conditioning of incoherent sub-dictionaries sampled from a coherent dictionary

arXiv:2606.24323v1 Announce Type: new Abstract: Motivated by the desire to find a realistic and stable random model for $d$-dimensional signals that are sparse in a transform-based and thus often coherent frame, such as a wavelet or a Gabor frame, we study the conditioning of incoherent sub-dictionaries sampled from a coherent dictionary, such as a unit norm frame. In particular, we show that if the sub-dictionary is selected via a coherence rejective Poisson sampling model, it is well-conditioned with high probability, as long as its expected size scales as $d/\log (K)$, where $K$ is the number of dictionary elements. The result is proved for the more general case of sampling quadratic sub-matrices from a real but not necessarily symmetric $K\times K$ matrix with zero diagonal, where coherence rejective sampling is defined via a symmetric mask that acts as a coherence substitute.

06.
medRxiv (Medicine) 2026-06-12

Estimating the effectiveness of syndromic screening at airports for Bundibugyo ebolavirus disease

We used a stochastic simulation model to estimate the effectiveness of combined exit and entry airport screening for Bundibugyo ebolavirus disease (BVD), using natural-history parameters from a Bayesian re-analysis of the 2012 Isiro outbreak. For a 12-hour international flight from DRC or Uganda at 86% screening sensitivity, we estimate 65% of infected travellers would arrive undetected (95% CrI: 38–76%). The main driver of this outcome is the relative duration of the incubation period (approximately 7.7 days) and the onset-to-severe-disease interval (approximately 4 days): most infected travellers board before symptom onset and are undetectable by any syndromic screen, whilst those who are symptomatic progress rapidly to illness severe enough to preclude travel. This is compounded during active epidemic growth, when recently exposed (and therefore pre-symptomatic) cases are overrepresented among travellers. Syndromic airport screening offers limited protection against BVD spread via air travel, and should be complemented by outbreak control at source and strengthened clinical surveillance in receiving countries with high travel connectivity to affected areas.

07.
arXiv (quant-ph) 2026-06-16

Bright Emission from Dark Sources in Hyperbolic Media

arXiv:2606.16071v1 Announce Type: cross Abstract: Hyperbolic media enable ultra-strong light-matter interactions through their extreme field localization and small mode volumes, but low-loss realizations are fundamentally limited to the mid-infrared, owing to the long lifetimes of optical phonons in high-quality crystals. Here we show that bright emitters operating at visible or near-infrared frequencies can be used to generate radiation in this regime by inducing mid-infrared population dynamics, thereby creating a source in the hyperbolic frequency band without a corresponding dipole transition. We demonstrate that even a source with vanishing dipole and higher multipole moments - strictly non-radiating in any isotropic medium - becomes radiatively active in a hyperbolic environment. This enables visible and near-infrared control of light-matter interactions in polaritonic hyperbolic materials, establishing a new low-loss solid-state quantum optics platform.

08.
arXiv (CS.AI) 2026-06-24

DynamicPO: Dynamic Preference Optimization for Recommendation

arXiv:2605.00327v3 Announce Type: replace-cross Abstract: In large language model (LLM)-based recommendation systems, direct preference optimization (DPO) effectively aligns recommendations with user preferences, requiring multi-negative objective functions to leverage abundant implicit-feedback negatives and sharpen preference boundaries. However, our empirical analyses reveal a counterintuitive phenomenon, preference optimization collapse, where increasing the number of negative samples can lead to performance degradation despite a continuously decreasing training loss. We further theoretically demonstrate that this collapse arises from gradient suppression, caused by the dominance of easily discriminable negatives over boundary-critical negatives that truly define user preference boundaries. As a result, boundary-relevant signals are under-optimized, weakening the model's decision boundary. Motivated by these observations, we propose DynamicPO (Dynamic Preference Optimization), a lightweight and plug-and-play framework comprising two adaptive mechanisms: Dynamic Boundary Negative Selection, which identifies and prioritizes informative negatives near the model's decision boundary, and Dual-Margin Dynamic beta Adjustment, which calibrates optimization strength per sample according to boundary ambiguity. Extensive experiments on three public datasets show that DynamicPO effectively prevents optimization collapse and improves recommendation accuracy on multi-negative preference optimization methods, with negligible computational overhead. Our code and datasets are available at https://github.com/xingyuHuxingyu/DynamicPO.

09.
arXiv (CS.AI) 2026-06-17

No-Free-Fairness: Fundamental Limits and Trade-offs in Learning Systems

arXiv:2606.17810v1 Announce Type: cross Abstract: In this paper, we establish a set of theoretical impossibility results, termed the No-Free-Fairness theorems, that identify three fundamental sources of disparity in learning systems. First, we show that when a task exhibits irreducible cost on a subgroup, any decision rule must trade off overall performance with disparity, yielding an inherent fairness–cost frontier. Second, we prove that even in ideal, noise-free settings where a perfectly fair and accurate solution exists, finite-sample learning alone induces nontrivial subgroup disparity, ruling out distribution-free fairness guarantees. More seriously, enforcing strict relative fairness creates a statistical bottleneck: achieving low cost may require exponentially many samples. Third, we show that limitations of the model class can independently induce disparity: if the model cannot represent accurate solutions for a subgroup, fairness remains unattainable regardless of data or training procedure. Overall, these results demonstrate that unfairness is not solely a consequence of biased data or suboptimal optimization, but arises from the intrinsic structure of decision problems, the constraints of finite data, and the expressivity of models. Our framework applies broadly beyond standard supervised learning, and suggests that achieving fairness requires explicit trade-offs and should be treated as a core design consideration.

10.
arXiv (CS.CV) 2026-06-16

EcoBin: A Two-Stage Deep Convolutional Neural Network for Contamination-Aware Waste Classification

Waste classification models have become highly accurate at sorting waste, often exceeding 95% on benchmark datasets. However, these models fail to account for contamination in recyclable waste. We present EcoBin, a two-stage deep convolutional neural network that classifies household waste by its disposal pathway and that explicitly accounts for contamination. The first stage is a base waste classifier built on an EfficientNetV2-S backbone that assigns each of the thirty waste categories in our dataset to one of four disposal pathways. The second stage is a contamination classifier that inspects any item routed toward recycling and overrides the decision to garbage when contamination is detected. Because no public dataset of contaminated recyclables exists, we synthesize one by segmenting images of clean recyclable objects with a U2-Net model and compositing realistic contamination textures onto their surfaces. The first stage achieves 87.42% test accuracy and a 96.13% pathway-adjusted accuracy. Meanwhile, the contamination stage distinguishes clean from contaminated items with a 0.99 ROC-AUC. On a test set of contaminated recyclables, the complete pipeline routes 24 of 25 items correctly, compared with only 1 of 25 for the base classifier alone. A McNemar's test confirms that the improvement contributed by the contamination stage is statistically significant (p < 0.001).
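The two-stage routing described above can be sketched as a small control-flow wrapper. This is a minimal illustration, not the authors' code: the two classifiers are passed in as hypothetical callables, and the only pathway labels assumed are "recycling" and "garbage", which the abstract itself names. The other pathway labels in the example are placeholders.

```python
from typing import Any, Callable


def route_item(
    item: Any,
    base_pathway: Callable[[Any], str],      # hypothetical stage-1 classifier
    is_contaminated: Callable[[Any], bool],  # hypothetical stage-2 classifier
) -> str:
    """Return the disposal pathway for one item.

    Stage 1 proposes a pathway. Stage 2 runs only for items routed
    toward recycling and overrides the decision to garbage when
    contamination is detected.
    """
    pathway = base_pathway(item)
    if pathway == "recycling" and is_contaminated(item):
        return "garbage"
    return pathway


# Toy stand-ins for the two models (placeholders, not trained networks)
def toy_base(item: dict) -> str:
    return item["pathway"]


def toy_contamination(item: dict) -> bool:
    return item.get("dirty", False)


clean_bottle = {"pathway": "recycling", "dirty": False}
greasy_box = {"pathway": "recycling", "dirty": True}
decisions = [route_item(x, toy_base, toy_contamination) for x in (clean_bottle, greasy_box)]
```

Because the second stage is consulted only for recycling-bound items, it can never change a decision for any other pathway, which matches the override behaviour described in the abstract.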

11.
arXiv (CS.AI) 2026-06-11

Feature-Aligned Speech Watermarking for Robustness to Reconstruction Distortions

arXiv:2606.11828v1 Announce Type: cross Abstract: Audio watermarking aims to embed identifiable information into audio while remaining imperceptible. Existing methods adopt high-fidelity, low-energy designs to preserve perceptual quality, but the resulting watermarks lack robustness under suppression by speech reconstruction models. Improving robustness is challenging due to the inherent robustness-fidelity trade-off in existing designs, where increasing watermark energy improves robustness but reduces fidelity. To address this problem, we propose a feature-aligned watermarking method that aligns the watermark with the original speech feature distribution, allowing higher watermark energy to improve robustness while preserving imperceptibility. We use a pretrained speech codec to generate a pseudo-speech watermark and fuse it into the spectrogram of the input audio, with VAD loss and perceptual losses guiding embedding within voiced regions. Experiments show that our method maintains imperceptibility comparable to existing approaches while substantially improving robustness under both seen and unseen speech reconstruction models.

12.
arXiv (CS.CL) 2026-06-16

TokenPilot: Cache-Efficient Context Management for LLM Agents

As LLM agents are deployed in long-horizon sessions, context accumulation drives up inference costs. Existing approaches utilize text pruning or dynamic memory eviction to minimize token footprints; however, their unconstrained sequence mutations alter layouts, introducing prefix mismatches and cache invalidation. This reveals a critical trade-off between text sparsity and prompt cache continuity. To address this, we present TokenPilot, a dual-granularity context management framework. Globally, Ingestion-Aware Compaction acts as a framework harness to stabilize prompt prefixes and eliminate open-world environmental noise at the ingestion gate. Locally, Lifecycle-Aware Eviction monitors the ongoing residual utility of context segments, enforcing a conservative batch-turn schedule to offload content segments only when task relevance expires. Experiments on PinchBench and Claw-Eval under both isolated and continuous modes demonstrate that TokenPilot reduces costs by 61% and 56% in isolated mode, and 61% and 87% in continuous mode, while maintaining competitive performance compared to prior systems. TokenPilot has been integrated into LightMem2 at https://github.com/zjunlp/LightMem2.

13.
arXiv (CS.AI) 2026-06-11

ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning

arXiv:2603.22934v3 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) improves large language model applications by grounding generation in retrieved evidence, but also introduces corpus poisoning as a new attack surface. In this setting, an adversary injects or edits passages so that they enter the Top-$K$ results for target queries and influence downstream generation. Existing defences often rely on content filtering, auxiliary models, or generator-side reasoning, which complicates deployment. We propose ProGRank, a post hoc, training-free retriever-side defence for dense-retriever RAG. ProGRank stress-tests each query–passage pair under mild randomized perturbations, extracts probe gradients from a small fixed parameter subset, and derives two instability signals: representational consistency and dispersion risk. It then combines these signals with a score gate for reranking. ProGRank preserves the original passage content, requires no retraining, and supports a surrogate-based variant when the deployed retriever is unavailable. Experiments across datasets, retrievers, attacks, and retrieval-stage and end-to-end settings show that ProGRank improves robustness and maintains a favorable robustness–utility trade-off, including under adaptive evasive attacks.

14.
arXiv (CS.AI) 2026-06-24

Female-RHINO: A Real-Time Scanner-Integrated Framework for Automated Quantitative Uterine MRI Analysis and Structured Reporting

arXiv:2606.24390v1 Announce Type: cross Abstract: Standardized assessment of uterine MRI remains challenging due to anatomical variability, observer dependence, and the lack of workflow-integrated automated analysis tools. This work presents Female-RHINO: (R)eproductive (H)ealth (I)maging A(N)alysis T(O)ol, a real-time AI-assisted framework for automated quantitative uterine MRI analysis and structured reporting during image acquisition. We present an end-to-end system that integrates inline communication with the MRI scanner and deep learning-based analysis to derive quantitative uterine biomarkers from sagittal T2-weighted pelvic MRI. The framework combines segmentation and anatomical landmark detection models trained and evaluated on more than 500 multi-center datasets spanning diverse protocols, vendors, and patient populations. It performs volumetry, detects and quantifies common incidental findings such as fibroids and Nabothian cysts, and extracts six anatomical landmarks for biometric assessment. Results are compiled into a structured clinician-oriented report with integrated visualizations, without manual interaction. Evaluation on independent retrospective and prospective cohorts demonstrated robust performance across varying acquisition settings. Mean Dice similarity coefficients were 0.82 for the uterus and 0.80 for fibroids, with lower but consistent agreement for Nabothian cysts. Landmark detection achieved a mean radial error of 3.7 mm. End-to-end processing was completed in under 70 seconds, enabling availability of results during the ongoing scan. Prospective deployment yielded immediate, standardized, and reproducible analyses supported by inter-observer agreement. The proposed system enables real-time scanner-integrated AI for automated uterine MRI analysis and reporting, with potential to improve standardization, efficiency, and clinical workflow in pelvic imaging.
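For readers unfamiliar with the two metrics reported above, the following sketch shows the standard definitions of the Dice similarity coefficient and the mean radial error. It is an illustration under simple assumptions (flattened binary masks, landmark coordinates already in millimetres) and is not taken from the Female-RHINO implementation.

```python
import math
from typing import Sequence, Tuple


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_radial_error(
    pred: Sequence[Tuple[float, ...]], truth: Sequence[Tuple[float, ...]]
) -> float:
    """Mean Euclidean distance between predicted and reference landmarks."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("need equally many, at least one, landmarks")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


example_dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
example_mre = mean_radial_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```

A Dice value of 1 means the predicted and reference masks coincide, and 0 means they do not overlap at all.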

15.
Nature Medicine 2026-06-10

Brain Health for Economic Resilience: a data-driven framework for the brain-positive economic transition

Announced in this Comment and in collaboration with Nature Medicine is the convening of the Brain Health for Economic Resilience Commission, a global, transdisciplinary effort to define, measure and operationalize brain health and cognitive capacity as foundational drivers of economic resilience.

16.
arXiv (CS.AI) 2026-06-18

WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks

arXiv:2604.06367v2 Announce Type: replace-cross Abstract: Web agents automate browser tasks, ranging from simple form completion to complex workflows like ordering groceries. While current benchmarks evaluate general-purpose performance (e.g., WebArena) or safety against malicious actions (e.g., SafeArena), no existing framework assesses an agent's ability to successfully execute user-facing website security and privacy tasks, such as managing cookie preferences, configuring privacy-sensitive account settings, or revoking inactive sessions. To address this gap, we introduce WebSP-Eval, an evaluation framework for measuring web agent performance on website security and privacy tasks. WebSP-Eval comprises 1) a manually crafted task dataset of 200 task instances across 28 websites; 2) a robust agentic system supporting account and initial state management across runs using a custom Google Chrome extension; and 3) an automated evaluator. We evaluate a total of 8 web agent instantiations using state-of-the-art multimodal large language models, conducting a fine-grained analysis across websites, task categories, and UI elements. Our evaluation reveals that current models suffer from limited autonomous exploration capabilities to reliably solve website security and privacy tasks, and struggle with specific task categories and websites. Crucially, we identify stateful UI elements as a primary reason for agent failure, with toggles causing more than 45% task failure across many models.

17.
arXiv (CS.AI) 2026-06-11

What Limits Does Quantization Place on Dense Top-$k$ Retrieval? A Theoretical Study

arXiv:2606.11780v1 Announce Type: cross Abstract: We establish conditions for embedding a corpus of $N$ documents as $d$-dimensional vectors such that every $k$-subset $S \subseteq [N]$ is realizable as a result of top-$k$ retrieval by some query vector. Recent work shows that $d = O(k)$ suffices for such embeddings to exist in $\mathbb{R}^d$, independently of $N$. We theoretically prove that this corpus-independent bound is specific to infinite precision. With $B$ bits per coordinate, perfect top-$k$ retrieval requires $Bd = \Omega(k \ln N)$; thus, at any fixed precision, the dimension must grow at least logarithmically with $N$. Specializing to an $\ell_2$-normalized $B$-bit uniform scalar quantization model, we also identify a threshold on the precision $B^{*} = O(\ln \ln N)$ below which no dimension suffices, together with two further regimes that bound the feasible $(B, d)$ pairs. Our result implies that in practical vector databases and dense retrieval systems where quantization is standard, the embedding dimension and possibly the precision must grow with the corpus size.
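To get a feel for how the bound $Bd = \Omega(k \ln N)$ scales, the sketch below evaluates the quantity $k \ln N / B$ for a few corpus sizes. This is only a scaling illustration: the constant hidden in the $\Omega$ is not given in the abstract, so the returned numbers are not actual minimum dimensions, and the function name is made up for this example.

```python
import math


def dimension_scale(k: int, num_docs: int, bits_per_coord: int) -> float:
    """Return k * ln(N) / B, the growth rate implied by B*d = Omega(k ln N).

    The unknown constant factor is deliberately left out, so only
    ratios between values are meaningful.
    """
    if k < 1 or num_docs < 2 or bits_per_coord < 1:
        raise ValueError("need k >= 1, N >= 2, B >= 1")
    return k * math.log(num_docs) / bits_per_coord


# At fixed precision (B = 8) and k = 10, the scale grows with ln N
scales = {n: dimension_scale(10, n, 8) for n in (10**4, 10**6, 10**8)}
```

Going from $10^4$ to $10^8$ documents doubles $\ln N$, so at fixed $B$ and $k$ the implied dimension scale doubles as well, while doubling $B$ halves it.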

18.
arXiv (CS.CL) 2026-06-16

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

We introduce HK-LegiCoST, a new three-way parallel corpus of Cantonese-English translations, containing 600+ hours of Cantonese audio, its standard traditional Chinese transcript, and English translation, segmented and aligned at the sentence level. We describe the notable challenges in corpus preparation: segmentation, alignment of long audio recordings, and sentence-level alignment with non-verbatim transcripts. Such transcripts make the corpus suitable for speech translation research when there are significant differences between the spoken and written forms of the source language. Due to its large size, we are able to demonstrate competitive speech translation baselines on HK-LegiCoST and extend them to promising cross-corpus results on the FLEURS Cantonese subset. These results deliver insights into speech recognition and translation research in languages for which non-verbatim or "noisy" transcription is common due to various factors, including vernacular and dialectal speech.

19.
bioRxiv (Bioinfo) 2026-06-11

DLDN-Bench: A Benchmark Framework for Deep Learning de Novo Peptide Sequencing in Proteomics

De novo peptide sequencing is an essential approach for analyzing mass spectrometry data because it enables the identification of novel peptides without relying on protein sequence databases. Recent advances in deep learning have substantially improved the performance of de novo sequencing methods, but the rapid emergence of new models has led to heterogeneous evaluation practices and limited comparability. To address this, we introduce DLDN-Bench, a benchmark framework including a set of benchmark datasets derived from human muscle biopsy mass spectrometry data retrieved from PRIDE and annotated through consensus across multiple widely used database search engines. Using these datasets, we systematically benchmark recent deep learning-based de novo sequencing tools alongside traditional approaches. Performance is assessed using established metrics, including precision and coverage relative to a pseudo-ground truth defined by cross-engine agreement. To demonstrate the utility of DLDN-Bench, we benchmark four recent deep learning models and make all results publicly available. This benchmark framework provides a standardized basis for comparing state-of-the-art methods and offers an extensible resource for evaluating future tools in de novo peptide sequencing.

20.
arXiv (quant-ph) 2026-06-11

Enhancing Many-Body Chaos via Entropy Injection from Environment

arXiv:2606.11784v1 Announce Type: new Abstract: In closed quantum systems, local information spreads throughout the entire system and becomes highly complex under unitary evolution. In contrast, when the system is embedded in an environment, system-environment coupling can transfer information from the system into the environment, thereby reducing the rate of complexity growth within the system. This leads to the environment-induced scrambling transition established in previous works. In this work, we identify entropy injection from the environment as a different physical process that instead enhances many-body chaos. Our setup consists of coupling a system that is already in equilibrium with one environment to another environment, which serves as an entropy reservoir and drives the system into a non-equilibrium state. When entropy flows into the system through either heat transfer or particle transfer, the effective Hilbert space explored by the system enlarges, a mechanism that can enhance many-body chaos. We explicitly demonstrate this idea by constructing a solvable complex Brownian SYK model, in which both the relaxation toward the steady state and the steady-state quantum Lyapunov exponent can be computed analytically. Our results provide a controllable mechanism for tuning quantum scrambling through entropy flow in quantum many-body systems coupled to environments.

21.
arXiv (CS.AI) 2026-06-16

Z-Plane Neural Networks: Bounded Geometric Activation Replaces ReLU and LayerNorm

arXiv:2606.15669v1 Announce Type: cross Abstract: Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms inherently cause dead neurons, discard critical directional information, and destroy the orthogonality of feature representations. Inspired by the frequency-modulation transmission of biological axons, we propose the Z-Plane Neural Network, which maps hidden states into 2D phasor bundles on a hypersphere. We introduce a novel geometric activation function, Radial Bounding ($\mathbf{x} / \max(1, \|\mathbf{x}\|_2)$), which limits the energy magnitude while preserving the phase (direction). We demonstrate mathematically that this isotropic activation maintains 1-Lipschitz continuity and prevents gradient vanishing by preserving tangential gradients. Empirically, a 100-layer Z-Plane Multi-Layer Perceptron (MLP), entirely devoid of ReLU and LayerNorm, successfully converges on the MNIST dataset with 98.34% accuracy and absolute numerical stability, proving that bounded geometric activation alone is sufficient for stable deep learning.
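The Radial Bounding formula $\mathbf{x} / \max(1, \|\mathbf{x}\|_2)$ is simple enough to sketch directly. The version below applies it to a single vector in plain Python. How the paper groups hidden units into 2D phasors is not stated in the abstract, so that grouping is left out, and the function name is illustrative.

```python
import math
from typing import List, Sequence


def radial_bound(x: Sequence[float]) -> List[float]:
    """Radial Bounding: x / max(1, ||x||_2).

    Vectors inside the unit ball pass through unchanged; longer
    vectors are rescaled to unit length, keeping their direction.
    """
    norm = math.sqrt(sum(v * v for v in x))
    scale = max(1.0, norm)
    return [v / scale for v in x]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


long_vec = radial_bound([3.0, 4.0])    # norm 5, rescaled onto the unit circle
short_vec = radial_bound([0.3, 0.4])   # norm 0.5, left unchanged

# Spot check of the 1-Lipschitz property on one pair of inputs
u, v = [3.0, 4.0], [0.3, -2.0]
lipschitz_ok = l2_distance(radial_bound(u), radial_bound(v)) <= l2_distance(u, v)
```

The map coincides with the Euclidean projection onto the closed unit ball, which is one way to see why it is 1-Lipschitz. The spot check above tests a single pair of points and is not a proof.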

22.
arXiv (CS.LG) 2026-06-19

Optimal Ansatz-free Hamiltonian Learning In Situ

arXiv:2606.19486v1 Announce Type: cross Abstract: Characterizing the features of a Hamiltonian that governs a quantum system serves as a fundamental subroutine of quantum device calibration, signal sensing, and error correction. Protocols proposed in recent works have achieved the optimal Heisenberg-limited scaling for learning ansatz-free Hamiltonians from their real-time evolutions without fully specifying interaction structures. However, these protocols rely on both deep circuits with interleaving probes and control, and extremely short time resolution, making them difficult to implement on near- and intermediate-term in situ quantum experiments. In this work, we propose a computationally efficient, control-free, and ancilla-free algorithm that uses only Pauli product state preparation and measurement, and learns an ansatz-free Hamiltonian $H$ with $\|H\|\leq\Lambda$ in a total evolution time of $\Theta(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. The evolution time cost of our algorithm is optimal among control-free protocols, as we further prove a lower bound of $\Omega(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. Technically, our method introduces a randomized-sampling framework that combines band-limited kernel-based time sampling with a displacement sieve for Hamiltonian structure learning. The characteristic probe time resolution depends only on $\Lambda$ instead of $\epsilon$, which makes our protocol especially appealing in the high-precision regime for sensing and calibration applications. We also show that the algorithm maintains the same asymptotic total evolution time in the presence of state-preparation-and-measurement (SPAM) noise when the Hamiltonian is local after calibration. Our results demonstrate the fundamental cost of experimentally friendly Hamiltonian learning and provide a practical route to rigorous in situ characterization of near-term quantum platforms.

23.
arXiv (CS.CV) 2026-06-19

CMDS-AD: Cross-Modal Dual-Stream Decoupling for Few-Shot Anomaly Detection

Few-shot anomaly detection remains challenging due to limited training data. Multi-modal anomaly detection (MAD) offers a viable solution, leveraging 3D geometric cues to enrich 2D RGB representations and compensate for this scarcity. However, existing MAD methods apply spatially uniform feature processing, conflating stable macroscopic structures with high-frequency localized defect signals, exacerbating cross-modal misalignment and inflating false-positive rates. To overcome this, we present CMDS-AD, a Cross-Modal Dual-Stream Anomaly Detection framework. A LoRA-guided diffusion model generates diverse RGB samples to mitigate extreme data scarcity. For 3D normal augmentation, we employ a pre-trained diffusion model as a normal estimator. Crucially, this estimator inherently acts as a non-linear low-pass filter, directly extracting low-frequency normal representations from RGB inputs. This establishes an auxiliary estimated stream of purely low-frequency information, anchoring robust structural templates and assisting the uncompressed real stream, containing coupled high- and low-frequency components, to precisely isolate micro-defects. A Coordinate-Aware Hierarchical Feature Mapper adaptively aligns cross-modal semantics, while a multiplicative scoring mechanism filters modality-specific noise. Under the extreme 1-shot setting, CMDS-AD achieves absolute performance gains of 5.7% (I-AUROC) and 2.0% (AUPRO) on MVTec 3D-AD, alongside 7.7% and 5.6% improvements on EyeCandies, establishing a new state-of-the-art.

24.
arXiv (CS.CV) 2026-06-12

CD-RCM: Generalizable Continuous-Depth Novel View Synthesis for Reflectance Confocal Microscopy

Reflectance confocal microscopy (RCM) provides noninvasive, cellular-resolution "optical biopsies" of human skin in vivo by acquiring en-face images at successive depths, forming a sparse z-stack. Due to optical limitations, these stacks are anisotropic 3D volumes with lateral resolution (0.5 $\mu$m) $\sim$6 times higher compared to axial resolution, which is defined by the optical sectioning (3 $\mu$m), limiting the interpretation of tissue. Our goal is to provide continuous-depth visualization by interpolating intermediate sections and making the 3D volume isotropic. Such a representation permits arbitrary-direction sectioning, including histopathology-like cross-sectional examination, without requiring per-patient optimization. To that end, we introduce the first RCM-specific novel-view synthesis (NVS) approach, CD-RCM, a feedforward model that predicts realistic, unseen depths from sparsely sampled RCM stacks. Classical neural rendering methods focus on reconstruction from surface-level multi-view observations. In contrast to surface-level camera views, RCM can acquire optically sectioned en-face images of tissue beyond the surface up to 200 $\mu$m. However, during visualization of the RCM stacks, observations of the shallower sections (towards the surface) obscure the deeper ones. This unique axial imaging geometry and layer-dependent anatomical organization motivated our development of a tailored architectural and training framework that explicitly accounts for RCM's depth-resolved, occlusive imaging physics. Experiments demonstrate that CD-RCM achieves high-fidelity novel-view synthesis with sub-second inference time.

25.
arXiv (quant-ph) 2026-06-16

Grid-state deformation in a no-jump non-Hermitian bosonic dimer

arXiv:2606.17036v1 Announce Type: new Abstract: We study the no-jump evolution of ideal grid states in a lossy bosonic dimer with differential decay. The effective non-Hermitian quadratic dynamics induces a complex symplectic flow in phase space that deforms both the primitive lattice vectors and the origin seed. The average decay rate controls common attenuation, while coherent hopping and differential decay control the reduced dimer deformation. The reduced sector contains elliptic, parabolic, and hyperbolic regimes with imaginary spectra, an exceptional point, and real spectra, producing oscillatory, linear, and exponential lattice deformations. Although projected lattice areas can change, the deformation comes from a determinant-one complex symplectic flow on the full four-dimensional phase space. For a Gaussian regularization of the origin seed, we derive the associated complex width matrix and identify the positivity conditions that preserve Gaussian form. For an initial two-mode qunaught product state, the lossless limit recovers the standard beam-splitter generation of a square GKP$+$ Bell pair, while the no-jump dynamics produces its non-Hermitian deformation with a postselection cost set by the no-jump probability.