Academic Intelligence · Curated Daily

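Explore the Frontiers of Global Research

AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefs.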

01.
arXiv (CS.CL) 2026-06-15

Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

Large language models often fail to satisfy formatting instructions when they must simultaneously perform demanding tasks. We study this behaviour through a prospective-memory-inspired lens from cognitive psychology, using a controlled paradigm that combines verifiable formatting constraints with benchmark tasks of increasing complexity. Across three model families and over 8,000 prompts, compliance drops by 2-21% under concurrent task load. Vulnerability is highly type-dependent: terminal constraints (requiring action at the response boundary) degrade most, with drops up to 50%, while avoidance constraints remain comparatively robust. A salience-enhanced format (explicit instruction framing plus a trailing reminder) recovers much of the lost compliance, restoring performance to 90-100% in many settings. Interference is bidirectional: formatting constraints can also reduce task accuracy, with one model's GSM8K accuracy dropping from 93% to 27%. In additional stacking experiments, joint compliance declines sharply as constraints accumulate. All results use deterministic programmatic checkers, without an LLM-as-judge component, on publicly available datasets.
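
To make the constraint types concrete, here is a minimal sketch of deterministic checkers in Python. The specific rules (ending with a fixed marker, avoiding a given word) and all function names are illustrative assumptions, not the paper's actual checkers.

```python
import re

def check_terminal(response: str, marker: str = "END") -> bool:
    """Terminal constraint (hypothetical rule): the response must end with `marker`."""
    return response.rstrip().endswith(marker)

def check_avoidance(response: str, banned: str = "very") -> bool:
    """Avoidance constraint (hypothetical rule): `banned` must not appear as a word."""
    pattern = rf"\b{re.escape(banned)}\b"
    return re.search(pattern, response, flags=re.IGNORECASE) is None

def joint_compliance(response: str, checkers) -> bool:
    """Stacked constraints: compliant only if every checker passes."""
    return all(check(response) for check in checkers)

def compliance_rate(responses, checker) -> float:
    """Fraction of responses that satisfy a single checker."""
    return sum(checker(r) for r in responses) / len(responses)
```

Because each checker is a pure function of the response text, the same input always gets the same verdict, which is what makes an LLM judge unnecessary in this kind of setup.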

02.
arXiv (CS.CL) 2026-06-12

Trait, Not State: The Durability of Reading Identity in Social Highlighting

Prior work on a social web highlighter located individuality in selection – which documents a person chooses to highlight – but measured it cross-sectionally. We ask the temporal question: is a reader's selection signature a trait or a state? We freeze each reader's first six months of highlighting as a profile and track its own-vs-other advantage on their later selections at growing gaps (to 24+ months), with negatives drawn from the same calendar era – so supply drift cannot masquerade as personal drift – at a coarse global level and at a fine level whose negatives and controls come from the reader's own interest neighborhood; the anchor cell reproduces the prior cross-sectional level (+0.188 vs +0.169), validating the harness. Four results. Within the same users, the fine-layer advantage shows no statistically detectable paired decline at any horizon (6-12 month retention R = 1.00 [0.85, 1.18], n = 212; the farthest bin is compatible with a modest decline; the only contrast whose interval excludes zero is the coarse layer at 12-24 months, about 13%). The signal is not reducible to repeated domains (~90% survives excluding all profile sources). Within-person drift is slow (a recent-half profile beats the old half by +0.042). Prospectively, personal profiles – even one built from a reader's earliest documents, median 20 months before evaluation – rank their next reads at roughly 3x the AP of every simple non-personal prior tested. We use "trait" operationally (a stable signature under continued engagement); the scope is heavy, long-tenured readers of one platform, and exposure is not separable from choice.

03.
arXiv (math.PR) 2026-06-11

Martingale Solutions to a Stochastic Keller-Segel System with nonlocal Source and Super-linear Noise

arXiv:2606.11774v1

Global nonnegative martingale solutions are shown to exist for a stochastic Keller-Segel system with a nonlocal Fisher-KPP source and super-linear multiplicative noise. The result is obtained for nonnegative initial data with no smallness assumption, provided that the nonlocal source term is dominant. The main difficulty stems from the absence of a coercive structure and the super-linear nature of the noise. An additional cut-off with finite L^2 norm in the classical Galerkin method is added to establish a well-posed approximation problem. Moreover, due to the nonlocal Fisher-KPP structure, it is necessary to prove the positivity of the approximating solution in order to obtain uniform estimates. In the compactness arguments, the usual tightness argument in the framework of Hilbert spaces cannot be directly applied to the uniform estimates obtained in this paper. As a result, we develop a more general version of the compactness argument and tightness criterion, presented in the appendix, which will be applied throughout the paper. This allows for the global existence of nonnegative martingale solutions to be derived from Jakubowski's version of the Skorokhod Theorem, along with a thorough discussion of the convergence properties.

04.
arXiv (CS.AI) 2026-06-24

Themis: An explainable AI-enabled framework for Reinforcement Learning with Human Feedback

arXiv:2606.24622v1

Training safe Reinforcement Learning (RL) systems is inherently challenging, with no guarantee of avoiding unwanted behaviors. The most effective defenses against this are (i) transparency through explainability and (ii) alignment via human feedback. While both show promising results, no publicly available framework currently combines them. To address this, we introduce Themis, an XAI-enabled testing and evaluation framework for Reinforcement Learning from Human Feedback. Themis supports over 200 widely used environments and is easily configurable for experiments in RL, transparency, and alignment. Our results show that Themis can train reward models that match or outperform the environment's true reward signal using human preferences. We also provide a cloud-based platform for collecting human feedback and managing experiments. It is user-friendly, auto-scalable, and supports large participant groups across multiple experiments without extra development overhead. Tests show Themis can support one thousand users in back-to-back experiments on a modest commercial machine.

05.
arXiv (CS.LG) 2026-06-16

Evolutionary Bilevel Reward Shaping for Generalization in Reinforcement Learning

arXiv:2606.16236v1

Reinforcement learning (RL) often suffers from performance degradation when deployed in environments that differ from those encountered during training. Existing techniques such as domain randomization (DR) mitigate this, but require access to diverse training environments and full trajectory observability, assumptions that fail in privacy-preserving or restricted scenarios where only scalar performance metrics are available. We propose Generalization via Evolutionary Reward Shaping (GERS), a bilevel optimization approach to improve generalization on unseen test environments using only scalar feedback from validation environments. At the lower level, an RL agent guided via a reward function shaped by the upper level learns a policy on a limited set of training environments with accessible trajectory data; at the upper level, CMA-ES optimizes the reward shaping parameters to maximize the cumulative unshaped reward on separate validation environments for which trajectory access is unavailable. Results on continuous control tasks indicate that GERS outperforms the standard RL baseline on unseen test environments. GERS performance is comparable to DR, despite DR treating the combined set of training and validation environments of GERS as a single training set that requires trajectory access, whereas GERS cannot access validation trajectories. These results confirm that GERS effectively enhances generalization under restricted data access constraints.

06.
arXiv (CS.CL) 2026-06-15

SIMMER: Benchmarking Latent Failures in LLM Executable Planning with a World Model

Large language models (LLMs) are increasingly deployed as planners for autonomous agents in household environments. While existing benchmarks evaluate whether LLM-generated plans execute successfully, they overlook a critical type of failure: latent failures. Unlike immediate failures that trigger instant feedback at execution time and enable timely correction, latent failures do not immediately halt plan execution but silently compromise goal achievement. In severe cases, they cause irreversible harm. To address this gap, we introduce SIMMER, a benchmark for evaluating latent failures in LLM planning through a human-curated symbolic world model grounded in the kitchen domain. SIMMER defines a world model comprising 77 actions, 262 unique objects, and approximately 46,800 possible interactions that are semantically realistic, derived from real-world cooking scripts. It then leverages a state machine executor that validates plans against the world model and detects immediate precondition violations, latent hazards, and irreversible failures. Experiments across six LLMs show that even frontier models achieve at most 17% error-free plans. Moreover, up to 56% of plans contain latent failures, the majority of which lead to irreversible consequences. We further demonstrate that explicit state reasoning via counterfactual foresight simulation can reduce latent failures by up to 72% and irreversible cases by up to 75%, suggesting a promising direction for more robust LLM planners.
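
The difference between an immediate failure and a latent failure can be sketched with a toy state-machine executor. Everything below (the four actions, the fact names, and the rule that a stove left on is a hazard) is a hypothetical miniature for illustration and is not taken from SIMMER's world model.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    preconditions: frozenset          # facts that must hold before the action
    adds: frozenset = frozenset()     # facts the action makes true
    removes: frozenset = frozenset()  # facts the action makes false

# Hypothetical toy kitchen actions.
ACTIONS = {
    "turn_on_stove": Action("turn_on_stove", frozenset(), frozenset({"stove_on"})),
    "place_pan": Action("place_pan", frozenset(), frozenset({"pan_on_stove"})),
    "fry_egg": Action("fry_egg", frozenset({"stove_on", "pan_on_stove"}),
                      frozenset({"egg_fried"})),
    "turn_off_stove": Action("turn_off_stove", frozenset({"stove_on"}),
                             removes=frozenset({"stove_on"})),
}

# Assumed rule: these facts are hazardous if still true when the plan ends.
HAZARDS_AT_END = {"stove_on"}

def execute(plan, goal):
    """Run `plan` (a list of action names) from an empty state and classify it."""
    state = set()
    for step, name in enumerate(plan):
        action = ACTIONS[name]
        missing = action.preconditions - state
        if missing:
            # Immediate failure: execution halts and the planner gets feedback.
            return {"status": "immediate_failure", "step": step,
                    "missing": sorted(missing)}
        state |= action.adds
        state -= action.removes
    latent = sorted(state & HAZARDS_AT_END)
    if latent:
        # Latent failure: every step executed, but the end state is unsafe.
        return {"status": "latent_failure", "hazards": latent}
    if not goal <= state:
        return {"status": "goal_not_reached"}
    return {"status": "ok"}
```

In this toy setting, a plan that fries the egg but never turns the stove off runs to completion without any precondition error, which is why execution success alone would not flag it.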

07.
arXiv (CS.LG) 2026-06-19

Optimal Ansatz-free Hamiltonian Learning In Situ

arXiv:2606.19486v1

Characterizing the features of a Hamiltonian that governs a quantum system serves as a fundamental subroutine of quantum device calibration, signal sensing, and error correction. Recently proposed protocols have achieved the optimal Heisenberg-limited scaling for learning ansatz-free Hamiltonians from their real-time evolution without fully specifying interaction structures. However, these protocols rely on both deep circuits with interleaving probes and control, and extremely short time resolution, making them difficult to implement on near- and intermediate-term in situ quantum experiments. In this work, we propose a computationally efficient, control-free, and ancilla-free algorithm that uses only Pauli product state preparation and measurement, and learns an ansatz-free Hamiltonian $H$ with $\|H\|\leq\Lambda$ in a total evolution time of $\Theta(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. The evolution time cost of our algorithm is optimal for any control-free protocol, as we further prove a lower bound of $\Omega(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. Technically, our method introduces a randomized-sampling framework that combines band-limited kernel-based time sampling with a displacement sieve for Hamiltonian structure learning. The characteristic probe time resolution depends only on $\Lambda$ instead of $\epsilon$, which makes our protocol especially appealing in the high-precision regime for sensing and calibration applications. We also show that the algorithm maintains the same asymptotic total evolution time in the presence of state-preparation-and-measurement (SPAM) noise when the Hamiltonian is local after calibration. Our results demonstrate the fundamental cost of experimentally friendly Hamiltonian learning and provide a practical route to rigorous in situ characterization of near-term quantum platforms.

08.
arXiv (quant-ph) 2026-06-16

Interaction-enabled topological pumping of Rydberg electrons

arXiv:2606.15126v1

Topological pumping is a paradigmatic realization of quantized transport in band systems, yet its fate in strongly correlated regimes, especially with long-range interactions, remains largely unexplored. Here we report the experimental observation of interaction-enabled topological pumping of correlated Rydberg electrons in a synthetic lattice. We show that dipolar exchange interactions induce a controllable shift of the underlying topological singularity in parameter space, such that a fixed pumping trajectory can be driven through successive topological transitions by tuning the interaction strength alone. This leads to the emergence and breakdown of quantized transport. The observations are consistent with an effective Rice-Mele description with interaction-renormalized onsite potentials and are supported by characterizing the adiabaticity and robustness to control trajectory imperfections. Our results establish a platform for exploring interaction-controlled topological transport beyond perturbative regimes and open a route toward engineering correlated topological matter in synthetic quantum systems.

09.
medRxiv (Medicine) 2026-06-24

Uncovering the fitness of endemically circulating Zika virus strains

Zika virus (ZIKV) is an arbovirus that usually causes few symptoms and has circulated endemically in Asia for decades. However, a large outbreak in South America in 2015 uncovered the serious risk of congenital Zika syndrome in infants born to ZIKV-infected mothers. It is unknown whether a lineage with a distinct pre-existing fitness advantage emerged from Asia to cause the South American outbreak, and whether there is ongoing evolution that can result in future globally fit strains. Here we used 107 sequences from a single setting (Thailand) collected over an 18-year period (2006-2023). We used novel analytical tools to identify distinct lineages that have circulated in the population and estimated their relative epidemiological fitness. We found there have been six lineages circulating sequentially in the country, with regular emergence and replacement of lineages showing higher fitness than their predecessors. We identified 15 lineage-defining amino acid changes, including four well-documented fitness-enhancing mutations, and two UTR substitutions. The lineage that emerged in South America was evolutionarily linked to the highest-fitness lineage in Thailand, carrying seven of our lineage-defining substitutions acquired during endemic circulation there, and subsequently accumulating four additional changes. After the global pandemic, endemic ZIKV in Thailand continued to evolve, with newly emerged lineages showing novel mutations and increased fitness. Our findings have key implications for the monitoring of ZIKV and can help identify the pathway to increased transmissibility of this globally important pathogen.

10.
arXiv (CS.CL) 2026-06-11

Can News Predict the Market? Limits of Zero-Shot Financial NLP and the Role of Explainable AI

Can financial news reliably predict short-term stock movements? Despite advances in large language models, this question remains unresolved. We revisit this problem using a zero-shot natural language processing framework, investigating whether models can extract actionable signals from financial news without domain-specific training. We design a structured pipeline that combines zero-shot natural language inference with temporal aggregation, explicitly modelling recency and event-dependent impact horizons when integrating information across articles. To address the need for transparency in high-stakes settings, we introduce a multi-layered explainability framework that links predictions to token-level, article-level, and aggregate evidence, and produces grounded natural language rationales. Across multiple models and prediction horizons, we find that zero-shot approaches consistently fail to outperform simple baselines, with particularly weak performance on negative movements, suggesting deeper structural limitations in mapping news sentiment to short-term price dynamics. However, explainability signals reliably distinguish between trustworthy and unreliable predictions, offering practical value even when accuracy is limited. These findings highlight the limits of zero-shot financial NLP and motivate a shift toward decision-support systems that prioritise transparency and uncertainty awareness. Code: https://github.com/alimert05/zero-shot-stock-xai
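
One simple way to picture temporal aggregation with recency and event-dependent horizons is an exponentially decayed average whose half-life depends on the event type. This is only a sketch under that assumption: the function name, the tuple layout, and the half-life values are hypothetical and are not taken from the paper or its repository.

```python
def aggregate_signal(articles, half_life_days):
    """Decay-weighted mean of per-article scores.

    articles: list of (age_days, event_type, score), with score in [-1, 1].
    half_life_days: dict mapping event_type -> half-life in days (assumed values).
    Returns 0.0 when there are no articles.
    """
    num = den = 0.0
    for age, event, score in articles:
        # An article loses half its weight every `half_life_days[event]` days.
        weight = 0.5 ** (age / half_life_days[event])
        num += weight * score
        den += weight
    return num / den if den else 0.0


# Usage: a fresh positive article outweighs a five-day-old negative one.
signal = aggregate_signal(
    [(0, "earnings", 1.0), (5, "earnings", -1.0)],
    {"earnings": 5},
)
```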

11.
arXiv (CS.CL) 2026-06-19

PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback

Effective Automated Essay Scoring (AES) systems are expected to support both reliable assessment and actionable instructional feedback. However, existing approaches often treat scoring and feedback as separate components: neural scoring models provide limited interpretability, while Large Language Model (LLM)-based feedback is typically insensitive to learners' proficiency levels. To address this fragmentation, this work proposes PsyScore, a psychometrically-aware framework that integrates diagnostic assessment with instructional scaffolding through a shared latent ability representation. PsyScore comprises three key modules: a Trait-Adaptive Neural IRT Scorer, which incorporates the Graded Partial Credit Model (GPCM) into a neural architecture, enabling the precise estimation of student ability while maintaining psychometric interpretability; a ZPD-Scaffolded Feedback Generator, which conditions multi-agent feedback strategies on the diagnosed ability parameter to adapt instructional focus across different proficiency levels; and a Multi-Perspective Feedback Evaluation Strategy, which assesses feedback quality via pairwise preference judgements and student revision simulations. Experiments on the ASAP++ dataset demonstrate that PsyScore achieves competitive scoring performance while providing more pedagogically aligned feedback.

12.
arXiv (CS.AI) 2026-06-18

A Variational Framework for LLM Generator-Regulator Games

arXiv:2606.18424v1

This paper develops a variational framework for regulated language generation. Starting from autoregressive token sampling, we derive the induced distribution over complete messages and relate it to an entropy-regularized Gibbs law. Regulation is modeled as an optimal discriminator whose convex-dual value is an f-divergence, and the generator-regulator interaction is formulated as a saddle-point problem. The framework applies to moderation, censorship, AI deception detection, compliance auditing, phishing defense, and manipulation control, where regulation concerns a distribution over possible messages rather than a single output. The equilibrium clarifies the tradeoff among utility, entropy, regulatory alignment, and finite-length detectability. Two finite-vocabulary case studies, censorship filtering and phishing defense, illustrate how the theory can be evaluated through utility, entropy, divergence, receiver-side scores, and detection probability.
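
For readers unfamiliar with the two standard ingredients named here, the textbook forms are as follows. The notation (a finite message set $\mathcal{M}$, utility $U$, temperature $\tau > 0$, test function $T$) is assumed for illustration and may differ from the paper's.

An entropy-regularized objective over distributions on $\mathcal{M}$ is maximized by a Gibbs law:

$$
p^\star = \arg\max_{p \in \Delta(\mathcal{M})} \Big\{ \sum_{m} p(m)\,U(m) + \tau H(p) \Big\},
\qquad
p^\star(m) = \frac{\exp\big(U(m)/\tau\big)}{\sum_{m'} \exp\big(U(m')/\tau\big)},
$$

with optimal value $\tau \log \sum_{m'} \exp\big(U(m')/\tau\big)$.

An f-divergence $D_f(P \,\|\, Q) = \mathbb{E}_Q\big[f(dP/dQ)\big]$ admits the convex-dual (variational) representation

$$
D_f(P \,\|\, Q) = \sup_{T} \Big\{ \mathbb{E}_{m \sim P}\big[T(m)\big] - \mathbb{E}_{m \sim Q}\big[f^{*}(T(m))\big] \Big\},
$$

where $f^{*}$ is the convex conjugate of $f$. Reading $T$ as a discriminator gives the sense in which an optimal regulator's value is an f-divergence.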

13.
arXiv (math.PR) 2026-06-19

Establishing an $\Omega(\sqrt{d})$ complexity lower bound for PDMP samplers and how to break it: a sub-$\sqrt{d}$ algorithm for Gaussian-tailed targets

arXiv:2606.19909v1

Despite the theoretical appeal of their non-reversibility, to date, no Piecewise Deterministic Markov Process (PDMP) samplers have been developed that scale better than $\mathcal{O}(\sqrt{d})$ in computational complexity with respect to the target dimension $d$. We prove that this is a fundamental limitation by establishing an $\Omega(\sqrt{d})$ lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then demonstrate how to bypass this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of $\mathcal{O}(d^\alpha)$, where $\alpha \in [0.2, 0.3]$ for Gaussian-tailed targets. In addition, this PDMP scheme is locally adaptive in both trajectory length and distance between velocity updates.

14.
arXiv (CS.CV) 2026-06-11

Learning Instance-Adaptive Low-Rank Orthogonal Subspaces for Clothes-Changing Person Re-Identification

Clothes-changing person re-identification (CC-ReID) aims to recognize individuals despite drastic appearance changes caused by clothing variation. While existing methods rely on adversarial learning to disentangle clothing features, we propose Ortho-ReID, which explicitly models a low-rank clothing subspace from VLM text descriptions and extracts clothing-invariant representations via direct geometric constraints. A critical component is our transformer-based Basis Maker, which refines a shared, low-dimensional clothing prior into an instance-adaptive low-rank subspace through cross-attention with image patches, enabling robust clothing feature extraction even under varying visibility conditions. This instance-adaptive subspace is supervised via alignment with clothing text embeddings, while identity features are extracted via a learnable projection head and geometrically constrained to be strictly orthogonal to it. Extensive experiments demonstrate state-of-the-art performance on PRCC (+5.9% top-1), Celeb-reID-light (+3.5%), and LaST (+5.3%), with competitive results on LTCC.
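
The geometric idea of keeping identity features orthogonal to a low-rank clothing subspace can be illustrated with a hard projection. Note the substitution: the abstract describes a learned projection head trained under an orthogonality constraint, whereas the sketch below simply projects a vector onto the orthogonal complement of a given basis. The function names and the toy vectors are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormalize(basis, eps=1e-12):
    """Gram-Schmidt: turn the rows of `basis` into an orthonormal set."""
    ortho = []
    for v in basis:
        w = list(v)
        for q in ortho:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # drop (near-)linearly dependent vectors
            ortho.append([wi / norm for wi in w])
    return ortho

def remove_subspace(feature, basis):
    """Project `feature` onto the orthogonal complement of span(basis)."""
    out = list(feature)
    for q in orthonormalize(basis):
        c = dot(out, q)
        out = [oi - c * qi for oi, qi in zip(out, q)]
    return out


# Usage: a rank-2 "clothing" subspace inside a 3-dimensional feature space.
clothing_basis = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
identity_part = remove_subspace([3.0, 4.0, 5.0], clothing_basis)
```

After the projection, the remaining vector has zero inner product with every basis vector, which is the property the orthogonality constraint is meant to encourage.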

15.
arXiv (CS.LG) 2026-06-12

Viral Proteins Reveal Geometry of Protein Language Models

arXiv:2606.12609v1

Protein language models are trained on highly imbalanced datasets, raising the question of how they represent underrepresented biological sequences. Using viral proteins as a case study across ESM model families, we identify a dominant nativeness axis in embedding space, aligned with masked reconstruction perplexity, that orders sequences from well-modeled cellular proteins through viral proteins to shuffled and random sequences. Scaling contracts this axis unevenly across viral families. Despite this, protein language model embeddings retain viral-specific signal: viral proteins remain linearly separable beyond zero-shot perplexity and shallow sequence features. Together, these results suggest that pLM representations are structured by a general notion of nativeness while preserving information specific to distinct biological groups.

16.
bioRxiv (Bioinfo) 2026-06-15

AliceDB database and pipeline for identification of natural protein variants based on mass spectrometry measurement data

The natural variation that distinguishes living organisms within a single species is currently being studied intensively, primarily at the genetic level. Unfortunately, studies of natural variants at the level of protein gene products are not very common, mainly due to the lack of appropriate databases and bioinformatics tools. The main research technique used to study proteomes/peptidomes is mass spectrometry (MS). A classic method for interpreting raw mass spectrometry data in proteomic/peptidomic studies involves the use of databases containing representative (canonical) sequences that define the proteome of the organism under study. In this paper, we present the AliceDB database, which contains information on over 7 million natural variants of protein sequences described in the scientific literature for Homo sapiens. The data contained in the AliceDB database can be utilized using widely available and commonly used software for interpreting proteomic data. Test results regarding the use of the AliceDB database for the interpretation of proteomic data indicate that accounting for the presence of natural variants increases both the number and quality of identified proteins. Furthermore, it is easy to identify protein sequence variants that may, for example, be of significance in medicine.

17.
arXiv (CS.AI) 2026-06-12

What Type of Inference is Active Inference?

arXiv:2606.04935v2

Active inference casts decision-making as inference, with the Expected Free Energy (EFE) unifying goal-directed and information-seeking behavior. Recent work showed that EFE minimization can be written as Variational Free Energy (VFE) minimization on a generative model augmented with epistemic priors. We prove that the VFE of the augmented model can be rewritten as the VFE of the predictive model plus explicit entropy-correction terms, making the EFE contribution transparent. We then show that proper EFE-based planning requires combining these epistemic corrections with a planning correction that turns marginal inference into policy optimization, yielding a full variational characterization of EFE-based planning. This clarifies which corrections are needed for cross-entropy planning and for full EFE-based planning. The same entropy-corrected formulation leads to a detailed message-passing scheme for EFE-based planning together with simpler ablations. Experiments on three grid-world environments show that full EFE-based planning outperforms ablations that omit either the planning correction or the epistemic corrections.

18.
arXiv (CS.CL) 2026-06-11

Kuramoto Attention: Synchronizing Self-Attention on the Torus

We introduce Kuramoto attention, a self-attention layer in which each hidden coordinate is an angle. The layer scores tokens by gated cosine similarity, attends over previous phase states, and updates each token by the tangent component of the attention-weighted circular mean. Because the values are the raw phase states, this update is exactly the Kuramoto coupling term $\sum_u A_{t,u}\sin(\theta_u-\theta_t)$, with the attention matrix acting as an adaptive, content-dependent coupling kernel. Equivalently, the gated score is a learned metric on the torus that selects which tokens couple, and the update pulls each token toward the circular mean of the tokens it selects, tightening their phase agreement. The same two ingredients, an invariant similarity score and an on-manifold mean, define such a layer on any compact group; the torus is the abelian case, where both are closed-form. The softmax weights solve an entropy-regularized phase-retrieval problem, and rotary position enters as a position-dependent phase drift in the score. On enwiki8 character-level language modeling, the layer trains as a functional language model whose bits-per-character stays close to a strong matched RoPE+SwiGLU transformer: within $0.02$ BPC at one million parameters ($1.637\pm0.010$ versus $1.616\pm0.004$) and level on the median at five million ($1.448$ versus $1.452$ over five seeds) with the transformer ahead on the mean ($1.468$ versus $1.456$). These experiments establish that the constrained geometric structure is a viable language model at this scale; the structure itself, and its synchronization reading, is the contribution. Ablations isolate the load-bearing components, and the result gives a compact bridge between self-attention and phase synchronization.
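
The coupling term $\sum_u A_{t,u}\sin(\theta_u-\theta_t)$ can be sketched in a few lines of plain Python. The simplifications here are assumptions made for illustration: one phase coordinate per token, attention logits passed in directly instead of computed by the gated cosine score, causal attention that includes the token itself, and a hypothetical `step` size.

```python
import math

def kuramoto_update(theta, scores, step=1.0):
    """One Kuramoto-style attention update on a single phase coordinate.

    theta: list of T phase angles, one per token.
    scores: T x T list of attention logits; only entries with u <= t are used.
    Returns theta_t + step * sum_u A[t][u] * sin(theta_u - theta_t), wrapped
    to [0, 2*pi), where A[t] is the softmax of the causal logits of row t.
    """
    new_theta = []
    for t in range(len(theta)):
        logits = scores[t][: t + 1]
        m = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in logits]
        z = sum(weights)
        coupling = sum(
            (w / z) * math.sin(theta[u] - theta[t]) for u, w in enumerate(weights)
        )
        new_theta.append((theta[t] + step * coupling) % (2 * math.pi))
    return new_theta


# Usage: with uniform attention, the second token is pulled toward the first.
phases = kuramoto_update([0.0, 1.0], [[0.0, 0.0], [0.0, 0.0]])
```

The self term contributes $\sin(0) = 0$ to the sum, so including the token itself only changes the softmax normalization, not the direction of the pull.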

19.
arXiv (CS.AI) 2026-06-24

LLM-MINE: Large Language Model based Alzheimer's Disease and Related Dementias Phenotypes Mining from Clinical Notes

arXiv:2603.13673v2

Accurate extraction of Alzheimer's Disease and Related Dementias (ADRD) phenotypes from electronic health records (EHR) is critical for early-stage detection and disease staging. However, this information is usually embedded in unstructured textual data rather than tabular data, making it difficult to extract accurately. We therefore propose LLM-MINE, a Large Language Model-based phenotype mining framework for automatic extraction of ADRD phenotypes from clinical notes. Using two expert-defined phenotype lists, we evaluate the extracted phenotypes by examining their statistical significance across cohorts and their utility for unsupervised disease staging. Chi-square analyses confirm statistically significant phenotype differences across cohorts, with memory impairment being the strongest discriminator. Few-shot prompting with the combined phenotype lists achieves the best clustering performance (ARI=0.290, NMI=0.232), substantially outperforming biomedical NER and dictionary-based baselines. Our results demonstrate that LLM-based phenotype extraction is a promising tool for discovering clinically meaningful ADRD signals from unstructured notes.

20.
arXiv (quant-ph) 2026-06-24

High-harmonic generation driven by temporal-mode quantum states of light

arXiv:2512.06602v2

We develop a theoretical framework for high-harmonic generation (HHG) driven by quantum states of light based on a temporal-mode expansion of the electromagnetic field. This approach extends previous single plane-wave mode treatments to realistic pulse configurations and arbitrary multi-mode states of light, resolving conceptual inconsistencies arising from non-normalizable infinite plane waves and establishing consistency between analytical and numerical methods. We derive a correction factor that quantifies deviations from the diagonal approximation (in which the yield becomes a statistical average over classical-field simulations) both for the response of a single atom and in the many-atom regime. Our results confirm that the HHG spectrum for atoms driven by any quantum state of light in free space is accurately described by averaging semi-classical calculations over the Husimi distribution, with no observable genuine quantum effects in the spectrum. We also demonstrate that in the many-atom regime, the mean-field coherent-state approximation underlying this treatment does not preserve probabilities, although unitarity is restored in the diagonal approximation. The absence of genuine quantum effects in the HHG yield is attributed to the large photon numbers ($\sim 10^{11}$) required to reach HHG intensities in free space, which render quantum fluctuations negligible. We discuss nanophotonic environments with ultrasmall mode volumes as potential platforms where few-photon strong-field processes could exhibit genuine quantum signatures.

21.
arXiv (CS.CV) 2026-06-17

TerraTransfer: Learning End-to-End Driving Policies Without Expert Demonstrations

End-to-end autonomous driving has achieved state-of-the-art performance on benchmarks and real-world deployments. Its standard training recipe, however, is expensive across all stages: collecting and labeling millions of driving frames is costly, and closed-loop RL on images is bottlenecked by the per-step cost of photorealistic rendering plus a forward pass through a large vision backbone. Self-play in vectorized simulators changes the economics: millions of rollout steps per second, and a state distribution naturally rich in collisions, near-misses, and recoveries that no driving log contains. Our approach exploits this asymmetry by decoupling learning to drive from learning to see. We pretrain a single policy by self-play, then align its latent space with a pretrained vision backbone, through the action KL divergence and a batch-relational low-rank structural loss. The action target comes from the self-play policy, so alignment never supervises against a logged trajectory: a paired dataset of (image, scene-state) frames suffices, with no need for the curated expert demonstrations that imitation pretraining is built on. On photorealistic 3D Gaussian splatting closed-loop scenarios, the resulting end-to-end policy matches or exceeds prior end-to-end methods.

22.
arXiv (CS.CV) 2026-06-15

FBSDiff++: Improved Frequency Band Substitution of Diffusion Features for Efficient and Highly Controllable Text-Driven Image-to-Image Translation

With large-scale text-to-image (T2I) diffusion models achieving significant advancements in open-domain image creation, increasing attention has been focused on their natural extension to the realm of text-driven image-to-image (I2I) translation, where a source image acts as visual guidance to the generated image in addition to the textual guidance provided by the text prompt. We propose FBSDiff, a novel framework adapting an off-the-shelf T2I diffusion model into the I2I paradigm from a fresh frequency-domain perspective. Through dynamic frequency band substitution of diffusion features, FBSDiff realizes versatile and highly controllable text-driven I2I in a plug-and-play manner (without need for model training, fine-tuning, or online optimization), allowing appearance-guided, layout-guided, and contour-guided I2I translation by progressively substituting the low-frequency band, mid-frequency band, and high-frequency band of latent diffusion features, respectively. In addition, FBSDiff flexibly enables continuous control over I2I correlation intensity simply by tuning the bandwidth of the substituted frequency band. To further promote image translation efficiency, flexibility, and functionality, we propose FBSDiff++, which improves upon FBSDiff mainly in three aspects: (1) accelerating inference by a large margin (8.9$\times$ speedup) with a refined model architecture; (2) improving the Frequency Band Substitution module to allow for input source images of arbitrary resolution and aspect ratio; (3) extending model functionality to enable localized image manipulation and style-specific content creation with only subtle adjustments to the core method. Extensive qualitative and quantitative experiments verify the superiority of FBSDiff++ in I2I translation visual quality, efficiency, versatility, and controllability compared to related advanced approaches.

23.
arXiv (CS.LG) 2026-06-17

Differential Privacy of Gaussian Process Posterior Sampling

arXiv:2606.17995v1. We study the privacy of releasing posterior sample paths from a Gaussian process (GP) when the entire training set, including covariates and responses, is private. Unlike standard differential-privacy (DP) mechanisms that add external noise, posterior sampling is random by construction. We show that this intrinsic randomness yields DP guarantees by deriving explicit Rényi-DP bounds for GP posterior sample-path release. The bounds separate posterior-mean leakage from data-dependent posterior-covariance leakage, showing that meaningful privacy depends sharply on effective ridge regularisation. We apply membership-inference attacks to show that empirical leakage follows the predicted dependence on regularisation, posterior variance, and the number of released posterior sample paths. Utility experiments on downstream posterior-sampling tasks identify noisy-observation regimes where privacy-compatible regularisation preserves useful decisions with modest utility loss. When stronger privacy is needed, the intrinsic guarantee can be sharpened by adding calibrated GP noise, providing an explicit additional privacy knob.
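
The dependence of leakage on ridge regularisation can be illustrated with a toy case. The sketch below is not the paper's bound: it assumes neighbouring datasets that differ in a single response only (so the posterior covariance is unchanged) and uses the standard closed form for the Rényi divergence between equal-covariance Gaussians, D_α = (α/2) Δμᵀ Σ⁻¹ Δμ. The kernel choice, the data, and all function names are hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior(x, y, x_star, ridge):
    """GP posterior mean and covariance at x_star with ridge (noise) term `ridge`."""
    K = rbf(x, x) + ridge * np.eye(len(x))
    Ks = rbf(x_star, x)
    mean = Ks @ np.linalg.solve(K, y)
    cov = rbf(x_star, x_star) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, cov

def renyi_equal_cov(mu0, mu1, cov, alpha):
    """Renyi divergence of order alpha between N(mu0, cov) and N(mu1, cov)."""
    d = mu0 - mu1
    return float(0.5 * alpha * (d @ np.linalg.solve(cov, d)))

def leakage(ridge, alpha=2.0):
    """Divergence between posteriors on two datasets differing in one response."""
    x = np.array([0.0, 1.0])
    x_star = np.array([0.5])
    y = np.array([0.0, 0.0])
    y_neighbor = np.array([1.0, 0.0])  # assumed neighbour: one response changed
    mu0, cov = posterior(x, y, x_star, ridge)
    mu1, _ = posterior(x, y_neighbor, x_star, ridge)
    return renyi_equal_cov(mu0, mu1, cov, alpha)

for ridge in (0.01, 0.1, 1.0):
    print(f"ridge={ridge}: divergence={leakage(ridge):.4f}")
```

In this toy setting the divergence shrinks as the ridge term grows, because the posterior mean becomes less sensitive to any single response while the posterior variance increases. Changes to covariates, which the abstract also treats as private, alter the covariance and are not covered by this sketch.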

24.
arXiv (CS.CL) 2026-06-12

LAUKIN: A Multi-jurisdictional Common Law Contract Dataset

Multinational companies increasingly require cross-jurisdictional contract review, yet existing legal NLP datasets are largely restricted to a single jurisdiction. We introduce LAUKIN (Legal equivalence dataset of Australia, UK, and INdia), a dataset of clause pairs (AU-UK, UK-IN, IN-AU) labelled for boolean legal equivalence. We develop a novel multi-stage retrieval and reranking pipeline to construct the initial clause pair mapping, with a subset of clause pairs subsequently annotated by legal experts as Equivalent or Not Equivalent. The dataset comprises 14,727 clause pairs from 204 contracts across 8 agreement types, of which 3,000 are manually labelled: 900 train, 600 dev, and 1,500 test. We evaluate 12 models across 4 techniques, achieving a best macro-F1 of 65.11%, establishing LAUKIN as a challenging benchmark. Results reveal that, despite shared legal heritage, drafting conventions diverge significantly across jurisdictions, making cross-jurisdictional equivalence classification non-trivial. LAUKIN also includes 11,727 unlabelled training pairs to support future semi-supervised learning research in legal NLP.
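
For readers unfamiliar with the reported metric, macro-F1 averages the per-class F1 scores so that the Equivalent and Not Equivalent classes count equally regardless of their frequency. The sketch below is a minimal pure-Python version on made-up labels; the function names and the toy predictions are illustrative and are not drawn from the dataset. The two leading assertions only check that the split sizes quoted in the abstract are mutually consistent.

```python
# Split sizes quoted in the abstract: labelled pairs and the unlabelled remainder.
assert 900 + 600 + 1500 == 3000
assert 14727 - 3000 == 11727

def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, classes=("Equivalent", "Not Equivalent")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Toy labels (hypothetical, for illustration only).
E, N = "Equivalent", "Not Equivalent"
y_true = [E, E, N, N, N, N]
y_pred = [E, N, N, N, N, E]
print(macro_f1(y_true, y_pred))
```

On these toy labels the per-class scores are 0.5 (Equivalent) and 0.75 (Not Equivalent), giving a macro-F1 of 0.625, slightly below the accuracy of 4/6 because the minority class is weighted equally.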

25.
arXiv (quant-ph) 2026-06-12

Geometric Algebra Quantum Gate Decomposition

arXiv:2606.12480v1. Quantum gates are usually described through matrix and tensor-product formalisms that often obscure their geometric structure. In this work, we formulate the Pauli and Clifford groups within the complex Geometric Algebra (GA) framework. We show that the Pauli group is naturally identified with the group of blades up to a global phase, thereby providing a geometric interpretation of Pauli operators and their commutation relations in terms of oriented subspaces. We further prove that Clifford operators are generated by products of π/4-Pauli rotors and introduce a greedy Pauli rotor decomposition algorithm whose empirical behavior suggests unexpectedly compact decompositions for Clifford operators. Finally, we show that Clifford+T universality admits a natural geometric interpretation through π/8-rotors within this framework.
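
The rotor claims can be checked numerically in the ordinary matrix picture. The sketch below does not use Geometric Algebra and is not the paper's decomposition algorithm; it assumes the convention R = exp(-iθP) for a "θ-Pauli rotor" (the abstract does not fix the angle convention), and the helper names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_rotor(pauli, angle):
    """exp(-i * angle * P) for a Pauli matrix P, using P @ P = I."""
    identity = np.eye(len(pauli), dtype=complex)
    return np.cos(angle) * identity - 1j * np.sin(angle) * pauli

def conjugate(rotor, op):
    """Action of a unitary on an operator: R op R^dagger."""
    return rotor @ op @ rotor.conj().T

# A pi/4 rotor about Z maps Paulis to Paulis (up to sign), i.e. it is Clifford.
R = pauli_rotor(Z, np.pi / 4)
print(np.allclose(conjugate(R, X), Y))
print(np.allclose(conjugate(R, Y), -X))
print(np.allclose(conjugate(R, Z), Z))

# A pi/8 rotor about Z equals the T gate up to the global phase exp(-i*pi/8).
T = np.diag([1, np.exp(1j * np.pi / 4)])
print(np.allclose(np.exp(-1j * np.pi / 8) * T, pauli_rotor(Z, np.pi / 8)))
```

Under the assumed convention, the π/4 rotor permutes the Pauli matrices up to sign, while the π/8 rotor reproduces the non-Clifford T gate, which is the ingredient that makes Clifford+T universal.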