Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.CV) 2026-06-15

C-MambaPose: A Physics-Informed Complex Mamba Framework for Cross-Environment WiFi Human Pose Estimation

Human pose estimation (HPE) using WiFi signals has emerged as a promising technology owing to its device-free nature, privacy preservation, and robustness against occlusion and poor lighting. However, existing methods often overlook the physical complex phase information of WiFi signals and fail to generalize across diverse environments due to severe domain shifts. In this paper, we present C-MambaPose, a physics-informed complex-valued Mamba-GraFormer hybrid framework for robust cross-environment WiFi-based 3D HPE. Our framework first sanitizes raw WiFi Channel State Information (CSI) phase errors and constructs a phase-preserving complex-valued representation. We then employ a Spatiotemporal Complex Mamba encoder with a dynamic selective receptive field to capture fine-grained phase dynamics. A cross-attention joint-query mapper maps the unstructured sequence tokens to human joints, which are decoded by a Graph Convolutional Network (GCN) to predict anatomically coherent 3D coordinates. Extensive evaluations on the MM-Fi dataset show that C-MambaPose achieves competitive or superior performance to state-of-the-art baselines across all settings and sets a new state of the art on the challenging cross-environment split. It requires only 3.78 M parameters, an 83.1% reduction compared to GraphPose-Fi [chen2026graph] and an 85.7% reduction compared to MetaFi++ [zhou2023metafi++]. Its size is comparable to that of DT-Pose [chen2025towards] (which is only 18% smaller), yet it achieves significantly superior performance without requiring any pretraining. Our code is publicly available at https://github.com/phucngvinuni/cmampose.git.

02.
PLOS Medicine 2026-06-01

The NIH 2025 Public Access Policy: Immediate access, unequal costs

by Caitlin R. Ryus, Caroline Raymond King, Edward R. Melnick

The NIH 2025 Public Access Policy eliminates embargo periods for federally funded research, expanding who can read science. Yet without addressing article processing charges and market concentration, the policy risks creating new barriers to who can afford to perform and publish their science. In this Perspective, Caitlin Ryus and colleagues discuss the NIH 2025 Public Access Policy, highlighting that while expanding who can read science, the policy risks creating new barriers to who can afford to perform and publish their science.

03.
arXiv (CS.AI) 2026-06-19

TelcoAgent: A Scalable 5G Multi-KPM Forecasting With 3GPP-Grounded Explainability

arXiv:2606.19821v1

Key Performance Measurement (KPM) forecasting is essential for proactive network management of 5G and next-generation telecom networks. However, existing machine learning (ML) approaches face significant limitations in scalability and explainability, restricting their effectiveness in real-world deployments. We propose TelcoAgent, a foundation model-based framework that enables accurate, scalable, and explainable forecasting of multiple KPMs across diverse network cells without the need for site-specific training. Specifically, the framework comprises three key components: (i) an automated three-agent pipeline that constructs a 3rd Generation Partnership Project (3GPP) knowledge graph directly from specification documents, (ii) a scalable, time-series foundation model (TSFM)-based prediction pipeline to deliver accurate, zero-shot forecasting, and finally (iii) a reasoning and explanation pipeline that provides actionable, domain-grounded diagnostics. Evaluated using a 3-month, real-world, city-scale 5G KPM dataset from a U.S.-based network operator, TelcoAgent demonstrates high forecasting accuracy for all 7 considered KPMs per cell across 200 cells, while delivering explainable insights and actionable instructions to address network degradations.

04.
arXiv (CS.CL) 2026-06-18

Trust Region On-Policy Distillation

On-Policy Distillation (OPD) is a fundamental technique for efficient post-training of large language models (LLMs), with broad applications in agent learning, multi-task enhancement, and model compression. However, OPD training becomes unstable when the teacher and student distributions differ substantially, as teacher supervision on student-generated tokens may yield unreliable policy gradients and even cause optimization failure. This work addresses reliable on-policy token-level supervision through credit assignment strategies, and proposes Trust Region On-Policy Distillation, TrOPD. It features the following characteristics: 1) Trust-Region On-Policy Learning: TrOPD performs OPD only in regions where the teacher provides reliable supervision, mitigating the optimization difficulty of the K1 reverse-KL estimator under distribution mismatch. 2) Outlier Estimation: For outlier regions, we explore gradient clipping, masking, and forward-KL estimation to reduce the adverse effects of unreliable supervision. 3) Off-Policy Guidance: The student continues generation from teacher prefixes and uses forward KL to imitate off-policy guidance, encouraging on-policy exploration toward reliable regions. Experiments show that TrOPD consistently outperforms SoTA OPD baselines, including OPD, EOPD, and REOPOLD, across mathematical reasoning, code generation, and general-domain benchmarks.
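
As a rough illustration of the first characteristic, the K1 estimator of the reverse KL divergence averages log p_student(x) - log p_teacher(x) over tokens x sampled from the student. The abstract does not say how reliable regions are identified, so the threshold rule (`max_abs_log_ratio`) and the function name below are hypothetical, chosen only to show what restricting the estimate to a trust region could look like.

```python
def k1_reverse_kl(student_logps, teacher_logps, max_abs_log_ratio=None):
    """K1 estimate of KL(student || teacher) from student-sampled tokens.

    student_logps[i] and teacher_logps[i] are the log-probabilities that the
    student and the teacher assign to the i-th token the student generated.
    If max_abs_log_ratio is given, tokens whose log-ratio falls outside
    [-max_abs_log_ratio, +max_abs_log_ratio] are masked out. This masking
    rule is an illustrative assumption, not the criterion used by TrOPD.
    """
    kept = []
    for s, t in zip(student_logps, teacher_logps):
        log_ratio = s - t  # per-token K1 term
        if max_abs_log_ratio is None or abs(log_ratio) <= max_abs_log_ratio:
            kept.append(log_ratio)
    # Average over the retained tokens; an empty selection contributes nothing.
    return sum(kept) / len(kept) if kept else 0.0


# Toy example: the third token is one where teacher and student disagree badly.
student = [-1.0, -2.0, -0.5]
teacher = [-1.5, -2.0, -6.5]
unmasked = k1_reverse_kl(student, teacher)
masked = k1_reverse_kl(student, teacher, max_abs_log_ratio=2.0)
```

In the toy example a single badly mismatched token dominates the unmasked estimate, which is the kind of instability the abstract attributes to distribution mismatch.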

05.
arXiv (math.PR) 2026-06-17

Analysis of the asymmetric shelf shuffle

arXiv:2606.18047v1

In an asymmetric shelf shuffle, a deck of $n$ cards is dealt sequentially from the bottom, and each card is assigned to one of the $m$ shelves uniformly at random. The card is placed at the top of the assigned shelf with probability $p$, and at the bottom of the assigned shelf with probability $1-p$. Analysis of the shelf shuffle has gained much attention recently, and the case $p=1/2$ was first treated by Diaconis–Fulman–Holmes [Ann. Appl. Prob. 23 (2013), no. 4, 1692–1720]. In this paper, we extend the analysis of the shelf shuffle to general $p\in (0, 1)$. In particular, we study the distribution of cycles, cycle lengths, number of descents, number of valleys, number of inversions, and the RSK shape of a permutation obtained from an asymmetric shelf shuffle. Our results extend the analysis of Diaconis–Fulman–Holmes to arbitrary $p$. Furthermore, our analysis of the distribution of descents and inversions is new even for $p=1/2$.
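
The shuffle described in this abstract is easy to simulate. The sketch below is a minimal illustration, not the authors' code. The function name, the card labelling, and the final step (stacking the shelves in order, first shelf on top) are assumptions, since the abstract does not say how the shelves are reassembled into a deck.

```python
import random
from collections import deque


def asymmetric_shelf_shuffle(n, m, p, rng=None):
    """Return one arrangement of cards 0..n-1 (listed top to bottom)."""
    if rng is None:
        rng = random.Random()
    shelves = [deque() for _ in range(m)]
    # Cards are labelled 0..n-1 from top to bottom, so dealing from the
    # bottom visits n-1, n-2, ..., 0.
    for card in range(n - 1, -1, -1):
        shelf = shelves[rng.randrange(m)]  # shelf chosen uniformly at random
        if rng.random() < p:
            shelf.appendleft(card)  # placed on top of the shelf
        else:
            shelf.append(card)      # placed at the bottom of the shelf
    # Assumption: shelves are stacked in order, first shelf on top.
    return [card for shelf in shelves for card in shelf]


deck = asymmetric_shelf_shuffle(n=52, m=10, p=0.3, rng=random.Random(0))
```

With a single shelf, $p=1$ returns the deck in its original order and $p=0$ reverses it, which gives a quick sanity check of the top/bottom convention.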

06.
arXiv (CS.CV) 2026-06-19

Contour-Constrained Deformable Registration with Parameter Characterization for Head and Neck Surgical Guidance

With 890,000 annual new cases globally, head and neck squamous cell carcinoma has one of the highest recurrence rates among solid malignancies. Although frozen section analysis is the standard of care for intraoperative margin assessment, accurately relocating detected positive margins on the resection bed remains challenging due to imprecise alignment between resected specimens and their resection bed, compounded by post-resection mucosal tissue shrinkage. We present a biomechanics-driven deformable registration framework that corrects post-resection tissue deformation to provide intraoperative guidance. Our approach registers 3D specimen meshes to intraoperative resection bed point clouds using regularized Kelvinlet basis functions. The registration matches surface point clouds, fiducial landmarks, and boundary contour constraints that directly penalize perpendicular distance-to-agreement between specimen and resection bed boundaries. Across nine specimens from skin, buccal mucosa, and tongue sites, the overall mean target registration error was $11.11 \pm 4.07$ mm using rigid registration, which decreased to $8.20 \pm 2.68$ mm (26.19% reduction) using deformable registration without contour constraint. The proposed contour-constrained deformable registration further reduced the error to $5.62 \pm 2.28$ mm, a 49.41% reduction relative to rigid registration. We observed the largest reduction in the most clinically challenging tongue specimens. We also performed a systematic two-stage parameter search to characterize the relative importance of surface alignment, fiducial correspondences, contour constraint, and strain energy regularization. This search revealed that contour weighting dominates registration accuracy for tissue types with large lateral deformation, while the algorithm operates over a broad range of parameter combinations.
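
The two reduction percentages quoted in this abstract follow directly from the reported mean errors. A quick check, with illustrative variable names:

```python
def percent_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Mean target registration errors in mm, as reported in the abstract.
rigid, deformable, contour_constrained = 11.11, 8.20, 5.62

print(round(percent_reduction(rigid, deformable), 2))           # 26.19
print(round(percent_reduction(rigid, contour_constrained), 2))  # 49.41
```

Both figures are relative to the rigid-registration baseline, not to each other.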

07.
arXiv (quant-ph) 2026-06-16

High-fidelity two-qubit gates in a 7-qubit register for quantum networks

arXiv:2606.14847v1

Quantum networks based on optically active solid-state spins may enable quantum technologies including long-range quantum communication and distributed quantum computing. Network nodes containing multiple high-fidelity qubits can facilitate large-scale fault-tolerant operation. However, the stringent error thresholds remain out of reach for multi-qubit registers. In this work, we demonstrate high-fidelity two-qubit gates in a 7-qubit register, based on nuclear spins coupled to a nitrogen-vacancy (NV) center in diamond. We analyze crosstalk in highly connected spin systems, develop an efficient optimization procedure, and characterize the gates using gate set tomography. The two-qubit gate fidelities (best: 99.61(5)%, average: 99.18(2)%) demonstrate a multi-qubit register at the threshold for distributed quantum computation. Finally, as an example application, we perform a variational quantum eigensolver (VQE) simulation of the ground-state energy of H2 and LiH molecules. These results demonstrate one of the key prerequisites for scalable quantum networks based on solid-state spins.

08.
arXiv (quant-ph) 2026-06-15

Reaffirming a Challenge to Bohmian Mechanics

arXiv:2509.06584v4

In our recent work, we reported the first measurement of the speed of tunnelling particles using a coupled waveguide system. The measured speed is operationally defined through a comparison of two orthogonal motions in a coupled waveguide system, is compatible with the standard definition of dwell time and with the Büttiker-Landauer tunnelling time, and does not presuppose a trajectory picture. Here we respond to objections raised in comments, referee reports, preprints, and articles. We distinguish two questions that are often conflated: whether Bohmian mechanics reproduces the measured density, and whether the standard guiding equation assigns the correct state of motion to the particles. The first point follows under the usual quantum equilibrium assumptions. The second is a separate physical assumption, since the standard guiding equation does not follow from the Schrödinger equation alone. We argue that, in the evanescent regime, the state of motion assigned by the standard guiding equation is in disagreement with the measured speed. To make the distinction explicit, we also present a bidirectional Bohmian model that reproduces the same stationary density while assigning finite speeds compatible with the speed inferred in the evanescent regime.

09.
arXiv (CS.AI) 2026-06-19

ZeSTA: Zero-Shot TTS Augmentation with Domain-Conditioned Training for Data-Efficient Personalized Speech Synthesis

arXiv:2603.04219v2

We investigate the use of zero-shot text-to-speech (ZS-TTS) as a data augmentation source for low-resource personalized speech synthesis. While synthetic augmentation can provide linguistically rich and phonetically diverse speech, naively mixing large amounts of synthetic speech with limited real recordings often leads to speaker similarity degradation during fine-tuning. To address this issue, we propose ZeSTA, a simple domain-conditioned training framework that distinguishes real and synthetic speech via a lightweight domain embedding, combined with real-data oversampling to stabilize adaptation under extremely limited target data, without modifying the base architecture. Experiments on LibriTTS and an in-house dataset with two ZS-TTS sources demonstrate that our approach improves speaker similarity over naive synthetic augmentation while preserving intelligibility and perceptual quality. Audio samples are available on our web page.

10.
arXiv (CS.AI) 2026-06-11

Rule Taxonomy and Evolution in AI IDEs: A Mining and Survey Study

arXiv:2606.12231v1

The adoption of AI-powered Integrated Development Environments (AI IDEs) has introduced "Rules" as a novel software artifact, allowing developers to persistently inject project-specific constraints and architectural guidelines into the context of Large Language Models (LLMs). Despite their role in aligning AI behavior with developer intent, the taxonomy, evolution, and practical impact of these rules remain largely unexplored. To bridge this gap, we conducted a mixed-methods empirical study on AI IDE rules. By mining 83 open-source projects and extracting 7,310 rules, we established a comprehensive taxonomy comprising 5 primary and 25 secondary categories. We then triangulated these artifacts with survey responses from 99 practitioners. Our analysis identified a contrast between developer priorities and actual configurations: while practitioners rate architectural constraints as highly important, rule files in repositories primarily consist of low-level workflow and code formatting constraints. Furthermore, our analysis of 1,540 rule evolution events revealed that rules are updated frequently. Repository data further indicate that rule evolution is primarily driven by constructive context expansions (29.17%) and enrichments (26.59%). In contrast, surveyed developers reported modifying rules primarily to correct AI errors (77.78%), typically by adding new negative constraints rather than editing existing ones. Finally, an artifact compliance assessment of 160 rule evolution events revealed that updating rules significantly improves the adherence of software artifacts, with the average artifact compliance rate increasing by 22.99 percentage points (from 49.14% to 72.13%) following an update. Our study provides empirical insights that can help developers optimize prompting strategies and guide tool builders in designing automated conflict-detection and context-management mechanisms for AI IDEs.

11.
arXiv (CS.CL) 2026-06-16

BALTO: Balanced Token-Level Policy Optimization for Hallucination Mitigation

Hallucinations remain a major obstacle to deploying large language models (LLMs) in knowledge-intensive settings, where generated responses must be faithfully grounded in provided evidence. Reinforcement learning (RL) is a promising direction for hallucination mitigation, but response-level faithfulness rewards suffer from a granularity mismatch: localized hallucinations can cause supported content to receive spurious penalties. Although recent work introduces fine-grained feedback such as claim-level verification and token-level rewards, unbalanced credit assignment can still induce length, verbosity, or optimization-noise biases. We propose BALTO, a Balanced Token-level Policy Optimization framework for hallucination mitigation. BALTO extracts checkable factual claims, verifies them against the reference context, and projects claim-level judgments to token-level labels. A balanced token-level credit assignment mechanism is introduced into the framework. This design redistributes probability mass from unsupported content toward faithful content, rather than suppressing the entire response. We systematically analyze the limitations of response-level rewards from a theoretical standpoint, and prove BALTO's advantages in training stability and optimization efficiency for hallucination mitigation. Experiments on ConFiQA, RAGTruth, and FinLLM-Eval show that BALTO achieves the highest faithfulness across all six model–benchmark settings and consistently outperforms existing post-training baselines in Q-Score, demonstrating a stronger faithfulness–informativeness trade-off.

12.
arXiv (CS.LG) 2026-06-16

Imbalanced Classification under Capacity Constraints

arXiv:2605.03289v2

Detecting observations from a minority class under severe class imbalance is a central challenge in applications such as fraud detection, medical screening, and industrial quality control. In these settings, each positive prediction triggers a costly follow-up action (an MRI scan, a transaction audit) whose execution is subject to real operational constraints. This paper proposes a formal classification framework under capacity constraints: given a user-defined bound $b$ on the proportion of observations that can be labeled as belonging to the minority class, the goal is to find the classifier that maximizes sensitivity on that class. We characterize the optimal classifier under this constraint and establish its equivalence with the classical Bayes classifier under a reweighting of the prior probabilities. We also introduce a capacity-adjusted performance metric $M$ that accounts for the effective detection rate when the capacity constraint is binding. The framework is implemented on top of standard learning methods (k-NN, SVM, random forests, and neural networks), and statistical consistency is established for each. We further show that these methods reduce to post-hoc thresholding when no hyperparameters are oriented toward the capacity-constrained objective, and introduce a capacity-aware support vector machine that exploits the constraint during training and achieves the strongest empirical performance. Experiments on the Taiwanese credit card default dataset confirm that capacity-constrained classifiers substantially outperform both classical approaches and SMOTE under high imbalance regimes. The framework extends naturally to multiclass settings and online environments.
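
The post-hoc thresholding mentioned in this abstract can be pictured as flagging only the highest-scoring observations until the capacity $b$ is used up. The sketch below is an illustration under that reading, with hypothetical function names. It is not the paper's capacity-aware SVM, and it assumes the scores come from some already-trained model.

```python
import math


def capacity_constrained_predict(scores, b):
    """Flag at most a fraction b of observations, highest scores first."""
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must lie in [0, 1]")
    # Rounding down keeps the number of positives within the capacity.
    k = math.floor(b * len(scores))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:k])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def sensitivity(y_true, y_pred):
    """Share of true minority-class observations that were flagged."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


# Toy data: 10 observations, 2 true positives, capacity for 2 follow-ups.
scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.4, 0.6, 0.15]
y_true = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
y_pred = capacity_constrained_predict(scores, b=0.2)
```

In the toy data only one of the two true positives ranks among the top two scores, so sensitivity under the constraint is 0.5 even though a lower threshold would catch both.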

13.
arXiv (quant-ph) 2026-06-16

Controlled Quantum Metrology with Anisotropic Heisenberg Spin Interactions under Intrinsic Decoherence

arXiv:2606.16918v1

We theoretically investigate quantum parameter estimation in a two-qubit anisotropic Heisenberg spin system with Dzyaloshinskii-Moriya (DM) interaction in the presence of intrinsic decoherence described by the Milburn model. Using the Quantum Fisher Information (QFI), we study the estimation of both the uniform magnetic field and the DM interaction strength. Analytical expressions for the time-evolved density matrix are obtained and used to explore the effects of exchange anisotropy, intrinsic decoherence, and probe-state preparation on the achievable estimation precision. Our results show that suitable tuning of the anisotropic exchange coupling and the initial entangled state can considerably enhance the estimation performance, with different optimal parameter regimes emerging for magnetic-field and DM-interaction sensing. To better understand the role of quantum resources in metrology, we also examine the behaviour of concurrence, quantum coherence, and von Neumann entropy. Overall, our findings demonstrate that anisotropic Heisenberg spin systems with DM interaction provide a promising and flexible platform for high-precision quantum metrology even in the presence of intrinsic decoherence.

14.
arXiv (CS.CV) 2026-06-11

CellNet – Localizing Cells using Sparse and Noisy Point Annotations

Counting living cells is an important step in many biological research workflows. Our collaborators at the Wellcome Sanger Institute study vital genes in humans via large-scale saturation genome editing screening, which requires counting cells repeatedly. Computer-vision-based automation is crucial for high throughput and resource efficiency. In this work, we develop a regression-based deep learning computer vision algorithm to detect and count cells in phase-contrast microscopy images. To reduce annotation effort, which in practice often becomes a bottleneck, we focus on counting cells using only sparse point annotations, which are fast and easy to acquire. By comparison to state-of-the-art zero-shot methods, we show that regression-based counting is a promising alternative in low-data regimes. Through developing methods to automatically count living cells in microscopy images, we contribute to valuable research on the human genome. The code is available at https://github.com/beijn/cellnet.

15.
arXiv (CS.CL) 2026-06-16

Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework

Internet memes have become a dominant form of expression on social media, including within the Bengali-speaking community. While often humorous, memes can also be exploited to spread offensive, harmful, and inflammatory content targeting individuals and groups. Detecting this type of content is exceptionally challenging due to its satirical, subtle, and culturally specific nature. This problem is magnified for low-resource languages like Bengali, as existing research predominantly focuses on high-resource languages. To address this critical research gap, we introduce Bn-HIB (Bangla Hate Inflammatory Benign), a novel dataset containing 3,247 manually annotated Bengali memes categorized as Benign, Hate, or Inflammatory. Significantly, Bn-HIB is the first dataset to distinguish inflammatory content from direct hate speech in Bengali memes. Furthermore, we propose the MCFM (Multi-Modal Co-Attention Fusion Model), a simple yet effective architecture that jointly analyses both the visual and textual elements of a meme. MCFM employs a co-attention mechanism to identify and fuse the most critical features from each modality, leading to a more accurate classification. Our experiments show that MCFM significantly outperforms several state-of-the-art models on the Bn-HIB dataset, demonstrating its effectiveness in this nuanced task. To facilitate reproducibility and future research, the Bn-HIB dataset has been made publicly available through Mendeley Data. Warning: This work contains material that may be disturbing to some audience members. Viewer discretion is advised.

16.
Nature (Science) 2026-06-10

Mitochondria directly interact with the nuclear pore complex

Mitochondria regulate cellular processes through direct and indirect interactions with other organelles. A well-studied example has been contact with the endoplasmic reticulum at mitochondrial-associated endoplasmic reticulum membranes [1], which control pathways including redox and calcium homeostasis [2,3]. Recent studies have also reported direct mitochondria–nuclear membrane contacts in cancer cells and yeast that promote pro-survival signalling [4,5]. Here we identify direct interactions between mitochondria and nuclear pores. Using two unbiased proteomic screens, GST pulldown and BioID, we found that VDAC1 was the top mitochondrial candidate that interacts with the filamentous nuclear pore protein RANBP2. In vitro RANBP2 CRISPR knockout, RANBP2 truncation or site-directed mutagenesis of RANBP2–VDAC1 interacting amino acids resulted in reduced mitochondria–nucleus proximity and decreased nuclear ATP and phosphocreatine levels. This was accompanied by a decline in the levels of the nuclear phosphoproteome and downregulation of pathways involved in histone modification, cellular differentiation and transcriptional regulation in vitro. Moreover, deletion of the RANBP2 C-terminal domain in vivo in mice resulted in embryonic lethality due to cardiac and neural crest differentiation defects. Collectively, these results describe a mechanism by which mitochondria directly interact with the nuclear pore complex, a phenomenon critical for regulation of nuclear energetics and cellular differentiation. Undoubtedly, additional roles of this interaction remain to be revealed.

Mitochondria interact directly with the nuclear pore complex via VDAC1–RANBP2 binding to sustain nuclear ATP levels.

17.
arXiv (CS.AI) 2026-06-16

An AI Security Agent for University ACMIS: Multi-Vector Threat Detection and Automated Response

arXiv:2606.08270v2

University Academic Management Information Systems (ACMIS) are high-value targets for a wide spectrum of security threats including brute-force login attacks, payment fraud, privilege escalation, insider data theft, and academic integrity violations. Traditional rule-based intrusion detection systems are inadequate because many malicious activities are structurally indistinguishable from normal operations. This paper presents an AI-based security agent for ACMIS that combines supervised anomaly detection, behavioural analytics, and a natural language processing chatbot for secure password recovery. The agent monitors five operational layers: authentication, authorisation, financial transactions, user behaviour, and system health, and responds through a four-tier risk escalation framework. A modular architecture allows the core engine to be extended to other institutional systems. Experiments on a simulated ACMIS event log dataset of 147,922 sessions demonstrate a threat detection macro-average F1 of 0.966, compared to 0.156 for a rule-based baseline and 0.836 for a sequence-only (LSTM) baseline, with end-to-end critical-tier automated response latency under 1 ms on a single-node prototype. The integrated recovery chatbot achieves 97.1 percent identity verification accuracy and an 87.3 percent mass-reset attack detection rate with zero false positives on legitimate high-volume recovery periods.

18.
arXiv (quant-ph) 2026-06-19

Benchmark of quantum algorithms for ground state preparation in the presence of noise

arXiv:2606.20551v1

We compare the performance of representative cooling, adiabatic, and optimization algorithms for ground-state preparation in the presence of noise. Using an exactly solvable family of quadratic fermionic Hamiltonians subject to depolarizing noise, we derive the scaling of the achievable relative energy as a function of the noise rate and support these results with numerical simulations. The Hamiltonian exhibits two phases, separated by a quantum phase transition. As expected, the performance of the different algorithms depends on the phase: adiabatic evolution is favorable in the trivial phase, while a multi-frequency cooling algorithm, as proposed in [1], becomes competitive or superior in the topological phase, where gap-closing limits adiabatic protocols. We further present numerical results for the quantum approximate optimization algorithm [2], showing that it performs competitively with cooling in the trivial phase but is typically outperformed in the topological regime. Finally, we show that for this model the cooling protocol exhibits enhanced robustness to parameter imperfections, highlighting its potential advantage for realistic implementations of noisy quantum state preparation. The analytical approach developed here, in conjunction with numerical validation, establishes an extendable approach to benchmarking ground-state preparation algorithms.

19.
medRxiv (Medicine) 2026-06-17

Short-term relaxation after cervical rotatory manipulation is more closely associated with somatosensory input than cracking sound: a randomized controlled EEG study

Background: Cervical rotatory manipulation is commonly used for neck-related symptoms and is often accompanied by a cracking sound. This sound is frequently regarded as a sign of successful manipulation, but whether it contributes substantially to the immediate relaxation response remains unclear.

Objective: This study examined whether short-term relaxation after cervical rotatory manipulation is more closely related to manipulation-associated sensory input than to the cracking sound cue alone.

Methods: In this single-session, three-arm, parallel randomized controlled study, 54 healthy volunteers were allocated to cervical rotatory manipulation, sham manipulation, or sham manipulation plus simulated cracking sound. Subjective outcomes were assessed before and after intervention, including positive affect, negative affect, comfort, and satisfaction. Eyes-closed resting-state electroencephalography was recorded before and after intervention. Prespecified neural outcomes included frontal alpha power, frontal alpha/beta ratio, occipital individual alpha frequency, and alpha-band fronto-parietal and fronto-temporal functional connectivity.

Results: Cervical rotatory manipulation produced greater improvements in positive affect, comfort, and satisfaction than sham manipulation or sham manipulation plus simulated cracking sound, whereas negative affect remained generally stable across groups. These subjective responses were accompanied by short-term electroencephalography changes, particularly in frontal alpha/beta ratio and alpha-band fronto-parietal and fronto-temporal functional connectivity. Changes in frontal alpha/beta ratio were positively associated with changes in positive affect. In contrast, simulated cracking sound alone did not reproduce the full subjective or electroencephalography response observed after real manipulation.

Conclusions: The immediate relaxation response after cervical rotatory manipulation appears to be more closely related to manipulation-associated sensory input than to the cracking sound cue alone. These findings provide preliminary neurophysiological evidence for distinguishing real manipulation effects from sound-related contextual cues.

20.
arXiv (CS.AI) 2026-06-19

Triangular Consistency as a Universal Constraint for Learning Optical Flow

arXiv:2606.19938v1

We propose triangular consistency as a first-principled constraint for optical flow, which is agnostic to network architecture, supervision type, and dataset, and applies to both image-pair and multi-frame settings. This simple but powerful constraint is to compose two flows to induce a third flow and enforce consistency among the three. The composed flows may arise from (i) image pairs, yielding cycle consistency; (ii) multiple video frames, producing longer-range motion through temporal chaining; or (iii) image pairs combined with controlled synthetic transformations, which becomes data augmentation. This triangular consistency introduces negligible computational overhead and requires no additional annotations. Since it is derived directly from the geometry of optical flow, it does not rely on model-specific assumptions and serves as a "universal" plug-and-play component for optical flow training. Experiments show consistent improvement across supervised, unsupervised, and transfer learning settings.
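
To make the idea of composing two flows concrete, here is a small sketch under stated assumptions: flows are dense grids of `(dx, dy)` displacements, the composed flow from frame A to frame C is the A-to-B flow plus the B-to-C flow sampled at the displaced position, and sampling uses nearest-neighbour lookup clamped to the image border. Function names, sampling scheme, and the choice of an average endpoint distance as the penalty are illustrative choices, not taken from the paper.

```python
import math


def compose_flows(flow_ab, flow_bc):
    """Compose A->B and B->C flows into an induced A->C flow.

    Each flow is a list of rows, each row a list of (dx, dy) tuples.
    """
    h, w = len(flow_ab), len(flow_ab[0])
    composed = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow_ab[y][x]
            # Where pixel (x, y) of frame A lands in frame B, snapped to the
            # nearest pixel and clamped to the image (a simplifying assumption).
            xb = min(max(int(round(x + dx)), 0), w - 1)
            yb = min(max(int(round(y + dy)), 0), h - 1)
            ddx, ddy = flow_bc[yb][xb]
            row.append((dx + ddx, dy + ddy))
        composed.append(row)
    return composed


def triangular_residual(flow_ab, flow_bc, flow_ac):
    """Mean endpoint distance between the composed and the direct A->C flow."""
    composed = compose_flows(flow_ab, flow_bc)
    total, count = 0.0, 0
    for row_c, row_d in zip(composed, flow_ac):
        for (cx, cy), (dx, dy) in zip(row_c, row_d):
            total += math.hypot(cx - dx, cy - dy)
            count += 1
    return total / count


def constant_flow(h, w, vec):
    """Helper for the toy example: every pixel moves by the same vector."""
    return [[vec] * w for _ in range(h)]


ab = constant_flow(3, 4, (1.0, 0.0))
bc = constant_flow(3, 4, (0.0, 2.0))
consistent = triangular_residual(ab, bc, constant_flow(3, 4, (1.0, 2.0)))
inconsistent = triangular_residual(ab, bc, constant_flow(3, 4, (0.0, 0.0)))
```

For uniform translations the residual is zero exactly when the direct flow equals the sum of the two legs, which is the three-way consistency the abstract describes.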

21.
arXiv (CS.CL) 2026-06-18

Human-AI Coevolution Dynamics: A Formal Theory of Social Intelligence Emergence Through Long-Term Interaction

Current conversational AI systems have made significant progress in language generation, personalization, and long-context interaction. However, most existing methods model social behavior through isolated components such as emotion modeling, memory retrieval, or persona conditioning, lacking a unified framework to explain the emergence of stable social relationships and social intelligence in long-term human-AI interaction. To address this, we propose the Human-AI Coevolution Dynamics Framework (HACD-H), a formal model of human-AI interaction as a self-organizing social cognitive system. HACD-H integrates emotional adaptation, relational organization, social memory, and personality consistency into a unified dynamical framework and introduces principles including multi-timescale social cognition, relational attractors, trust basins, developmental phase transitions, and social cognitive energy dynamics. We construct a conversational dataset with approximately 14,700 interaction turns and develop a theory-driven empirical evaluation framework. Results reveal a hierarchy of temporal persistence in social cognition, stable relational attractors, phase-transition-like developmental patterns, and a structured social cognitive energy landscape. Social intelligence shows a significant negative correlation with social cognitive energy (r = -0.391, p < 0.001), and interaction trajectories exhibit progressive energy reduction over time. These findings suggest that social intelligence emerges from long-term social cognitive coevolution rather than isolated conversational capabilities. HACD-H provides a unified theoretical foundation for modeling adaptive human-AI social interaction and developing socially intelligent AI systems.

22.
bioRxiv (Bioinfo) 2026-06-11

TMO: ASYMMETRIC CROSS-MODAL ATTENTION FOR LEARNING CELL-STATE-DEPENDENT REGULATORY LAGS FROM SINGLE-CELL MULTIOMIC DATA

Background: Single-cell multi-omics technologies simultaneously measure chromatin accessibility (ATAC) and gene expression (RNA), providing a unique window into the temporal ordering of regulatory events during differentiation. However, most computational models treat the two modalities symmetrically, ignoring the directional relationship between chromatin and transcription, and existing lag-aware methods estimate a single global lag per gene, failing to capture cell-state-dependent dynamics. Methods and Results: We introduce Temporal Multi-Omics (TMO), a deep learning framework that learns signed, cell-state-conditional regulatory lags (Δτ) using asymmetric cross-modal attention. TMO projects RNA and ATAC into 50 latent components each, tokenises each cell as a sequence of 100 tokens, and uses a two-pass transformer in which a data-driven lag prior - derived from a sliding-window cross-correlation function - directly biases attention asymmetrically. On four independent 10x Multiome datasets (mouse brain, human brain, mouse kidney, human PBMC), the asymmetric model achieves Lag Concordance Scores (LCS) of 0.988-0.999, compared to 0.048-0.108 for an architecturally identical symmetric baseline. A stratified 80/20 held-out experiment confirms that the learned component-lag ordering generalises to unseen cells (held-out LCS 0.85-0.99). Clustered Δτ heatmaps show positive Δτ (ATAC-led priming) in early pseudotime and negative Δτ (RNA-led, activity-dependent regulation) in late pseudotime; the ATAC-RNA correlation heatmap exhibits a U-shaped pattern indicative of developmental decoupling. Components with the most positive Δτ are enriched for chromatin organization and stem cell differentiation (FDR < 0.05), while those with the most negative Δτ are enriched for synaptic signalling and immune activation.
Ablating the cell-state information from the lag predictor reduces the LCS and collapses per-component temporal dynamics (KS p ≤ 0.039 in all four tissues), proving that TMO's dynamic lag patterns depend on cell-state conditioning. Independent ChIP-seq validation for four transcription factors (PAX5, Pax6, ASCL1, Hnf4) confirms highly significant separation between target genes and expression-matched background (p < 10^-4 in all cases). Two Multiome Perturb-seq screens provide causal validation: SMARCB1 knockout shows a directional trend (1.5-fold target shift, p = 0.056, n = 147 perturbed cells), and SMARCE1 knockout reaches statistical significance (p = 0.0089, n = 3,394 perturbed cells). Gene-level cross-correlation independently validates that the regulatory lag signal is present in the raw data, and TMO further identifies rare, statistically significant biphasic gene programs where the regulatory direction reverses across pseudotime. Conclusions: TMO is the first method to make regulatory lag a learnable, cell-state-conditional, and architecturally encoded parameter. It is scalable, interpretable, and open-source, providing a powerful tool for studying regulatory timing in development, disease, and perturbation screens.
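
A sliding-window cross-correlation lag of the kind mentioned in the abstract above might look like the sketch below. It assumes two equally sampled series ordered along pseudotime, Pearson correlation as the similarity, and the sign convention that a positive lag means ATAC leads RNA. The function names and window parameters are hypothetical; the abstract does not describe how TMO actually builds its lag prior from these values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_lag(atac, rna, max_lag):
    """Return (lag, corr) maximising corr(atac[t], rna[t + lag]).

    Assumed convention: lag > 0 means ATAC changes first (ATAC-led),
    lag < 0 means RNA changes first (RNA-led).
    """
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, r = atac[:len(atac) - lag], rna[lag:]
        else:
            a, r = atac[-lag:], rna[:len(rna) + lag]
        if len(a) < 3:  # too few overlapping points to correlate
            continue
        c = pearson(a, r)
        if c > best[1]:
            best = (lag, c)
    return best

def windowed_lags(atac, rna, window, step, max_lag):
    """Estimate one signed lag per sliding window along the ordering."""
    lags = []
    for start in range(0, len(atac) - window + 1, step):
        lag, _ = best_lag(atac[start:start + window],
                          rna[start:start + window], max_lag)
        lags.append(lag)
    return lags
```

Because the lag is estimated per window rather than once per series, its sign can differ between early and late pseudotime, which is the kind of state dependence the abstract describes.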

23.
arXiv (CS.CL) 2026-06-11

Measuring Semantic Progress in Multi-turn Dialogue via Information Gain

Evaluating multi-turn dialogue is challenging because quality emerges across turns rather than within individual responses. We focus on a key dimension of information-seeking dialogue: semantic progress, defined as the accumulation of new, question-relevant, and non-redundant information over the course of a conversation. We formalize semantic progress as question-conditioned uncertainty reduction and introduce an information-theoretic metric that approximates it in embedding space. Our main estimator uses a tractable Gaussian formulation with closed-form updates, while a complementary maximum-entropy argument shows why log-determinant structure arises more broadly when only second-order embedding information is retained. This formulation yields desirable theoretical properties, including monotonicity, additive decomposition of total information gain across turns, and diminishing returns for redundant evidence. Unlike LLM-as-a-judge approaches, our metric requires no autoregressive inference at evaluation time and is fully reproducible for a fixed embedding model. Experiments on MT-Bench, Chatbot Arena, and UltraFeedback show that the proposed metric achieves competitive agreement with human judgments despite targeting only semantic progress, with improved alignment on MT-Bench and UltraFeedback compared to several LLM-based judges. Notably, the method remains effective with lightweight embedding models under CPU-only execution, indicating that semantic progress can be captured without reliance on large model capacity.
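
The properties listed in the abstract above (monotonicity, additive decomposition, diminishing returns) can be seen in a standard linear-Gaussian update, sketched here. It assumes an identity prior covariance, one embedding vector per turn, and a fixed observation noise `noise_var`; these are illustrative choices, and the function names are hypothetical. The paper's question conditioning and exact estimator are not given in the abstract.

```python
import math

def turn_gain(cov, e, noise_var=1.0):
    """Information gain (in nats) from one turn embedding e, plus the updated covariance.

    gain    = 0.5 * log(1 + e^T C e / noise_var)
    C_new   = C - (C e)(C e)^T / (noise_var + e^T C e)   (rank-one Gaussian update)
    """
    d = len(e)
    ce = [sum(cov[i][j] * e[j] for j in range(d)) for i in range(d)]  # C e
    s = sum(e[i] * ce[i] for i in range(d))                           # e^T C e
    gain = 0.5 * math.log(1.0 + s / noise_var)
    denom = noise_var + s
    new_cov = [[cov[i][j] - ce[i] * ce[j] / denom for j in range(d)]
               for i in range(d)]
    return gain, new_cov

def dialogue_gains(embeddings, noise_var=1.0):
    """Per-turn gains for a dialogue, starting from an identity prior (an assumption)."""
    d = len(embeddings[0])
    cov = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    gains = []
    for e in embeddings:
        g, cov = turn_gain(cov, e, noise_var)
        gains.append(g)
    return gains
```

Each gain is non-negative, so the running total never decreases; the total equals half the drop in log-determinant of the covariance; and repeating an embedding yields a smaller gain than its first occurrence.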

24.
arXiv (quant-ph) 2026-06-11

Quantum thermodynamics, quantum correlations and quantum coherence in accelerating Unruh-DeWitt detectors in both steady and dynamical state

arXiv:2512.18123v2. We investigate the interplay between quantum thermodynamics, quantum correlations, and quantum coherence within the framework of the Unruh-DeWitt (UdW) detector model. By analyzing both the steady and dynamical states of various quantum resources (including steerability, entanglement, quantum discord, and coherence), we study how these resources evolve under Markovian and non-Markovian environments. Furthermore, we investigate the impact of both the Unruh temperature and the energy levels on three key quantum phenomena: thermodynamic evolution, quantum correlations, and quantum coherence, considering different initial state preparations. The hierarchical structure relating quantum correlations and quantum coherence is determined. We further examine the thermodynamic performance of a quantum heat engine, highlighting the influence of memory effects and classical correlations on heat exchange, work extraction, and efficiency. Our results reveal that non-Markovian dynamics can enhance the preservation of quantum correlations and improve the engine's efficiency compared to a purely Markovian regime. These findings provide insights into the role of quantum correlations and quantum coherence in quantum thermodynamic processes and open avenues for optimizing quantum devices operating in relativistic or open-system settings.

25.
arXiv (CS.CL) 2026-06-11

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

The router is the cornerstone component of Mixture-of-Experts (MoE) models. Serving as expert proxies, the rows of the router matrix compute their similarity to the MoE inputs to determine which subset of experts is activated. Ideally, each router row should condense its expert matrix into a representative vector, such that its dot product with a token better reflects token-expert affinity. However, there exist no design principles to enforce this condensation. In this paper, we propose to align each router row with the principal singular direction of the associated expert, as this direction provides the most expressive mathematical description of a matrix. Based on this principle, we propose a router redesign with Manifold Power Iteration (MPI). Specifically, it introduces a "Power-then-Retract" paradigm, where a power iteration step is performed on the router weights, followed by a retraction that imposes a norm constraint to ensure both efficiency and stability. Theoretically, we show that MPI drives router rows to converge toward the principal singular directions of the associated experts. Empirically, we pretrain MoE models across scales from 1B to 11B parameters to confirm that this alignment facilitates more effective MoE models.
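
The "Power-then-Retract" step described in the abstract above can be sketched as classical power iteration on W^T W followed by rescaling to a fixed norm. This assumes each expert is summarised by a single weight matrix W with tokens multiplied on the right, and that the retraction is a projection onto a sphere of radius `radius`. Both are assumptions, as are the function names; the abstract does not say which expert matrix is used or how the step interleaves with optimizer updates.

```python
import math

def matvec(m, v):
    """Compute m @ v for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose_matvec(m, v):
    """Compute m^T @ v without materialising the transpose."""
    return [sum(m[i][j] * v[i] for i in range(len(m)))
            for j in range(len(m[0]))]

def power_then_retract(router_row, expert, radius=1.0):
    """One step: power iteration on W^T W, then retraction to a sphere of the given radius."""
    powered = transpose_matvec(expert, matvec(expert, router_row))  # W^T (W r)
    norm = math.sqrt(sum(x * x for x in powered))
    if norm == 0.0:
        return list(router_row)  # degenerate case: leave the row unchanged
    return [radius * x / norm for x in powered]

def align_router_row(router_row, expert, steps=30):
    """Repeat the step; the row approaches the top right singular direction of `expert`."""
    row = list(router_row)
    for _ in range(steps):
        row = power_then_retract(row, expert)
    return row
```

For a diagonal toy expert such as [[3, 0], [0, 1]], repeated steps pull any starting row with a nonzero first component toward the first coordinate axis, which is that matrix's principal singular direction.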