Academic Intelligence · Curated Daily

Explore the frontiers of global academic research

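AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate literature analysis briefs across disciplines.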

01.
arXiv (quant-ph) 2026-06-19

Purity and bound energy in ancilla-assisted work extraction

arXiv:2606.19945v1 Announce Type: new Abstract: We investigate ancilla-assisted work extraction in quantum batteries from the perspective of bound energy and purity. We show that the bound energy of the reduced system provides a tight upper bound to the daemonic gain and that this bound is saturated for globally pure system–ancilla states. Motivated by this relation, we introduce a purity-based gain that qualitatively predicts the daemonic gain without requiring explicit optimization over measurements. We further introduce a protocol to analyze the role of dissipation and intrinsic interactions on daemonic gain. Under a collective environment, dissipation can dynamically generate and stabilize finite daemonic gain through environment-induced correlations. In interacting systems, level crossings and spectral restructuring strongly modify the attainable gain through their influence on the accessible bound energy. Our results demonstrate that daemonic gain is governed not only by correlations, but also by the spectral structure of the underlying Hamiltonian and information loss captured by bound energy and purity.

02.
arXiv (CS.CV) 2026-06-16

Multi-Modal Spatio-Temporal Graph Neural Network with Mixture of Experts for Soil Organic Carbon Prediction

Top-soil organic carbon (SOC) prediction is fundamental to agricultural sustainability, land use policy and fertilization planning. Existing approaches face two limitations: they pair hand-crafted covariates with classical ML or single-modal deep models that miss rich spectral and temporal information, and grid-based architectures ignore the irregular spatial structure of field measurements. We introduce SpTGNN, a multi-modal spatio-temporal graph neural network addressing both. SpTGNN represents soil measurements as nodes in a heterogeneous graph with three edge types (spatial proximity, spectral similarity, elevation), and applies relational graph attention to learn separate patterns per relation. A fine-tuned TerraMind encoder extracts node features from Sentinel-2, Sentinel-1 and DEM signals, combined with per-sample environmental covariates and learned positional and temporal embeddings. A sparse Mixture-of-Experts module fuses the four streams via top-$k$ routing. Uncertainty is captured by pairing heteroscedastic regression (aleatoric) with deep ensembles (epistemic), and a Moran's $I$ penalty regularizes spatial autocorrelation. We evaluate on a global SOC corpus split into three regional instances ($\sim$49k samples globally, Africa $\sim$26k, Europe $\sim$14k). Our 5-member deep ensemble reports $R^2=0.762$, RMSE $=3.51\pm0.48$ g/kg and MAPE $=22.9\%$ on the Africa test split, improving over a tabular XGBoost baseline; the best single checkpoint reaches validation $R^2=0.864$. Ablations confirm the heterogeneous graph, MoE fusion and fine-tuned backbone each contribute substantively, and the ensemble UQ stack achieves post-calibration ECE of $0.031$ (hybrid) and $0.026$ ($\beta$-NLL). To our knowledge, this is the first framework to unify foundation-model feature extraction, heterogeneous graph attention and decomposed uncertainty quantification for SOC estimation.
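
The top-$k$ routing step mentioned in the abstract can be illustrated with a minimal sketch. It is not the SpTGNN implementation: the function names `top_k_route` and `fuse` are hypothetical, the gate logits are taken as given rather than learned, and each "expert" is simplified to one input stream vector.

```python
import math

def top_k_route(logits, k):
    """Return {index: weight} for the k largest gate logits (hypothetical helper)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits; ties keep their original order.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the selected indices, shifted for numerical stability.
    shift = max(logits[i] for i in top)
    unnorm = {i: math.exp(logits[i] - shift) for i in top}
    total = sum(unnorm.values())
    return {i: u / total for i, u in unnorm.items()}

def fuse(streams, logits, k):
    """Weighted sum of the selected stream vectors (assumed fusion rule)."""
    weights = top_k_route(logits, k)
    dim = len(streams[0])
    return [sum(w * streams[i][d] for i, w in weights.items()) for d in range(dim)]

# Four illustrative streams; only the two with the largest logits contribute.
streams = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 0.0]]
fused = fuse(streams, [2.0, 0.0, 2.0, -1.0], k=2)
```

Unselected streams receive zero weight, which is what makes the mixture sparse.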

03.
arXiv (quant-ph) 2026-06-17

Tunneling Dynamics and Time Delay in Electron Transport through Time-Dependent Barriers with Finite-Bandwidth Reservoirs

arXiv:2507.20649v2 Announce Type: replace-cross Abstract: We study a model system consisting of a tunneling barrier driven by an external harmonic field and coupled to two leads with finite bandwidth. Avoiding Floquet expansions, we derive simple expressions for the time-dependent tunneling current in the adiabatic regime. Our approach relates the barrier modulation to a measurable time delay in the steady-state periodic current. It provides a physically consistent definition of the tunneling time inside the barrier by subtracting the time delay associated with the leads from the total time delay. We find that the tunneling time always vanishes for wide/high barriers. Remarkably, the time delay persists even when the barrier becomes static, i.e., in the limit where the modulation frequency vanishes. This indicates that the time delay obtained through the introduction of an external periodic perturbation actually reflects an intrinsic property of the tunneling dynamics, rather than an effect of the external drive or of a particular system. We apply our results to the analysis of tunneling times in optical experiments and find good agreement with the experimental data.

04.
arXiv (math.PR) 2026-06-16

Small moments of the sensitivity of polynomial threshold functions

arXiv:2606.16004v1 Announce Type: new Abstract: In the first version of Chang, Slote, Volberg, and Zhang's paper [BSA_of_PTF], the authors modify a nice recursive approach due to Kane in [Correct_exponent_for_AS] where he bounded the average sensitivity of polynomial threshold functions. In [BSA_of_PTF] Kane's argument was adapted to estimate the boolean surface area of polynomial threshold functions. The bridge is a combinatorial averaging lemma considering all balanced partitions. The lemma serves as a substitute for an additive property of average sensitivity. With the lemma, one can apply a Kane-type algorithm to derive a recurrence. Solving the recurrence then gives an upper bound of $e^{C_d \sqrt{\log n}}$ for the boolean surface area. In the second version of the same paper, the authors derive a polylog upper bound for BSA of PTFs. The difference is that they use a tail estimate for the sensitivity function. With the help of a polynomial restriction lemma in [poly_restriction] they sharpen the upper bound. It is noteworthy that when applying the polynomial restriction, each coordinate is put into each part independently with equal probability. As a result, a partition does not necessarily have equal-size blocks. In other words, it may not be balanced. In this note, we first investigate the effect of different partitionings. Second, we use the recursive method in the first version to derive a polylog upper bound for $\mathbb E[s(x)^{\eta}]$ where $\eta < 1/2$. It is interesting to note the phase transition that happens at $\eta=1/2$ in both versions of the proof (but in a completely different form). Section [PhaseTr-s] treats that.

05.
arXiv (CS.CL) 2026-06-15

Be My Tutor: On-Policy Co-Distillation for Mutual LLM Improvement via Peer Feedback

We study multi-domain LLM training in which two models, each stronger in a different domain, co-evolve by tutoring each other through on-policy feedback. Unlike one-way distillation or single-model fine-tuning, our goal is mutual Pareto improvement: each model improves across domains without losing its original strength. To this end, we propose On-Policy Co-Distillation (OPCoD), where each student's self-distillation is conditioned on its own correct rollout and feedback from its peer. To make feedback exchange effective, OPCoD uses cognizance-based gating to decide when to give feedback and feedback anchoring to ground feedback in the problem. On Science Q&A tasks, OPCoD consistently outperforms baselines and achieves Pareto improvement across all evaluated domain pairs and students.

06.
arXiv (CS.CL) 2026-06-18

Retell, Reward, Repeat: Reinforcement Learning for Narrative Theory-Informed Story Retelling

Counterfactual story retelling exposes LLM shortcomings in constrained narrative solution spaces where they can no longer rely on recalling memorised training data. Ground-truth-based post-training, such as SFT, fails to teach LLMs how to generate logical and rational narrative events. In this paper, we introduce Retell, Reward, Repeat (RRR), an RL-based pipeline synthesising Structuralist Narratology with scalar narrativity to teach storytelling structure. We extend the TimeTravel dataset with human-annotated stages of narrative equilibrium to evaluate reward models. By using d-RLAIF, RRR derives training signals from the narrativity of textual features without the need for reference outputs. Evaluations demonstrate that RRR-trained LLMs outperform few-shot and SFT baselines in logic, rationality, and completeness, with output quality additionally validated by blind human preference. Relying on a small, query-only dataset, RRR provides a linguistically grounded, cost-effective post-training mechanism for storytelling, a domain currently lacking effective post-training methods. RRR highlights the continued relevance of integrating established linguistic theories into contemporary NLP.

07.
arXiv (CS.CV) 2026-06-18

CAOA – Completion-Assisted Object-CAD Alignment

Accurately aligning CAD models to their corresponding objects in indoor RGB-D scans is a central challenge in 3D semantic reconstruction. The task requires estimating a 9-Degree-of-Freedom (DoF) pose (position, rotation, and scale along three axes) but is hindered by noisy and incomplete scans, as well as segmentation errors that cause geometric distortions. We present Completion-Assisted Object-CAD Alignment (CAOA), a method that integrates a semantically and contextually aware point cloud completion module with a symmetry-aware relative pose estimation algorithm, enabling precise alignment of CAD models to scanned objects. Existing completion methods are typically trained and evaluated on synthetic datasets, which often fail to generalize to real-world scans. To bridge this gap, we introduce a synthetic data generation strategy tailored to indoor scenes, significantly reducing the synthetic-to-real domain gap, as validated through quantitative comparisons with widely used completion datasets. In addition, we release S2C-Completion, an expert-annotated dataset of over 8,500 object-CAD pairs from Scan2CAD, created for real-world indoor single-object completion and intended as a new benchmark for this task. For object-CAD alignment, we incorporate symmetry information via a symmetry-aware loss, improving robustness to symmetric ambiguities. On the Scan2CAD benchmark, CAOA achieves a 17% accuracy improvement over state-of-the-art methods.

08.
arXiv (CS.CL) 2026-06-11

Cross-Layer Discrete Concept Discovery for Interpreting Language Models

Interpreting language models remains challenging because of the residual stream, which linearly mixes and duplicates features across adjacent layers, causing single-layer analyses to miss this cross-layer structure. Cross-layer sparse autoencoders (SAEs) address layer mixing but operate in continuous space, where concepts split across many neurons without clear boundaries. We introduce Cross-Layer Vector Quantized-Variational Autoencoder (CLVQ-VAE), a novel framework which maps representations from a lower layer to a higher layer through a discrete vector-quantization bottleneck, collapsing duplicated residual-stream features into compact, interpretable concept vectors. Our approach combines top-k temperature-based sampling with exponential moving average (EMA) codebook updates, providing controlled exploration of the discrete latent space while maintaining codebook diversity. Across both encoder- and decoder-based models on ERASER-Movie, Jigsaw, and AGNews, CLVQ-VAE outperforms clustering, single-layer vector quantized-variational autoencoder (VQ-VAE), and sparse autoencoder (SAE) baselines across three evaluation axes: removing identified concepts drops model accuracy by up to 93%, LLM judges rank our concepts first in 66.7% of comparisons, and human annotators recover model predictions from our visualizations with 78% accuracy versus 54% for clustering.
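
For readers unfamiliar with EMA codebook updates, the sketch below shows the standard form used in vector-quantized autoencoders. It is an assumption-laden illustration rather than the CLVQ-VAE code: vectors are assigned to their nearest code instead of the top-k temperature-based sampling the abstract describes, and the names `nearest_code`, `ema_update`, `decay`, and `eps` are hypothetical.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector` in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))

def ema_update(codebook, counts, sums, batch, decay=0.99, eps=1e-5):
    """Update codebook, counts, and sums in place with one batch of vectors."""
    dim = len(codebook[0])
    batch_counts = [0] * len(codebook)
    batch_sums = [[0.0] * dim for _ in codebook]
    # Assign every vector to a code (nearest-neighbour here, by assumption).
    for v in batch:
        i = nearest_code(v, codebook)
        batch_counts[i] += 1
        for d in range(dim):
            batch_sums[i][d] += v[d]
    # Blend running statistics with the batch statistics, then renormalise.
    for i in range(len(codebook)):
        counts[i] = decay * counts[i] + (1 - decay) * batch_counts[i]
        for d in range(dim):
            sums[i][d] = decay * sums[i][d] + (1 - decay) * batch_sums[i][d]
            codebook[i][d] = sums[i][d] / (counts[i] + eps)
    return codebook

# Toy run: two 1-D codes, one batch, a deliberately small decay.
codebook = [[0.0], [10.0]]
counts = [1.0, 1.0]
sums = [[0.0], [10.0]]
ema_update(codebook, counts, sums, batch=[[2.0], [9.0]], decay=0.5, eps=0.0)
```

Because each code moves toward the running mean of the vectors assigned to it, no gradient step on the codebook is needed.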

09.
arXiv (math.PR) 2026-06-16

Large Deviations for the Nonlinear Schrödinger Equation with Randomized Quasi-Periodic Initial Data in Higher Dimensions: Subcritical Case

arXiv:2604.17253v2 Announce Type: replace Abstract: We study the cubic weakly nonlinear Schrödinger equation with randomized spatially quasi-periodic initial data in higher dimensions. Under a polynomial decay assumption in Fourier space, we establish a Large Deviations Principle for rogue waves in the so-called subcritical time regime. The proof proceeds in two main steps. We first characterize the distribution of the linear solution and establish the corresponding linear large deviations principle. The lower bound is obtained via pointwise estimates, while the upper bound follows from a combination of truncation and probabilistic arguments. The method used in this step appears to be new; compare with [GGKS23]. We then perform a detailed combinatorial analysis of the Picard iteration, deriving an effective bound for the Duhamel term and thereby establishing the nonlinear large deviations principle.

10.
arXiv (quant-ph) 2026-06-19

Mitigating Trotter Errors via Post-Processed Symmetry Restoration

arXiv:2606.20242v1 Announce Type: new Abstract: Quantum simulation is a powerful tool for exploring complex quantum many-body systems such as those in condensed matter physics and gauge theories. Trotterization, which approximates the ideal time evolution operator by decomposing it into a sequence of local gate operations, is one of the most widely used quantum simulation algorithms. However, such Trotterized implementations generally fail to preserve the symmetries of the target Hamiltonian during compilation. As a result, they can drive quantum states out of symmetrically allowed subspaces, leading to unphysical dynamics and symmetry-violating algorithmic errors. In this work, we propose a symmetry-based Trotter error mitigation protocol using classical post-processing. By applying symmetry transformations to the initial state or interleaving them between discrete Trotter layers, and then averaging an ensemble of the resulting measurement outcomes via classical post-processing, our method systematically projects out the symmetry-violating components of the Trotter error while leaving the ideal dynamics unchanged. Importantly, this framework naturally accommodates non-local spatial symmetries and anti-unitary operations such as time reversal, which are difficult or impossible to implement directly with hardware-native quantum gates. We benchmark our protocol on the one-dimensional XY model and the one-dimensional Schwinger model. In the XY model, enforcing reflection symmetry suppresses the leading-order Trotter error, whereas in the Schwinger model, interleaving gauge transformations between Trotter layers enables gauge twirling that effectively reduces unphysical violations of local Gauss's law. These results demonstrate that symmetry-based post-processing provides a depth-preserving route to substantially improving the fidelity of Trotterized quantum simulations on near-term devices.

11.
arXiv (CS.LG) 2026-06-12

Smarter Saboteurs, Better Fixers: Scaling & Security in Linear Multi-Agent Workflows

arXiv:2606.12709v1 Announce Type: cross Abstract: As LLM-based multi-agent systems (MAS) are deployed in the wild, the resilience of their collaboration structures against adversarial compromise becomes a critical safety concern. Attackers may leverage prompt-injection or jailbreaking to sabotage individual agents within MAS workflows, but the interaction between model scaling and system-level resilience remains poorly understood. This paper investigates how model scale affects the security of linear multi-agent workflows. Our experiments across scales of two open-weight model families on the HumanEval benchmark reveal a compliance-correction symmetry: larger models are far more likely to faithfully execute malicious instructions, with the control-to-malicious performance drop reaching 53.7pp at 27B in uncorrected pipelines. However, appending a lightweight terminal Fixer stage collapses this to 0.6pp and restores statistical parity with control-level performance, demonstrating that strictly linear collaboration structures can be viable and resilient to adversaries at this scale, and suggesting that the brittleness previously attributed to linear topology may stem from a lack of correction.

12.
arXiv (quant-ph) 2026-06-12

Path integral control of open quantum systems

arXiv:2410.18635v4 Announce Type: replace Abstract: We investigate open-loop quantum state preparation for a class of open quantum systems whose dynamics follow a Gorini-Kossakowski-Lindblad-Sudarshan (GKLS) master equation that admits a trajectory-based stochastic representation. The deterministic control objective is reformulated as a stochastic optimal control problem – interpreting stochasticity as a methodological tool akin to stochastic Schrödinger equation unravelings – which situates the problem within the path integral control framework. For the class of GKLS generators under consideration, this reformulation leads to an explicit expression for the optimal control as a weighted average over stochastic quantum trajectories, thereby eliminating the need for gradient evaluations. Building on this theoretical result, we derive a control update rule for piecewise-constant control pulses and demonstrate that adaptive importance sampling progressively enhances the control estimator during optimization, culminating in the algorithm we term Path integral Quantum Control (PiQC). We further introduce an annealed variant of PiQC, wherein a synthetic noise schedule gradually steers open-system trajectories toward closed-system dynamics, enabling high-fidelity unitary state preparation. Numerical studies on a dissipative single-qubit system and a multi-qubit Nuclear Magnetic Resonance model verify that PiQC yields precise open-loop controls and displays robustness to Hamiltonian perturbations. We propose PiQC as a trajectory-based alternative to gradient-based approaches, which might offer a viable solution in quantum control problems where gradient computation is infeasible or computationally demanding.
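
The phrase "optimal control as a weighted average over stochastic quantum trajectories" can be made concrete with the generic path-integral-control estimator sketched below. This is the textbook classical form, not the PiQC update rule: the exponential weighting by trajectory cost, the `temperature` parameter, and the function names are assumptions introduced for illustration, and the trajectory costs and control perturbations are supplied by hand rather than simulated.

```python
import math

def path_integral_weights(costs, temperature):
    """Normalised weights proportional to exp(-cost / temperature)."""
    c_min = min(costs)  # subtract the minimum for numerical stability
    unnorm = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def update_control(control, perturbations, costs, temperature=1.0):
    """Shift each piecewise-constant control value by the weighted mean perturbation.

    control:       list of control values, one per time slice
    perturbations: one list per trajectory, same length as `control`
    costs:         one scalar cost per trajectory (lower is better)
    """
    weights = path_integral_weights(costs, temperature)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t, u in enumerate(control)
    ]

# Two sampled trajectories with equal cost contribute equally.
new_control = update_control(
    control=[0.0, 0.0],
    perturbations=[[1.0, -1.0], [3.0, 1.0]],
    costs=[0.0, 0.0],
)
```

No gradient of the cost appears anywhere in the update, which is the property the abstract emphasises.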

13.
arXiv (CS.AI) 2026-06-11

GPO: Learning from Critical Steps to Improve LLM Reasoning

arXiv:2509.16456v3 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used in various domains, showing impressive potential on different tasks. Recently, reasoning LLMs have been proposed to improve the reasoning or thinking capabilities of LLMs to solve complex problems. Despite the promising results of reasoning LLMs, enhancing the multi-step reasoning capabilities of LLMs still remains a significant challenge. While existing optimization methods have advanced the LLM reasoning capabilities, they often treat reasoning trajectories as a whole, without considering the underlying critical steps within the trajectory. In this paper, we introduce Guided Pivotal Optimization (GPO), a novel fine-tuning strategy that dives into the reasoning process to enable more effective improvements. GPO first identifies the 'critical step' within a reasoning trajectory, a point at which the model must proceed carefully to succeed at the problem. We locate the critical step by estimating the advantage function. GPO then resets the policy to the critical step, samples new rollouts and prioritizes the learning process on those rollouts. This focus allows the model to learn more effectively from pivotal moments within the reasoning process to improve the reasoning performance. We demonstrate that GPO is a general strategy that can be integrated with various optimization methods to improve reasoning performance. Besides theoretical analysis, our experiments across challenging reasoning benchmarks show that GPO can consistently and significantly enhance the performance of existing optimization methods, showcasing its effectiveness and generalizability in improving LLM reasoning by concentrating on pivotal moments within the generation process.

15.
arXiv (quant-ph) 2026-06-16

Hardy and Cabello Arguments in Spatial and Temporal Frauchiger-Renner Scenarios

arXiv:2606.15467v1 Announce Type: new Abstract: We investigate Hardy- and Cabello-type logical structures within spatial and temporal extensions of the Frauchiger–Renner (FR) framework, embedding these constructions directly into the FR multi-observer architecture. In the spatial multi-observer scenario, both Hardy and Cabello contradictions arise, with the Cabello construction yielding the stronger violation, $\Delta_{\mathrm{Cabello}}^{\max}=0.1078$, which exceeds the maximal Hardy probability $P_{H}^{\max}=\frac{5\sqrt{5}-11}{2}\approx 0.09017$. We then develop a sequential temporal FR protocol based on coherent multi-observer measurements performed on a single spin-$\tfrac12$ system. In this temporal setting, the Hardy contradiction disappears identically due to dynamical constraints imposed by sequential state updates, whereas a finite Cabello-type violation survives, $\Delta_{\mathrm{Cabello}}^{\max}\approx 0.0674$. Our results establish a fundamental structural distinction between spatial entanglement and temporal multi-observer correlations in FR-type logical scenarios, and demonstrate that certain observer-independent description failures persist even without spacelike separation.
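
The closed-form Hardy value quoted in the abstract is easy to check numerically. The short sketch below evaluates (5√5 − 11)/2 and compares it with the two Cabello figures reported above; the variable names are illustrative, and the Cabello values are copied from the abstract rather than derived.

```python
import math

# Maximal Hardy probability in closed form, as quoted in the abstract.
p_hardy_max = (5 * math.sqrt(5) - 11) / 2

# Cabello violations as reported in the abstract (not recomputed here).
delta_cabello_spatial = 0.1078
delta_cabello_temporal = 0.0674

print(round(p_hardy_max, 5))                 # 0.09017
print(delta_cabello_spatial > p_hardy_max)   # True
print(delta_cabello_temporal > 0.0)          # True
```

The spatial Cabello violation exceeds the Hardy maximum, while the temporal Cabello value is smaller than both but still nonzero.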

16.
arXiv (CS.CV) 2026-06-12

EvTexture++: Event-Driven Texture Enhancement for Video Super-Resolution

Event-based vision has drawn increasing attention owing to its distinctive properties, including ultra-high temporal resolution and extreme dynamic range. Recent works have introduced it to video super-resolution (VSR) to enhance flow estimation and temporal alignment. In contrast, this paper shifts the focus of event signals from motion refinement to texture enhancement in VSR. We propose EvTexture++, the first event-driven framework dedicated to texture enhancement in VSR. It leverages high-frequency spatiotemporal details from events to improve texture recovery. EvTexture++ incorporates a customized texture enhancement branch, along with an iterative texture enhancement module that progressively exploits high-temporal-resolution event information for texture restoration. This enables gradual refinement of texture regions across iterations, yielding more accurate and detailed high-resolution outputs. Besides intra-frame texture recovery, large motions could degrade inter-frame temporal consistency, particularly in texture regions, leading to texture flickering. To mitigate this, we further exploit the continuous-time motion cues of events to enhance temporal consistency, introducing a temporal texture alignment module that estimates event-guided texture-aware flow for precise inter-frame texture alignment. Moreover, EvTexture++ is designed as a plug-and-play tool to flexibly boost the performance of existing VSR models. Experiments on five datasets demonstrate that EvTexture++ achieves state-of-the-art performance. When integrated into recent VSR models, it yields significant improvements, with gains of up to 1.55 dB in PSNR on the texture-rich Vid4 dataset. Code: https://github.com/DachunKai/EvTexture.

17.
arXiv (CS.CL) 2026-06-17

TACOMORE: Exploring a replicable prompting protocol for LLM-assisted corpus analysis

As corpus linguistics continues to scale, researchers are facing a growing methodological bottleneck: while computational tools can easily count billions of words, the qualitative interpretation of these data remains a slow and labor-intensive human task. Large Language Models (LLMs) offer a promising way to automate this process, yet their integration into the field is often hindered by concerns over black-box unpredictability and a lack of replicability. This study introduces TACOMORE, a structured prompting framework designed to transform ad-hoc AI interactions into a standardized linguistic protocol. Built upon four foundational principles (Task, Context, Model, and Replicability), the framework guides LLMs to move beyond generic probability prediction to anchoring their reasoning in the specific co-occurrence patterns of a target corpus. We applied this framework to three core corpus tasks, i.e., the analysis of keywords, collocates, and concordances, using an open corpus of COVID-19 research abstracts. After testing three LLMs, we found that while structured prompting improves accuracy and replicability, inherent limitations regarding hallucination persist. This research offers a critical lens into the role of LLMs in corpus linguistics, highlighting their potential as complementary tools while emphasizing the irreplaceable role of human validation.

18.
arXiv (CS.LG) 2026-06-15

FreshRetailNet-LT: A Stockout-Annotated Censored Demand Dataset for Latent Demand Recovery and Forecasting in Fresh Retail

arXiv:2505.16319v4 Announce Type: replace Abstract: Accurate demand estimation is critical for the retail business in guiding the inventory and pricing policies of perishable products. However, it faces fundamental challenges from censored sales data during stockouts, where unobserved demand creates systemic policy biases. Existing datasets lack the temporal resolution and annotations needed to address this censoring effect. To fill this gap, we present FreshRetailNet-50K, the first large-scale benchmark for censored demand estimation. It comprises 50,000 store-product time series of detailed hourly sales data from 898 stores in 18 major cities, encompassing 863 perishable SKUs meticulously annotated for stockout events. The hourly stock status records unique to this dataset, combined with rich contextual covariates, including promotional discounts, precipitation, and temporal features, enable innovative research beyond existing solutions. We demonstrate one such use case of two-stage demand modeling: first, we reconstruct the latent demand during stockouts using precise hourly annotations. We then leverage the recovered demand to train robust demand forecasting models in the second stage. Experimental results show that this approach achieves a 2.73% improvement in prediction accuracy while reducing the systematic demand underestimation from 7.37% to near-zero bias. With unprecedented temporal granularity and comprehensive real-world information, FreshRetailNet-50K opens new research directions in demand imputation, perishable inventory optimization, and causal retail analytics. The unique annotation quality and scale of the dataset address long-standing limitations in retail AI, providing immediate solutions and a platform for future methodological innovation. The data (https://huggingface.co/datasets/Dingdong-Inc/FreshRetailNet-50K) and code (https://github.com/Dingdong-Inc/frn-50k-baseline) are openly released.

19.
arXiv (CS.LG) 2026-06-15

Multi-Variable Stellar Parameter Estimation Using Residual Multitask Neural Networks

arXiv:2606.13868v1 Announce Type: cross Abstract: We present an end-to-end pipeline for estimating stellar parameters from Sloan Digital Sky Survey Data Release 12 spectra using a fully connected multitask neural network with residual blocks, whose hyperparameters are tuned via Bayesian optimization. The preprocessing pipeline includes per-spectrum standardization, RobustScaler normalization of the target variables – effective temperature $T_{\mathrm{eff}}$, metallicity $[\mathrm{Fe/H}]$, and surface gravity $\log g$ – and data augmentation via Gaussian noise injection. On a held-out test set, the model achieved Mean Absolute Errors (MAE) of $59.76~\mathrm{K}$ for $T_{\mathrm{eff}}$, $0.103~\mathrm{dex}$ for $[\mathrm{Fe/H}]$, and $0.130~\mathrm{dex}$ for $\log g$. Normalized against the full-scale range of each parameter, these results represent range-normalized errors between $1\%$ and $3\%$, achieved with a highly efficient model complexity of approximately 540,000 trainable parameters. These results demonstrate that a compact residual multitask architecture, combined with principled signal preprocessing, provides a parameter-efficient solution for nonlinear parameter estimation in large-scale spectral datasets. In particular, the proposed model achieves competitive performance with substantially lower complexity than deeper neural network baselines.

20.
arXiv (CS.CL) 2026-06-15

SciDef: Datasets and Tools for Automated Definition Extraction from Scientific Literature with LLMs

Scientific concepts are often defined inconsistently across papers, making it difficult to compare findings, reuse terminology, and build reliable downstream resources. We present SciDef, a resource suite for scientific definition extraction. The suite contains DefExtra, a benchmark of 268 human-validated author-stated definitions from 75 academic papers; DefSim, 60 human-labeled definition-pair similarity judgments; and an open LLM-based pipeline for PDF preprocessing, chunking, definition extraction, prompt optimization, and evaluation. We validate the resources by benchmarking 16 language models across prompting strategies and chunking schemes. The strongest set-level configuration achieves a score of 0.397, while the highest-coverage configuration matches at least one prediction to 86.4% of gold definitions but over-generates candidate definitions. We further show that an NLI-based matching metric agrees strongly with human DefSim judgments. These results position SciDef as a reusable benchmark and tooling layer for definition-centric literature analysis, while highlighting relevance-aware filtering as the key bottleneck for fully automatic definition extraction. Code & datasets are available at https://github.com/Media-Bias-Group/SciDef.

21.
arXiv (CS.AI) 2026-06-19

CTS-MoE: Implicit Terrain Adaptation via Mixture-of-Experts for Perceptive Locomotion

arXiv:2606.19633v1 Announce Type: cross Abstract: Perceptive legged locomotion over discontinuous terrain (e.g., stairs, gaps, and obstacles) requires adaptive behavior, as a single conservative gait cannot produce the anticipatory maneuvers needed for abrupt topology changes. Cast as multi-task reinforcement learning, this problem introduces a tension between sharing and separation. Tasks use a common locomotion base but have conflicting rewards, so a policy must share behavior while avoiding value interference. Prior work addresses only one side, with monolithic policies sacrificing specialization and hierarchical sub-policies sacrificing generalization across transitions and unseen terrain. We propose CTS-MoE, which combines a dense mixture-of-experts actor with perception-based gating to compose shared behaviors and a multi-critic with task-specific value heads to prevent interference. The model is trained end-to-end in a single-stage concurrent teacher-student setup that handles partial observability and avoids sequential distillation, with task labels used only during training. At deployment, routing depends solely on perception, allowing terrain adaptation without a high-level selector or terrain classifier. Experiments on a Unitree Go1 in simulation and on hardware across seen and unseen terrains show task-aware specialization, with lower tracking error and higher success rates than monolithic baselines. Project Website: https://cts-moe.github.io/ .

22.
arXiv (CS.CV) 2026-06-19

LEAP: Layer-skipping Efficiency via Adaptive Progression for Vision Transformer Distillation

Vision Foundation Models (VFMs) with Vision Transformer (ViT) backbones, such as DINOv2, have become essential for downstream tasks like object recognition and semantic segmentation. The immense computational requirements of these backbones often necessitate distillation into smaller architectures for edge deployment. Feature-based knowledge distillation (KD) often suffers from the teacher-student gap: the student struggles to imitate the teacher's complex feature maps due to its limited capacity. To mitigate this bottleneck, we propose LEAP: Layer-skipping Efficiency via Adaptive Progression, a training curriculum for ViT feature-based knowledge distillation. By utilizing the teacher's intermediate feature maps as a sequence of progressively more difficult targets, our curriculum allows the student to build a foundational representation before tackling higher-level abstractions. Our results demonstrate that this paradigm significantly accelerates convergence through adaptive difficulty selection across various student model sizes and dataset scales. With our curriculum, the LEAP-distilled ViT-S achieves 90.1% accuracy on ImageNet-100, a +12.24% improvement compared with the baseline. On ImageNet-1K, LEAP achieves +3.84% and +7.75% improvement for the instance retrieval task on the Oxford and Paris datasets, respectively. Furthermore, the curriculum enables 25.1% savings in training FLOPs and 21% savings in training time on ImageNet-100 by implementing early-stopping for teacher inference during the initial stages of training. Code is available at https://github.com/KevinZ0217/LEAP

23.
arXiv (CS.LG) 2026-06-18

FinP: Fairness-in-Privacy in Federated Learning by Addressing Disparities in Privacy Risk

arXiv:2502.17748v4 Announce Type: replace Abstract: Federated Learning (FL) inherently mitigates mass data centralization risks; however, its privacy protections are not equally distributed, leaving vulnerable individuals disproportionately exposed to sophisticated privacy attacks. Crucially, statistical heterogeneity in human-centric FL environments often results in an inequitable distribution of privacy risks, particularly affecting those whose sensitive attributes or behaviors make them outliers. To address this critical gap, we introduce FinP, a novel framework designed to formalize and enforce fairness-in-privacy by mitigating disproportionate client vulnerability to Source Inference Attacks (SIA). FinP operationalizes a two-pronged defense strategy that tackles both the symptoms and root causes of privacy disparity, ensuring that no group of clients bears an excessive privacy burden. It combines a server-side adaptive aggregation mechanism, which dynamically weights client contributions based on their estimated privacy risk, with a client-side regularization technique to curb localized overfitting that drives unique data memorization. Extensive empirical evaluations on FEMNIST, Human Activity Recognition (HAR), and CIFAR-10 datasets demonstrate that FinP effectively aligns privacy fairness with primary task utility. Notably, FinP successfully mitigates SIA risks and reduces disparities in privacy exposure, establishing that strong fairness-in-privacy guarantees need not compromise model utility. Ultimately, FinP establishes equitable privacy protections by reducing vulnerability disparities by up to 57.14%, while preserving global model utility within a marginal +/- 1.75% of standard federated baselines.
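The server-side step, weighting client contributions by estimated privacy risk, might look roughly like the sketch below. This is not FinP's aggregation rule, which the abstract does not spell out. The function name `risk_weighted_average`, the softmax-over-negative-risk weighting, the direction of the weighting (higher risk gets a smaller weight), and the `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np


def risk_weighted_average(client_updates, risks, temperature=1.0):
    """Aggregate client updates with weights that shrink as estimated risk grows.

    client_updates: list of equally shaped 1-D arrays (one per client).
    risks: one estimated privacy-risk score per client (hypothetical input).
    Returns the aggregated update and the weights that were used.
    """
    risks = np.asarray(risks, dtype=float)
    logits = -risks / temperature       # assumed: higher risk -> lower weight
    logits -= logits.max()              # stabilise the exponentials
    weights = np.exp(logits)
    weights /= weights.sum()
    aggregated = weights @ np.stack(client_updates)
    return aggregated, weights
```

With equal risk scores this reduces to a plain unweighted average of the updates, so the risk term only changes the result when exposure actually differs between clients.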

24.
arXiv (CS.LG) 2026-06-11

Triangular-Reference Schrödinger Bridges for Time Series Generation

arXiv:2605.27478v3 Announce Type: replace-cross Abstract: Schrödinger bridges for time series (SBTS) generate synthetic paths by projecting, in relative entropy, a Brownian reference onto the path laws that match the joint distribution of the data on the observation grid. The Brownian reference, however, fixes the quadratic variation of the generated paths, which is restrictive when stochastic volatility, correlated noise, or rank-deficient covariance structures must be reproduced. We introduce "Triangular-Reference Schrödinger Bridges for Time Series" (TR-SBTS), which keeps the entropy-projection backbone of SBTS but replaces the Brownian reference by a triangular, volatility-informed, intervalwise frozen reference on a state augmented with latent covariance descriptors. The construction remains a single entropy projection on the augmented state: the minimiser is the \(h\)-transform of the reference, and on each frozen interval the optimal drift has the logarithmic-gradient form \(b^\star(t,x)=A\,\nabla\log H(t,x)\), intrinsic to the active covariance directions when the frozen covariance \(A\) is degenerate. We prove stability of the frozen approximation and consistency of the associated regularised kernel estimators, describe a reference-aware Nadaraya–Watson implementation of the conditional next-increment law, and evaluate the construction on numerical experiments.
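For readers unfamiliar with the estimator mentioned above, here is a plain Gaussian-kernel Nadaraya–Watson regression in one dimension, used to estimate the conditional mean of a next increment given the current state. It is the textbook estimator, not the reference-aware variant the paper describes. The function name, the Gaussian kernel, and the scalar `bandwidth` are assumptions made for this sketch.

```python
import numpy as np


def nadaraya_watson(x_query, x_train, y_train, bandwidth):
    """Kernel-weighted average of y_train at each query point.

    x_train: observed states, y_train: observed next increments,
    x_query: states at which the conditional mean is estimated.
    """
    x_query = np.asarray(x_query, dtype=float)
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Squared distances between every query point and every training point.
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    kernel = np.exp(-0.5 * d2 / bandwidth**2)       # (n_query, n_train)
    return (kernel @ y_train) / kernel.sum(axis=1)
```

A query point equidistant from two training states receives the average of their increments, and a smaller `bandwidth` makes the estimate follow the nearest observations more closely.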

25.
arXiv (CS.CV) 2026-06-17

OpenTie: Open-vocabulary Sequential Rebar Tying System

Robotics on construction sites is attracting growing attention because of its ability to tackle complex challenges, especially in scenarios involving rebar. Most existing products and research depend on collecting large amounts of data for model training. To fill this gap, we propose OpenTie, a training-free 3D rebar tying framework that combines RGB-to-point-cloud generation with open-vocabulary rebar detection, evaluated in a real-world test. We implement OpenTie on a robotic arm with a binocular camera and achieve high accuracy by applying a prompt-based object detection method to images filtered by our proposed post-processing procedure for the image-to-point-cloud generation framework. Our pipeline requires no training and outperforms training-based object detection, i.e., a YOLO-based method, as verified in a real-world sequential rebar tying test. The system handles both horizontal and vertical rebar tying tasks and has potential for application on real construction sites, with the possibility of commercialization.