Academic Intelligence · Curated Daily

Explore the Frontiers of Global Scholarship

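AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefings.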

01.
PLOS Medicine 2026-05-11

Connected or chained by social media? Child and adolescent mental health in a digital era

Author: Silja Kosola

Social media has evolved from connection to compulsion, disproportionately harming children and adolescents. Addictive designs together with developmental vulnerability fuel mental health risks and highlight the urgent need for stricter age limits and stronger protections. In this Perspective, Silja Kosola outlines how social media disproportionately harms child and adolescent mental health, and argues that while recent policy changes aimed at protecting youth from social media are welcome, stricter age limits and greater accountability of social media companies are needed.

02.
arXiv (CS.LG) 2026-06-15

FreshRetailNet-LT: A Stockout-Annotated Censored Demand Dataset for Latent Demand Recovery and Forecasting in Fresh Retail

arXiv:2505.16319v4 Announce Type: replace Abstract: Accurate demand estimation is critical for the retail business in guiding the inventory and pricing policies of perishable products. However, it faces fundamental challenges from censored sales data during stockouts, where unobserved demand creates systemic policy biases. Existing datasets lack the temporal resolution and annotations needed to address this censoring effect. To fill this gap, we present FreshRetailNet-50K, the first large-scale benchmark for censored demand estimation. It comprises 50,000 store-product time series of detailed hourly sales data from 898 stores in 18 major cities, encompassing 863 perishable SKUs meticulously annotated for stockout events. The hourly stock status records unique to this dataset, combined with rich contextual covariates, including promotional discounts, precipitation, and temporal features, enable innovative research beyond existing solutions. We demonstrate one such use case of two-stage demand modeling: first, we reconstruct the latent demand during stockouts using precise hourly annotations. We then leverage the recovered demand to train robust demand forecasting models in the second stage. Experimental results show that this approach achieves a 2.73% improvement in prediction accuracy while reducing the systematic demand underestimation from 7.37% to near-zero bias. With unprecedented temporal granularity and comprehensive real-world information, FreshRetailNet-50K opens new research directions in demand imputation, perishable inventory optimization, and causal retail analytics. The unique annotation quality and scale of the dataset address long-standing limitations in retail AI, providing immediate solutions and a platform for future methodological innovation. The data (https://huggingface.co/datasets/Dingdong-Inc/FreshRetailNet-50K) and code (https://github.com/Dingdong-Inc/frn-50k-baseline) are openly released.

03.
arXiv (CS.CV) 2026-06-16

WaveDINO: Learning-Based Atmospheric Correction of Unwrapped InSAR Interferograms Validated by GNSS: Results at Laguna del Maule and Campi Flegrei Volcanoes

Interferometric Synthetic Aperture Radar (InSAR) enables effective monitoring of volcanic deformation; however, the observed signals are often corrupted by atmospheric phase delays, seasonal surface changes, and decorrelation effects. Existing atmospheric correction methods, such as numerical weather model-based methods, can reduce these effects but do not consistently remove atmospheric artefacts and may introduce residual biases. To address these limitations, we propose a novel learning-based method for denoising unwrapped InSAR interferograms, using a hybrid training strategy that combines physically motivated synthetic deformation with real atmospheric noise. Specifically, we introduce WaveDINO, a wavelet-based multi-scale denoising framework conditioned on frozen DINOv3 foundation-model features and terrain information. Training uses synthetic magma-source deformation superimposed on short-term interferograms to expose the network to realistic atmospheric statistics while retaining known ground truth. Performance is evaluated on both controlled synthetic data and long-term real interferograms from Laguna del Maule (Chile) and Campi Flegrei (Italy), with independent GNSS measurements used for validation. WaveDINO consistently outperforms competing models, improving agreement with GNSS measurements, and reducing mean GNSS misfit by approximately 3% and 19% at two sites, respectively, while surpassing weather-model-based corrections.

04.
arXiv (CS.AI) 2026-06-17

Combating Data Laundering in LLM Training

arXiv:2604.01904v3 Announce Type: replace-cross Abstract: Post-hoc unauthorized-training data detection for large language models (LLMs) typically assumes a query-with-originals regime: rights holders query a target LLM with raw proprietary data and assess whether the model assigns them stronger memorization-based detection signals, e.g., higher confidence or lower loss, than held-out non-training reference texts. We show that this regime becomes brittle under data laundering, where the target LLM is trained on semantics-preserving but stylistically or structurally transformed surrogates of proprietary data to obfuscate provenance. Since training-time exposure occurs in the laundered form, memorization signals may no longer appear on the originals, collapsing the candidate-reference signal separation that standard detectors rely on. We counter this threat by studying laundering-aware detection with raw proprietary data, a held-out reference corpus, and query access to the target LLM, while the laundering transformation is undisclosed. Since exact recovery of the laundered corpus is infeasible, we infer a detection-useful synthesis process via an auxiliary LLM that maps originals into training-like queries. To make this search tractable, we introduce Synthesis Data Reversion (SDR), which constrains the unbounded space of natural-language transformations through a goal-details abstraction: a high-level transformation goal, e.g., "lyrical rewriting", and fine-grained details, e.g., "with vivid imagery". SDR identifies the most likely goal and iteratively refines details so synthesized queries elicit stronger target-model detection signals. Evaluated on the MIMIR benchmark against diverse laundering practices and target LLM families (Pythia, Llama2, and Falcon), SDR consistently restores detection signals, offering a practical auditing layer against data laundering.

05.
arXiv (CS.CL) 2026-06-11

Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models

While Text-to-Speech (TTS) systems enable emotional control via natural-language instructions, expressiveness, naturalness, and speech quality degrade when the target emotion conflicts with the textual semantics. We propose a Cross-modal Consistency Guided Classifier-Free Guidance (CCG-CFG) method with dynamic scales based on the degree of inconsistency between the text emotion and the explicit speech emotion, replacing the dropout condition with the text emotion. We also distill the CCG-CFG guidance signal using a hard-sample mining strategy, improving the TTS model's emotional alignment capability. Evaluations on five emotional corpora and two TTS benchmarks show that our approaches applied to CosyVoice2 achieve up to a 12% absolute improvement in emotion-recognition accuracy and a 10% relative improvement in subjective scores, outperforming baselines including HierSpeech++, Qwen3-TTS, and original CosyVoice2, while preserving intelligibility, naturalness, and high speech quality.

06.
medRxiv (Medicine) 2026-06-19

The Impact of Pregnant Women's Dietary Behavior on the Physiological Adaptation Paradox and Maternal-Fetal Resource Conflict in Conflict Settings: A Predictive Analytical Study

This scientific study aims to assess the level of awareness, nutritional knowledge, and actual behavioral practices among pregnant women in the Capital District of Sanaa, Republic of Yemen, and to determine their impact on the health and clinical indicators of the mother and fetus under complex conflict conditions. The study employed a descriptive-analytical approach based on a simple random sample of 200 pregnant women attending government-run hospitals and specialized medical centers in the Capital District. Field data were collected during December 2025 using a structured and validated questionnaire consisting of 42 items measuring demographic variables, awareness, practices, barriers, and health outcomes. The results of the statistical analysis using SPSS software showed a high level of nutritional awareness (87%) and healthy dietary practices (80%) among the sample participants. Simple and multiple linear regression tests revealed a statistically significant effect of awareness and practices in explaining 20.2% of the variance in the health status of the mother and fetus (R² = 0.204, p < 0.001). The study demonstrated that actual behavioral practices have greater predictive power (β = 0.316, p = 0.001) compared to theoretical cognitive awareness (β = 0.232, p = 0.005) in determining clinical outcomes for the mother and fetus, highlighting the widening gap between knowledge and behavior under structural pressures. "Morning sickness" (80%) and the deterioration of "family economic status" (71%) emerged as the greatest physiological and material barriers to proper nutrition. With their inferential impact established as an extension of the maternal-fetal resource allocation conflict in a physiologically and economically challenging environment, the study also identified significant differences in nutritional behavior and health outcomes in favor of housewives and mothers who are more educated and have higher incomes, while no significant differences were recorded attributable to obstetric variables such as stage or order of pregnancy. The study offers a unique theoretical and practical contribution by formulating an integrated causal model that demonstrates that the fetus acts as a biological drain on the mother's cellular and mineral reserves in a war environment, which necessitates directing antenatal care and support programs toward effective behavioral empowerment and nutritional support to overcome the structural and material barriers faced by pregnant women.

07.
arXiv (CS.CL) 2026-06-11

FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an unsatisfying trade-off. Task-level methods spend substantial offline compute yet deploy only a single workflow, leaving complementary candidates unused, while query-level methods synthesize a new workflow per query at substantial inference cost. Our motivating analysis shows these paradigms are more complementary than competing: workflows discovered during offline search often solve different subsets of queries, and many queries handled by expensive query-level generation can already be solved by cheaper precomputed workflows. This suggests a different objective: rather than searching for one universally best workflow or regenerating one per instance, we should build a compact bank of reusable, complementary workflows and select among them adaptively at inference time. Doing so requires solving three coupled problems: generating complementary rather than redundant candidates, compressing them into a small deployable portfolio, and assigning each query to the right workflow under a performance-cost trade-off. To this end, we present FlowBank, a three-stage framework for portfolio-based agentic workflow optimization. Diversifying proposes DiverseFlow to steer search toward under-covered queries and produce a high-coverage candidate pool. Curating proposes CuraFlow to compress this pool into a compact portfolio with minimal redundancy. Matching casts deployment as edge-value prediction on a query-workflow bipartite graph and routes each incoming query to the portfolio member with the best predicted utility. Across five benchmarks, FlowBank achieves the highest average score among the evaluated methods while remaining cost-competitive, improving over the strongest automated and handcrafted baselines by 4.26% and 14.92% relative, respectively.

08.
arXiv (CS.LG) 2026-06-16

GradPower: Powering Gradients for Faster Language Model Pre-Training

arXiv:2505.24275v4 Announce Type: replace Abstract: We propose GradPower, a lightweight gradient-transformation technique for accelerating language model pre-training. Given a gradient vector $g=(g_i)_i$, GradPower first applies the elementwise sign-power transformation: $\varphi_p(g)=(sign(g_i)|g_i|^p)_{i}$ for a fixed $p>0$, and then feeds the transformed gradient into a base optimizer. Notably, GradPower requires only a single-line code change and no modifications to the base optimizer's internal logic, including the hyperparameters. When applied to Adam (termed AdamPower), GradPower consistently achieves lower terminal loss across diverse architectures (LLaMA, Qwen2MoE), parameter scales (66M to 2B), datasets (C4, OpenWebText), and learning-rate schedules (cosine, warmup-stable-decay). The most pronounced gains are observed when training modern mixture-of-experts models with warmup-stable-decay schedules. GradPower also integrates seamlessly with other state-of-the-art optimizers, such as Muon, yielding further improvements. Finally, we provide theoretical analyses that reveal the underlying mechanism of GradPower and highlight the influence of gradient noise.
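
The sign-power transformation in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the plain gradient-descent step standing in for the base optimizer, the function names, and the values of `p` and `lr` are all assumptions made for the example.

```python
import numpy as np

def sign_power(grad, p):
    """Elementwise sign-power transform: sign(g_i) * |g_i|**p, for p > 0."""
    return np.sign(grad) * np.abs(grad) ** p

def descent_step(params, grad, lr=0.1, p=1.2):
    # Hypothetical base optimizer: plain gradient descent.
    # The only change from a standard step is wrapping grad in sign_power.
    return params - lr * sign_power(grad, p)

g = np.array([-4.0, 0.0, 0.25, 1.0])
print(sign_power(g, 0.5))   # [-2.   0.   0.5  1. ]
print(sign_power(g, 1.0))   # p = 1 leaves the gradient unchanged
```

With `p < 1` small-magnitude entries are enlarged relative to large ones, and with `p > 1` the opposite happens; the sign of every entry is preserved either way.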

09.
medRxiv (Medicine) 2026-06-16

Language fMRI lateralization success and head motion in pediatric epilepsy patients with ADHD, and improvements based on fMRI task training

Introduction Language functional MRI (fMRI) is a valuable tool for presurgical planning in epilepsy. Functional MRI can be challenging in children, and head motion can compromise its utility. The candidacy of patients with ADHD for fMRI is sometimes queried regarding concerns about possible head motion. In 2020, we implemented an fMRI task training program, via telehealth and/or mock MRI. We aimed to determine whether training increased language lateralisation success and/or reduced head motion in all patients, and in those with ADHD. We also aimed to determine whether patients with ADHD exhibited more head motion during fMRI than those without ADHD. Methods We retrospectively identified 223 epilepsy (85%) and other neurosurgery patients (241 scans including repeats) with language fMRI at Royal Children's Hospital, Melbourne, Australia, 2016-2024. There were 24 individuals with ADHD listed in the Electronic Medical Record, five of whom had diagnoses of both ADHD and autism; and nine with autism. Language lateralisation success was determined by clinician description recorded as left/right/bilateral in the medical record. Ninety-nine patients were provided the training, including fMRI task practice. Head motion was quantified by maximum Framewise Displacement (FDmax; mm). Results ADHD was associated with lower language lateralisation success. Training was associated with greater language lateralisation success, across all patients, and in those with ADHD. Regarding ADHD and head motion, outliers in FDmax were seen in 5 young patients with ADHD. Data were trimmed to allow separate investigation of FDmax for the sample with and without extremes of head motion. In untrimmed data, FDmax was significantly higher in patients with ADHD than in those without. In trimmed data, FDmax was on average lower in patients with ADHD than those without; however, this was not statistically supported. Regarding training and head motion, across all patients, FDmax was significantly lower for scans with training than without. In patients with ADHD, FDmax was on average lower for scans with training; however, training was not associated with FDmax. Conclusions Language fMRI training was associated with higher language lateralization success, particularly in patients with ADHD. Training was associated with reduced head motion across all patients. Although some young patients with ADHD had substantial head motion, most in our sample did not move more than those without ADHD. We conclude that the training program increases success of language fMRI, and that an ADHD diagnosis should not be a contraindication to language fMRI.

10.
medRxiv (Medicine) 2026-06-24

Durability and Seasonal Variation in the Effectiveness of Nirsevimab over Three Seasons in Connecticut

Background Nirsevimab has been widely administered in the United States since 2023 to protect infants and young children from severe disease caused by respiratory syncytial virus (RSV). Although early post-licensure studies have shown high effectiveness against medically attended RSV infection, uncertainty remains about the durability of protection, effectiveness beyond the first RSV season, and the extent to which changing RSV seasonality influences real-world effectiveness. Objective To estimate the effectiveness of nirsevimab against medically attended RSV infection across three consecutive RSV seasons and to examine how effectiveness varies by season and time since immunization. Methods We conducted a test-negative case-control study utilizing electronic health records of infants and young children tested for RSV by polymerase chain reaction in outpatient and inpatient settings within the Yale New Haven Health System between October 1, 2023, and March 1, 2026. Effectiveness of nirsevimab was estimated using multivariable logistic regression, adjusting for age, weekly RSV activity, pre-existing risk factors, and other potential confounders. Variation in effectiveness was examined by season, encounter setting, and time since immunization up to 24 months. Results Overall, 17,755 infants and young children were tested for RSV infection, of whom 2,388 (13.4%) were cases and 15,367 (86.6%) were controls. The overall effectiveness of nirsevimab was 67.3% (95% confidence interval [CI]: 59.8, 73.3%) against all medically-attended RSV infections, 60.2% (95% CI: 49.6, 68.5%) against RSV-associated outpatient visits, and 88.9% (95% CI: 82.3, 93.0%) against RSV-associated hospitalization. Effectiveness against medically attended RSV infection declined across seasons, from 76.7% (95% CI: 60.5, 86.3%) in 2023/24 to 54.4% (95% CI: 33.0, 68.9%) in 2025/26. Lower season-specific effectiveness in later seasons corresponded with progressively delayed RSV activity over. 
Protection against RSV-associated hospitalization declined with increasing time since immunization, from 92.5% (95% credible interval [CrI]: 85.9, 96.4%) at 1 month, to 77.2% (95% CrI: 60.4, 87.6%) at 6 months, and 39.9% (95% CrI: 2.4, 63.3%) at 12 months post-immunization, after which effectiveness plateaued. Conclusions Nirsevimab remained effective against RSV-associated hospitalization through 6 to 12 months after immunization. Delayed RSV activity was associated with lower effectiveness, highlighting the importance of aligning administration with local RSV circulation.
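
In a test-negative design, effectiveness is commonly reported as one minus the odds ratio of immunization among cases versus controls. The sketch below shows only that arithmetic on an unadjusted 2x2 table. The counts are hypothetical and are not taken from the study, which estimates the odds ratio from multivariable logistic regression with adjustment for confounders.

```python
def effectiveness_percent(cases_imm, cases_unimm, controls_imm, controls_unimm):
    """Unadjusted effectiveness (%) from a 2x2 table: (1 - odds ratio) * 100."""
    odds_cases = cases_imm / cases_unimm            # odds of immunization among test-positives
    odds_controls = controls_imm / controls_unimm   # odds of immunization among test-negatives
    odds_ratio = odds_cases / odds_controls
    return (1.0 - odds_ratio) * 100.0

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
print(effectiveness_percent(100, 400, 1000, 2000))  # 50.0
```

An odds ratio of 1 gives 0% effectiveness, and an odds ratio above 1 gives a negative value.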

11.
arXiv (CS.LG) 2026-06-25

Sample complexity of unbalanced entropic OT

arXiv:2606.24987v1 Announce Type: cross Abstract: Optimal transport (OT) has become a central language for comparing probability measures, but exact balanced OT is often both too rigid for data with missing, created, or destroyed mass and subject to unfavorable high-dimensional sample complexity. Entropic regularization and unbalanced relaxations address these limitations in complementary ways. Entropy smooths the geometry, improves statistical behavior, and enables fast Sinkhorn-type algorithms, while unbalanced marginal penalties replace hard conservation constraints by divergence terms adapted to noisy empirical data. This paper studies the sample complexity of entropic unbalanced OT at the level of the optimal coupling, rather than only the scalar transport value. We develop a translation-invariant dual formulation, prove compactness and strong convexity properties for the intrinsic dual variables, and convert these geometric estimates into high-probability finite-sample bounds for empirical couplings. The results clarify why regularization is a practical necessity in machine learning applications: it softens the curse of dimensionality, reduces the number of samples needed for stable transport estimation, and keeps the resulting estimators compatible with scalable Sinkhorn-type solvers.
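
For readers unfamiliar with the Sinkhorn-type solvers mentioned above, here is a minimal sketch of the scaling iterations often used for entropic unbalanced OT. It assumes the marginal penalties are Kullback-Leibler divergences with weight `rho`; the abstract speaks only of divergence terms, so that choice, the function name, and the toy data are all assumptions for illustration.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, cost, eps, rho, n_iters=1000):
    """Coupling for entropic OT with KL marginal penalties (illustrative sketch)."""
    kernel = np.exp(-cost / eps)      # Gibbs kernel from the cost matrix
    tau = rho / (rho + eps)           # tau -> 1 approaches balanced Sinkhorn
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (kernel @ v)) ** tau     # softened row-marginal update
        v = (b / (kernel.T @ u)) ** tau   # softened column-marginal update
    return u[:, None] * kernel * v[None, :]

# Hypothetical toy data: two small point clouds on the unit interval.
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
cost = (x[:, None] - y[None, :]) ** 2
a = np.full(5, 1 / 5)
b = np.full(4, 1 / 4)

soft = unbalanced_sinkhorn(a, b, cost, eps=0.5, rho=1.0)   # marginals only loosely enforced
hard = unbalanced_sinkhorn(a, b, cost, eps=0.5, rho=1e6)   # nearly balanced
print(soft.sum(), hard.sum())
```

With a small `rho` the coupling's marginals may drift away from `a` and `b`, which is what lets mass be created or destroyed; with a very large `rho` the iterations behave like ordinary Sinkhorn.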

12.
arXiv (quant-ph) 2026-06-25

Recursive QLSTM with Dynamic Variational Quantum Circuit Adaptation

arXiv:2606.24932v1 Announce Type: new Abstract: Recent advances in quantum computing and machine learning have motivated the development of quantum models for sequential data processing. In this paper, we propose a Recursive Quantum Long Short-Term Memory model, or Recursive QLSTM, which extends QLSTM through metacore-based recursive constructions. We numerically test the model under different input sequence lengths, metacore designs, and recursive rules, and identify the best-performing architecture among these variants. For this selected model, we further provide theoretical arguments explaining why its recursive structure improves temporal information propagation and enhances learning performance. Our results suggest that Recursive QLSTM offers a flexible and effective framework for quantum recurrent learning over input time series of various lengths.

13.
arXiv (CS.LG) 2026-06-19

Representing Piecewise-Linear Functions by Functions with Minimal Arity

arXiv:2406.02421v2 Announce Type: replace-cross Abstract: Any continuous piecewise-linear function $F\colon \mathbb{R}^{n}\to \mathbb{R}$ can be represented as a linear combination of $\max$ functions of at most $n+1$ affine-linear functions. In our previous paper [``Representing piecewise linear functions by functions with small arity'', AAECC, 2023], we showed that this upper bound of $n+1$ arguments is tight. In the present paper, we extend this result by establishing a correspondence between the function $F$ and the minimal number of arguments that are needed in any such decomposition. We show that the tessellation of the input space $\mathbb{R}^{n}$ induced by the function $F$ has a direct connection to the number of arguments in the $\max$ functions.
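
As a small illustration of the representation described above (not taken from the paper), take $n = 2$ and $F(x, y) = \max(x, \min(y, 0))$. The elementary identity $\max(a, \min(b, c)) = \max(a, b) + \max(a, c) - \max(a, b, c)$ rewrites it as a linear combination of $\max$ functions, each with at most $n + 1 = 3$ affine-linear arguments:

```latex
F(x, y) = \max\bigl(x, \min(y, 0)\bigr)
        = \max(x, y) + \max(x, 0) - \max(x, y, 0).
```

This example makes no claim about minimality; whether fewer arguments would suffice for a given $F$ is the question the abstract's correspondence addresses.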

14.
PLOS Computational Biology 2026-06-18

scMagnifier: Resolving fine-grained cell subtypes via GRN-informed perturbations and consensus clustering

Authors: Zhenhui He, Dong Kangning

Resolving fine-grained cell subtypes in single-cell RNA sequencing (scRNA-seq) data remains challenging, as their subtle transcriptional differences are often obscured by technical noise and data sparsity. Here, we present scMagnifier, a consensus clustering framework that leverages gene regulatory network (GRN)-informed in silico perturbations to amplify subtle transcriptional differences and uncover latent cell subpopulations. scMagnifier perturbs candidate transcription factors (TFs), propagates perturbation effects through cluster-specific GRNs to simulate post-perturbation expression profiles, and integrates clustering results across multiple perturbations into stable subtype assignments. Additionally, scMagnifier introduces regulatory perturbation consensus UMAP (rpcUMAP), a perturbation-aware visualization that provides clearer separation between cell subtypes and guides the selection of the optimal number of clusters. In both single-batch and multi-batch benchmarks, scMagnifier consistently improves the resolution and accuracy of fine-grained cell type identification. Notably, when integrated with spatial clustering methods such as STAGATE, scMagnifier is compatible with spatial transcriptomics workflows and effectively reveals tumor cell subtypes and their spatial organization in ovarian cancer.

15.
arXiv (CS.LG) 2026-06-19

Quantile of Means: A Bonus-Free Ensemble Method for Minimax Optimal Reinforcement Learning

arXiv:2606.20107v1 Announce Type: new Abstract: Optimal Reinforcement Learning (RL) algorithms typically rely on carefully constructed count-based uncertainty estimates to drive exploration. Although theoretically sound, such estimates are hard to compute in practical settings and therefore offer limited insight for designing exploration heuristics. Meanwhile, ensembling has emerged as a practical approach, but remains without theoretical justification. Building on a recent ensemble-based method for Multi-Armed Bandits, we propose a quantile-based ensemble method for finite-horizon Markov Decision Processes (MDPs). Our simple count-free approach achieves optimal variance-dependent regret bounds, providing theoretical grounding for ensemble-based exploration in RL.

16.
arXiv (CS.CV) 2026-06-25

Benchmarking the Alignment of Data-Quality Metrics, Human Judgment and Land-Cover Segmentation Performance for Earth Observation

Volume and quality of datasets are crucial for deep learning model training, yet they are often constrained by availability and data acquisition costs. Synthetic data augmentation can extend existing datasets with realistic images, and the quality of these images is generally assessed through fidelity metrics such as FID, KID, IS, LPIPS and SSIM that measure structural or distributional similarity. However, such metrics, including the widely used FID, focus on visual fidelity without reflecting downstream utility, and can diverge from human perception under perturbations that are imperceptible to human observers. In this work, we systematically evaluate Earth observation datasets alongside synthetic counterparts generated by deep generative models, comparing automatic metrics against human perception and downstream tasks. Our results reveal a stark misalignment: semantics-preserving perturbations such as rotation drastically alter metric scores while leaving human recognition unaffected, and synthetic samples that score poorly on automatic metrics achieve comparable or higher perceived realism, and can improve downstream performance when combined with real data. By benchmarking semantic segmentation models trained on mixed real-synthetic datasets, we demonstrate that quality metrics rooted in ImageNet-pretrained feature spaces are unreliable indicators for geospatial data. Our findings underscore that automatic quality evaluation of synthetic datasets should be grounded in downstream task performance and human evaluation.

17.
arXiv (quant-ph) 2026-06-19

Quantum Entanglement Degree, Mean Positronium Lifetime, and the $3\gamma$/$2\gamma$ Annihilation-Rate Ratio as Novel PET Biomarkers for Hypoxia – Concept, Challenges, and Predictions

arXiv:2605.00021v3 Announce Type: replace-cross Abstract: This manuscript introduces a novel method to assess tissue oxygen concentration via the quantum entanglement (QE) of photons originating from positronium which is produced within the patient's body during positron emission tomography. We also investigate the possibility of assessing hypoxia by simultaneously detecting positronium lifetime and the positronium decay rate ratio. We introduce two distinct quantum sensing approaches. Method 1 utilizes the correlation between oxygen concentration and ortho-positronium (o-Ps) decay rates, relying on the simultaneous measurement of the mean o-Ps lifetime ($\tau_{\mathrm{oPs}}$) and the $3\gamma$-to-$2\gamma$ annihilation rate ratio of o-Ps ($R_{\mathrm{oPs-3\gamma/2\gamma}}$). Method 2 introduces a novel hypothesis: that the degree of QE is sensitive to the relative contribution of annihilation mechanisms (pick-off vs. conversion), which in turn depends on oxygen concentration. We derive a formula for partial pressure of oxygen ($p\mathrm{O}_2$) as a function of $R_{\mathrm{oPs-3\gamma/2\gamma}}$ and $\tau_{\mathrm{oPs}}$ and estimate the measurement accuracy required for these parameters - and for the degree of QE - to sense in-vivo oxygen pressure in the range between hypoxic and physoxic conditions. Theoretical models and quantitative estimates for $R_{\mathrm{oPs-3\gamma/2\gamma}}$, $\tau_{\mathrm{oPs}}$ and for the degree of QE ($C_{\mathrm{QE}}$ ) as a function of $p\mathrm{O}_2$ are provided for water, isopropanol, cyclohexane, isooctane, and adipose tissue. In particular, applying the formulas derived under the working hypothesis that in pick-off process the photons are not entangled, we estimated that for $p\mathrm{O}_2 = 0$, the degree of quantum entanglement $C_{\mathrm{QE}}$ is equal to 0.890 for adipose, 0.886 for isopropanol, 0.867 for water, 0.818 for cyclohexane, and 0.784 for isooctane.

18.
arXiv (CS.CL) 2026-06-18

Lost in a Single Vector: Improving Long-Document Retrieval with Chunk Evidence Aggregation

Dense retrieval ranks one query vector against one document vector. On long documents, this interface can fail when a short but decisive span is weakened during document encoding before ranking. We study this failure mode as document-side early compression and introduce the Evidence Dilution Index (EDI) to measure how far a document-level representation falls below the strongest chunk-level evidence within the same gold document. Guided by this view, we propose DICE (Document Inference via Chunk Evidence), a training-free document-side strategy that splits documents into chunks, encodes them independently with a frozen model, and aggregates them back into a single vector while preserving the standard one-query-one-document interface. On LongEmbed, DICE improves retrieval across four backbones, with the largest gains on slices beyond 4k tokens: for Dream, Passkey >4k rises from 30.0 to 90.0 and Needle >4k from 23.3 to 74.0. Across 12,779 filtered samples, DICE yields lower EDI than the single-vector baseline in 92.8% of cases. These results establish document-level encoding as a practical and underexplored lever for long-document retrieval.
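
The dilution effect described above can be illustrated with toy vectors. This sketch rests on stated assumptions: the gap is computed as the strongest chunk-level cosine similarity minus the document-level similarity, which is one plausible reading of the abstract and not the paper's exact EDI definition; mean pooling stands in for a single-vector document encoder and is not DICE's aggregation rule; all names and vectors are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def split_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

def evidence_gap(query_vec, doc_vec, chunk_vecs):
    # Assumed measure: best chunk similarity minus document-level similarity.
    best_chunk = max(cosine(query_vec, c) for c in chunk_vecs)
    return best_chunk - cosine(query_vec, doc_vec)

# Toy example: one decisive chunk among two off-topic chunks.
query = np.array([1.0, 0.0])
chunks = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 1.0])]
doc = np.mean(chunks, axis=0)   # stand-in for a single document vector

print(split_into_chunks(list(range(5)), 2))   # [[0, 1], [2, 3], [4]]
print(evidence_gap(query, doc, chunks))       # positive: the decisive chunk is diluted
```

A gap near zero would mean the document vector retains the strongest chunk's evidence; a large positive gap means the decisive span was weakened before ranking.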

19.
arXiv (CS.LG) 2026-06-24

A Fast and Effective Method for Euclidean Anticlustering: The Assignment-Based-Anticlustering Algorithm

arXiv:2601.06351v2 Announce Type: replace Abstract: Anticlustering is an NP-hard combinatorial optimization problem that consists of partitioning a set of objects into equal-sized groups called anticlusters such that the objects in the same anticluster are as dissimilar as possible and thereby representative of the entire set of objects. Here we study the case where the dissimilarity metric is the squared Euclidean distance between the respective feature vectors. Applications of Euclidean anticlustering include social studies, cross-validation, creating mini-batches for stochastic gradient descent, and finding balanced K-cut partitions. In particular, machine-learning applications such as mini-batch generation involve million-scale datasets and very large values of K, making scalable anticlustering algorithms essential. We propose a new algorithm, the Assignment-Based Anticlustering (ABA) algorithm, that scales to instances with millions of objects and hundreds of thousands of anticlusters within seconds to minutes, which is far beyond what existing anticlustering methods can manage. We demonstrate here, via an extensive computational study, that our algorithm outperforms existing anticlustering methods in both solution quality and running time. This is so also for anticlustering with categories. For the related problem of balanced K-cut partitioning, our algorithm is superior to the well-known METIS method. The code of our algorithm is available on GitHub.

20.
arXiv (math.PR) 2026-06-24

Sim-to-Real Betting on the E-Process: Bringing "simulators" to anytime-valid confidence sequences

arXiv:2606.24038v1 Announce Type: cross Abstract: This note describes an integration of the sim-to-real performance estimate with betting (from Chen et al.) and the safe anytime-valid inference (from Ramdas et al.). Using the scaled simulators, the method produces efficient, reliable certificates for the mean estimate, an approach that is especially valuable in robot performance testing. This note gives a primary, self-contained account of the construction; preliminaries of the respective methods are kept at a minimum, and one shall refer to the original works for full detail. Some synthetic examples demonstrating the proposed algorithm can be found at https://github.com/ISUSAIL/Bet4Sim2Real-EProcess.

21.
arXiv (CS.CL) 2026-06-24

Mind the Heads: Topological Representation Alignment for Multimodal LLMs

Representation alignment has emerged as an effective approach to improve Multimodal Large Language Models (MLLMs) by regularizing their internal representations toward those of an external vision encoder. However, existing methods typically align a fixed layer of the language backbone, overlooking the fine-grained structure of Transformer models. In this work, we propose Head-Wise Representation Alignment (HeRA), a method that enforces cross-modal alignment at the level of individual attention heads. Our approach is grounded in the Platonic Representation Hypothesis, focusing on preserving the topological structure of representations (i.e., their local neighborhood relationships) across modalities. Following the Mutual K-Nearest Neighbor (MKNN) alignment metric, we introduce a contrastive objective that acts as a differentiable proxy for matching local structures. HeRA applies this objective during multimodal training to specific attention heads in the LLM, selected by their alignment score according to the MKNN metric. Counterintuitively, we find that aligning the least aligned heads yields the largest gains. Extensive evaluations across multiple MLLMs and 18 benchmarks demonstrate that HeRA consistently improves performance on challenging vision-centric tasks and serves as an effective regularizer against visual hallucinations by naturally curbing the over-reliance on linguistic priors. Our code is publicly released.
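
As a rough illustration of the kind of neighborhood-overlap score the abstract refers to, the sketch below computes one common formulation of a mutual k-nearest-neighbor alignment score between two sets of representations of the same samples. The use of Euclidean distance, the function names, and the random data are assumptions made for illustration. This is not HeRA's contrastive objective or its head-selection procedure.

```python
import numpy as np

def knn_indices(Z, k):
    """Indices of the k nearest neighbors of each row of Z, excluding itself."""
    d = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def mutual_knn_score(A, B, k):
    """Average fraction of shared k-nearest neighbors between spaces A and B.

    A and B hold representations of the same samples (row i in A matches
    row i in B); their feature dimensions may differ.
    """
    na, nb = knn_indices(A, k), knn_indices(B, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(na, nb)]
    return float(np.mean(overlaps))

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 8))
same_structure = mutual_knn_score(A, 2.0 * A, k=5)                 # rescaled copy of A
unrelated = mutual_knn_score(A, rng.normal(size=(50, 16)), k=5)    # independent features
```

A uniformly rescaled copy keeps every neighborhood intact and so scores 1.0, while independent features are expected to score much lower. Because the score depends on a hard neighbor selection, it is not differentiable, which is presumably why the abstract describes a differentiable proxy.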

22.
arXiv (CS.LG) 2026-06-16

Exploding and vanishing gradients in deep neural networks: the effect of residual connections

arXiv:2606.17013v1 Announce Type: cross Abstract: The well-known phenomenon of exploding and vanishing gradients in deep neural networks is analyzed using multiplicative ergodic theory. The effect of adding a residual connection is explained in this context. Specifically, a characterization of Liapunov exponents due to Furstenberg and Kifer is exploited in order to make a precise statement about the Liapunov spectrum and the effect of residual connections on it.

23.
arXiv (CS.AI) 2026-06-16

FlowState: Sampling-Rate-Equivariant Time-Series Forecasting

arXiv:2508.05287v3 Announce Type: replace-cross Abstract: Existing time series foundation models (TSFMs), often based on transformer variants, lack adaptability to different sampling rates, struggle with generalization across varying context and target lengths, and are computationally inefficient. We introduce FlowState, a novel TSFM architecture that achieves sampling-rate-equivariant forecasting through a unified design that pairs a state space model (SSM) encoder with a functional basis decoder (FBD). This design enables continuous-time modeling and dynamic time-scale adjustment, allowing FlowState to inherently generalize across all possible temporal resolutions, and dynamically adjust the forecasting horizons without retraining. We further propose an efficient pretraining strategy that improves robustness and accelerates training. Despite being one of the smallest TSFMs, FlowState achieves state-of-the-art results on the widely used GIFT-Eval benchmark, while demonstrating superior adaptability to unseen sampling rates. Our detailed analyses confirm the effectiveness of its components, and we demonstrate its unique ability to adapt to varying input sampling rates.

24.
arXiv (CS.AI) 2026-06-16

A Security Analysis of Long-Horizon Agentic AI Systems: Threats, Evaluation, and Framework Development

arXiv:2606.14816v1 Announce Type: cross Abstract: This paper presents a structured analysis of security challenges in long-horizon agentic AI systems. The study reviews existing threats, evaluation approaches, attack propagation mechanisms, and security frameworks. A taxonomy of security threats and a framework for analyzing attack propagation are proposed to support future research in agentic AI security.

25.
arXiv (CS.CL) 2026-06-16

Transfer Learning for FHIR Questionnaire Terminology Binding

Electronic prior authorization workflows require FHIR Questionnaire items to carry LOINC codes, yet most items in the HL7 Da Vinci CDS-Library lack these bindings. We treat this as a retrieval problem: given a Questionnaire item's text, find the correct LOINC code in a pool of 97,314 active codes. We compare six methods (TF-IDF, frozen MiniLM, BioBERT, BioLORD, contrastively fine-tuned MiniLM, and a TF-IDF+GPT reranker) on a 54-item evaluation set spanning three query styles (natural question, medium, and terse). No single method wins on every metric. BioLORD, a frozen encoder pre-trained on biomedical ontology definitions, has the best top-rank accuracy (R@1 = 0.185, MRR = 0.246) despite seeing no task-specific data, while a contrastive fine-tune on raw LHC-Forms pairs takes R@5 (0.389) and R@10 (0.426). A distribution-shift ablation shows why the fine-tune in our main table is not the strongest one: adding GPT-generated paraphrases to the raw pairs drops R@5 from 0.389 to 0.296, so the augmented union underperforms raw-only training on every metric except R@1. Performance peaks at 5k training pairs. Error analysis on BioLORD's R@1 failures shows that wrong-specificity and ambiguous-text cases together account for 59% of errors.
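
For readers unfamiliar with the retrieval metrics quoted above, the following is a minimal sketch of how R@k and MRR are conventionally computed when each query has exactly one correct code. The function names and placeholder identifiers (`c1`, `c2`, ...) are hypothetical; they are not LOINC codes and are not taken from the paper's code.

```python
def recall_at_k(ranked_lists, gold, k):
    """Fraction of queries whose gold code appears in the top k results."""
    hits = sum(1 for ranked, g in zip(ranked_lists, gold) if g in ranked[:k])
    return hits / len(gold)

def mean_reciprocal_rank(ranked_lists, gold):
    """Mean of 1/rank of the gold code; a query contributes 0 if it is absent."""
    total = 0.0
    for ranked, g in zip(ranked_lists, gold):
        if g in ranked:
            total += 1.0 / (ranked.index(g) + 1)
    return total / len(gold)

# Three hypothetical queries, each with a ranked candidate list and one gold code.
ranked_lists = [
    ["c1", "c2", "c3"],  # gold at rank 1
    ["c2", "c1", "c3"],  # gold at rank 2
    ["c3", "c2", "c1"],  # gold not retrieved
]
gold = ["c1", "c1", "c4"]

r_at_1 = recall_at_k(ranked_lists, gold, k=1)
r_at_2 = recall_at_k(ranked_lists, gold, k=2)
mrr = mean_reciprocal_rank(ranked_lists, gold)
```

On a 54-item evaluation set, each query moves recall by 1/54 (about 0.0185). The reported R@1 of 0.185 is therefore consistent with 10 items ranked correctly at the top, and the drop in R@5 from 0.389 to 0.296 is consistent with a change from 21 to 16 items.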