Academic Intelligence · Curated Daily

Explore the Frontiers of Global Scholarship

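AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefings.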

01.
arXiv (CS.CL) 2026-06-16

Modeling Sarcastic Speech: Semantic and Prosodic Cues in a Speech Synthesis Framework

Sarcasm is a pragmatic phenomenon in which speakers convey meanings that diverge from literal content, relying on an interaction between semantics and prosodic expression. However, how these cues jointly contribute to the recognition of sarcasm remains poorly understood. We propose a computational framework that models sarcasm as the integration of semantic interpretation and prosodic realization. Semantic cues are derived from an LLaMA 3 model fine-tuned to capture discourse-level markers of sarcastic intent, while prosodic cues are extracted through semantically aligned utterances drawn from a database of sarcastic speech, providing prosodic exemplars of sarcastic delivery. Using a speech synthesis testbed, perceptual evaluations show that semantic and prosodic cues enhance perceived sarcasm, with the combined system achieving the best downstream F1 while maintaining high subjective sarcasm ratings. These findings highlight the complementary roles of semantics and prosody in pragmatic interpretation and illustrate how modeling can shed light on the mechanisms underlying sarcastic communication.

02.
arXiv (quant-ph) 2026-06-15

Sensitivity of polaron-molecule observables to MDR/GUP-like ultraviolet deformations at low energies via quantum computing

arXiv:2606.14479v1 Announce Type: new Abstract: We show that impurity many-body observables can display enhanced sensitivity to ultraviolet deformations of generalized-uncertainty-principle and modified-dispersion-relation type at accessible energy scales. Using a deformed polaron-molecule Hamiltonian constructed to preserve the infrared sector, we quantify the impact of such deformations on spectral and Ramsey observables and implement the corresponding dynamics in a controlled quantum computing setting. We identify regimes near the polaron-molecule crossover where small ultraviolet deformations are strongly amplified, leading to experimentally resolvable changes in quasiparticle properties and spectral response. Our results establish a concrete sensitivity-based route to low-energy quantum-gravity phenomenology in a well-defined many-body platform and delimit the validity of the effective description. Furthermore, we report experimental validation on the QRed superconducting quantum processor (BSC-CNS).

03.
arXiv (CS.LG) 2026-06-12

Disparate Impact in Synthetic Data Generation

arXiv:2606.13105v1 Announce Type: new Abstract: We revisit the fairness notion of disparate impact for synthetic data generation (SDG), which assesses whether the utility of generated records is the same across sensitive groups. Our approach departs from existing work on fair SDG, which addresses the problem of correcting for undue biases in the observed distribution, hence redefining SDG as learning a distribution that is not that of the real data. By contrast, non-disparate impact is notably achieved when the synthetic and real distributions are the same. We expose reasons why SDG may fail to reach that solution and discuss why approximation and estimation errors occur and can be disparate across groups. We notably look into the expressive power of SDG methods relative to distribution complexity, sampling errors due to group proportions, and estimation errors induced by differential privacy mechanisms. We illustrate cases of disparate impact on both artificial and real-world data, focusing on SDG methods that rely on probabilistic graphical models. We also introduce a strategy of learning group-wise SDG models and illustrate how it can improve both the overall utility and its parity in many settings.

04.
arXiv (CS.LG) 2026-06-19

When to Trust, How to Distill: Multi-Foundation Model Guidance for Lightweight, Robust Scientific Time Series Forecasting

arXiv:2606.19363v1 Announce Type: new Abstract: The deployment of Time-Series Foundation Models (TSFMs) in physical sciences is hindered by a critical trade-off: while these models encode rich, universal temporal dynamics, they suffer from severe distributional misalignment when applied zero-shot to specific scientific domains, and their computational cost prohibits deployment in edge-computing sensor networks. We address a fundamental challenge: How can we extract latent structural knowledge from misaligned foundation models (FM) to train lightweight, specialized forecasters? We propose Gated Uncertainty-Aware Routing for Distillation (Guard), a novel framework that reframes multiteacher distillation as an instance-wise decision process with two adaptive mechanisms: (1) a Contextual Router that dynamically selects the most relevant teacher based on local input statistics, exploiting complementarity across diverse foundation models; and (2) an Uncertainty-Gated Temperature mechanism that acts as a "circuit-breaker," automatically attenuating distillation strength when teacher confidence diverges from domain reality. We evaluate our proposed lightweight framework on four climate-critical domains: meteorology, ecosystem carbon flux, soil moisture, and energy grids. Our method significantly reduces RMSE relative to a fixed-weight multi-teacher distillation baseline, successfully distilling knowledge from pretrained FMs (teachers) even when they exhibit suboptimal zero-shot accuracy due to distribution shift between the original and target data domains. We demonstrate that these domain-misaligned teachers can still serve as critical correctives, outperforming the globally superior FMs on 28.5% of the hardest instances. Ultimately, this enables high-precision scientific forecasting suitable for resource-constrained edge deployment. Code is available at https://github.com/RupasreeDey/GUARD-KDD2026.

05.
arXiv (quant-ph) 2026-06-19

Simulation of Non-Markovian Quantum Accelerated Dynamics via Time-Fractional Schrödinger Equation

arXiv:2606.20024v1 Announce Type: new Abstract: The Time-Fractional Schrödinger Equation (TFSE) is an effective tool for simulating the dynamics of non-Markovian quantum systems. The Quantum Speed Limit (QSL) time characterizes the minimum time required for the evolution of a non-Markovian quantum system. In this paper, Wei's TFSE is employed to simulate the non-Markovian quantum accelerated evolution process in the Resonant Dissipative Jaynes-Cummings (RDJC) model. By solving the QSL time of a time-fractional single-qubit open system, the enhancement mechanism of the system evolution speed induced by the non-Markovian memory effects of the environment is revealed. Further studies show that the optimized acceleration of the system evolution can be achieved by jointly regulating the fractional order, coupling strength, and photon number. Comparative analyses indicate that Wei's TFSE can accurately capture the non-Markovian accelerated dynamical features of the system over the entire fractional order range, whereas Naber's TFSE is applicable only within a limited fractional order interval. In addition, the comparisons of the average simulation time for calculating the dynamical trajectory of the excited-state probability demonstrate that Wei's TFSE has a significant simulation advantage in computational efficiency. Therefore, Wei's TFSE is more accurate and efficient for simulating the accelerated dynamics of non-Markovian quantum systems.

06.
arXiv (CS.AI) 2026-06-16

S-SPPO: Semantic-Calibrated Self-Play Preference Optimization

arXiv:2606.01561v2 Announce Type: replace Abstract: Aligning Large Language Models (LLMs) with human preferences is often formulated via Direct Preference Optimization (DPO). However, the standard Bradley-Terry instantiation of DPO is limited in modeling common departures from transitivity in human preferences. To address this, recent work has introduced Self-Play Preference Optimization (SPPO), which iteratively refines the policy by training on self-generated win-lose pairs. Our investigation, however, reveals a critical instability in SPPO: the optimization is prone to policy degeneration when the preference oracle assigns overly confident wins to semantically indistinguishable responses. To mitigate this, we propose S-SPPO, a dual-space semantic calibration framework comprising: i) Supervision Calibration via semantic gating, which anneals win rate targets toward the maximum-entropy baseline as semantic overlap increases; and ii) Representation Calibration via latent repulsion to enforce geometric diversity to prevent manifold collapse and maintain latent diversity between chosen and rejected samples. Theoretically, we show that the calibration preserves the constant-sum game structure, facilitating convergence to a Nash Equilibrium. Empirically, S-SPPO avoids the performance degradation seen in prior methods, achieving 52.19% win rate and 47.46% length-controlled win rate on AlpacaEval 2.0 with Llama-3-8B, without using additional human-annotated preferences during training. The code will be available at https://github.com/xiwenc1/s-sppo.
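
The semantic gating step described above can be pictured with a minimal sketch. The abstract only says that win-rate targets are annealed toward the maximum-entropy baseline as semantic overlap increases; the linear interpolation, the function name `gated_win_target`, and the assumption that overlap is a score in [0, 1] are illustrative choices here, not the authors' formulation. For a binary win/lose outcome the maximum-entropy target is 0.5.

```python
def gated_win_target(win_rate: float, semantic_overlap: float) -> float:
    """Pull an oracle win rate toward 0.5 as two responses become more alike.

    Hypothetical helper: a linear gate is assumed purely for illustration.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must lie in [0, 1]")
    # Clamp the overlap score so the gate is always a valid mixing weight.
    gate = min(max(semantic_overlap, 0.0), 1.0)
    # gate = 0 keeps the oracle target; gate = 1 yields the max-entropy target.
    return (1.0 - gate) * win_rate + gate * 0.5


# An overconfident oracle win of 0.95 at three overlap levels.
distinct = gated_win_target(0.95, 0.0)
similar = gated_win_target(0.95, 0.8)
identical = gated_win_target(0.95, 1.0)
```

With this gate, clearly distinct responses keep the oracle's target, while near-duplicates receive a target close to 0.5, which removes the overconfident signal the abstract identifies as the source of instability.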

07.
arXiv (CS.CL) 2026-06-19

CREDENCE: Claim Reduction for Decomposition & Enhanced Credibility – Semantic Metrics and Convergence Analysis

Decomposing compound sentences into atomic, verifiable claims is a prerequisite for reliable automated fact-checking. Prior work has relied on token-overlap (Jaccard) metrics that systematically underestimate decomposition quality for paraphrastic claims, and has lacked formal termination analysis for the repair loop. We present Credence, a revised claim decomposition and evaluation framework addressing both shortcomings. Our contributions are: (1) Semantic-F1: we use a BGE-large cosine-similarity fidelity metric that resolves Jaccard's penalisation and improves downstream fact-checking accuracy; (2) Convergence theorems: we formally characterise four properties of the repair pipeline, establishing that rule-based repair is monotone and finitely terminating under an oracle parser assumption, whereas LLM-based self-repair is provably non-monotone and requires an early-exit guard; (3) Three evaluation benchmarks spanning social-media, encyclopaedic, and news domains for cross-domain generalisation measurement; (4) Multi-model benchmarking across four decomposer models (3.8B-12B) and a closed API model. Experiments on SocialClaimSplit, WikiSplitBench, and ClaimDecompBench show that Semantic-F1 outperforms Jaccard-F1 by +15-32pp. EPR ranges from 0.94 to 1.00 on SocialClaimSplit and WikiSplitBench, while ClaimDecompBench includes lower base EPR cases (down to 0.824) due to harder news-domain constructions, and rule-repair reduces the Atomicity Violation Rate (AVR) by 47-100% relative to the base model without degrading fidelity.
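
The token-overlap penalty mentioned above is easy to reproduce. The sketch below computes plain Jaccard similarity on whitespace tokens for a verbatim claim and a paraphrase; the example sentences and the simple tokenisation are assumptions made for illustration, and the embedding-based Semantic-F1 metric is not reproduced because it depends on the BGE-large model.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase whitespace-token sets of a and b."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


reference = "the mayor resigned on monday"
verbatim = "the mayor resigned on monday"
paraphrase = "the city leader stepped down on monday"

score_verbatim = jaccard(reference, verbatim)
score_paraphrase = jaccard(reference, paraphrase)
```

The paraphrase states the same claim but shares only three of nine distinct tokens with the reference, so its score drops to one third. This is the kind of underestimate that a similarity measure over sentence embeddings is meant to avoid.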

08.
arXiv (CS.AI) 2026-06-16

Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments

arXiv:2505.19699v2 Announce Type: replace-cross Abstract: Federated Learning (FL) is a decentralized machine learning paradigm that enables clients to collaboratively train models while preserving data privacy. However, the coexistence of model and data heterogeneity gives rise to inconsistent representations and divergent optimization dynamics across clients, ultimately hindering robust global performance. To transcend these challenges, we propose Mosaic, a novel data-free knowledge distillation framework tailored for heterogeneous distributed environments. Mosaic first trains local generative models to approximate each client's personalized distribution, enabling synthetic data generation that safeguards privacy through strict separation from real data. Subsequently, Mosaic forms a Mixture-of-Experts (MoE) from client models based on their specialized knowledge, and distills it into a global model using the generated data. To further enhance the MoE architecture, Mosaic integrates expert predictions via a lightweight meta model trained on a few representative prototypes. Extensive experiments on standard image and multimodal benchmarks demonstrate that Mosaic consistently outperforms state-of-the-art approaches under both model and data heterogeneity. The source code has been published at https://github.com/Wings-Of-Disaster/Mosaic.

09.
arXiv (CS.CV) 2026-06-18

VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset

Human head detection, keypoint estimation, and 3D head model fitting are essential tasks with many applications. However, traditional real-world datasets often suffer from bias, privacy, and ethical concerns, and they have been recorded in laboratory environments, which makes it difficult for trained models to generalize. Here, we introduce VGGHeads, a large-scale synthetic dataset generated with diffusion models for human head detection and 3D mesh estimation. Our dataset comprises over 1 million high-resolution images, each annotated with detailed 3D head meshes, facial landmarks, and bounding boxes. Using this dataset, we introduce a new model architecture capable of simultaneous head detection and head mesh reconstruction from a single image in a single step. Through extensive experimental evaluations, we demonstrate that models trained on our synthetic data achieve strong performance on real images. Furthermore, the versatility of our dataset makes it applicable across a broad spectrum of tasks, offering a general and comprehensive representation of human heads.

10.
bioRxiv (Bioinfo) 2026-06-11

TifBERT: a self-supervised foundation model for normalization-robust bulk RNA-seq representation learning

Bulk RNA sequencing remains central to translational genomics, yet foundation-model development has largely focused on single-cell data. Existing transformer approaches for bulk RNA-seq often rely on expression discretization, numerical reconstruction, external gene embeddings, or restricted gene sets, limiting robustness across normalization schemes and cohorts. Here, we introduce TifBERT, a self-supervised framework for full-transcriptome bulk RNA-seq representation learning. TifBERT converts each unordered expression profile into a sample-specific gene sequence using term frequency-inverse document frequency (TF-IDF) ordering, prioritizing genes that are both highly expressed within a sample and selectively expressed across the cohort. It is then pretrained using masked gene modeling, predicting gene identities from transcriptomic context rather than reconstructing expression values. Pretrained on harmonized TCGA Pan-Cancer data spanning five RNA-seq normalization schemes, TifBERT learns contextual representations across approximately 10,000 genes without expression binning, landmark-gene restriction, or external biological embeddings. Across 33 TCGA cancer types, TifBERT achieved 90.83% accuracy, 0.996 macro AUC-ROC, and 0.903 MCC. It also captured pathway-level biology, achieving mean sample-wise and pathway-wise Pearson correlations of 0.754 and 0.762 across 1,387 PARADIGM pathway activities. Independent evaluation on GTEx healthy tissues showed preservation of tissue-level transcriptomic structure without retraining. In comparison with existing models, TifBERT achieves competitive subtype discrimination with substantially greater stability and produces markedly richer embedding geometry (effective rank 95.6 versus 6.3), without requiring expression discretization or in-distribution pretraining exposure. Together, TifBERT provides a scalable, normalization-independent foundation model for reusable bulk transcriptomic representation learning.
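
The TF-IDF ordering step can be sketched on a toy cohort. The abstract does not give the exact weighting, so the definitions below are assumptions: term frequency is a gene's share of the sample's total expression, a gene counts as present in a sample when its expression is above zero, and the inverse document frequency uses a common smoothed form. The function name `tfidf_gene_order` and the three-gene data are hypothetical.

```python
import math


def tfidf_gene_order(profiles, gene_names):
    """Return, for each sample, gene names sorted by descending TF-IDF score.

    Assumes every sample has a nonzero total expression.
    """
    n_samples = len(profiles)
    n_genes = len(gene_names)
    # Document frequency: number of samples in which each gene is expressed.
    df = [sum(1 for p in profiles if p[j] > 0) for j in range(n_genes)]
    # Smoothed IDF (an assumed variant): rarer genes receive a larger weight.
    idf = [math.log((1 + n_samples) / (1 + d)) + 1.0 for d in df]
    sequences = []
    for profile in profiles:
        total = sum(profile)
        scores = [(profile[j] / total) * idf[j] for j in range(n_genes)]
        # sorted() is stable, so tied genes keep their original order.
        ranked = sorted(range(n_genes), key=lambda j: -scores[j])
        sequences.append([gene_names[j] for j in ranked])
    return sequences


genes = ["A", "B", "C"]
cohort = [
    [10, 5, 0],  # sample 0
    [10, 0, 0],  # sample 1
    [10, 0, 8],  # sample 2
]
sequences = tfidf_gene_order(cohort, genes)
```

In sample 2, gene C has lower expression than gene A but appears in only one sample, so it is ranked first. This mirrors the stated goal of prioritising genes that are both highly expressed within a sample and selective across the cohort.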

11.
arXiv (CS.CV) 2026-06-16

Null-Space Diffusion Distillation Unlocks Speed, Fidelity and Realism in Lensless Imaging

Lensless imaging reconstructs scenes from highly multiplexed measurements, resulting in a severely ill-posed inverse problem. In this work, we identify a fundamental trade-off between measurement consistency, perceptual quality, and inference speed across lensless reconstruction paradigms. Traditional methods favor consistency but produce perceptually degraded results, supervised approaches achieve high-quality reconstructions with fast inference but may violate physical constraints, and diffusion-prior methods achieve high perceptual quality and consistency–particularly when structured constraints such as range-null decomposition are used–but remain slow due to iterative sampling. Motivated by this observation, we propose Null-Space Diffusion Distillation (NSDD), a single-pass reconstruction model that distills structured diffusion-prior inference into an efficient feed-forward network. NSDD learns to produce high-quality reconstructions that preserve measurement consistency while avoiding costly iterative sampling. Experimental results demonstrate that NSDD achieves perceptual quality and consistency competitive with diffusion-prior methods, while providing significantly faster inference and offering a favorable balance across all three objectives. Furthermore, ablation experiments show that distilling the range–null decomposition improves reconstruction quality and robustness over unstructured full-reconstruction distillation, including on unseen real scenes. These results highlight the potential of structure-aware distillation for efficient lensless imaging. Code is available at github.com/JRCSAVSN/NullSpaceDiffusionDistillation.

12.
arXiv (CS.AI) 2026-06-19

Controlled Comparison of Machine Learning Models for Fault Classification and Localization in Power System Protection

arXiv:2510.00831v2 Announce Type: replace Abstract: The increasing complexity of modern power systems, driven by the integration of inverter-based and distributed energy resources, challenges the reliability of conventional protection schemes and motivates the use of machine learning for protection tasks. However, published results are often difficult to compare because datasets, sensing assumptions, and decision horizons vary across studies. This paper presents a controlled comparison of machine learning models for fault classification (FC) and fault localization (FL) under identical sensing, timing, and validation conditions on a common electromagnetic transient dataset, using decision windows of 10-50 ms to reflect protection-relevant time scales. For FC, the best-performing nonlinear models achieve F1 scores above 0.98 already at 10 ms, while lower-capacity models degrade at shorter horizons but improve with longer windows, indicating that relevant fault-type information is already present in the earliest transient. For FL, the top-performing models reach a stable localization error of about 10 % of normalized line length across all evaluated horizons, while weaker models form a clearly separated second performance tier. Line-resolved analysis shows that localization accuracy varies across grid segments, indicating topology-dependent difficulty rather than insufficient temporal context alone. These findings provide a controlled reference for comparing machine learning models across two protection tasks with fundamentally different information requirements.

13.
arXiv (CS.CV) 2026-06-16

CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos

Generalist Vision-Language-Action models remain constrained by the scarcity of robotic data relative to the abundance of human video demonstrations. Existing Latent Action Models attempt to use video data but often suffer from visual entanglement, encoding noise rather than manipulation skills. To address this limitation, we propose Contrastive Latent Action Pretraining (CLAP), a framework that first uses Act-VAE to learn an executable action-token vocabulary from robot trajectories and then aligns human visual transitions with this vocabulary through contrastive learning. This alignment maps unlabeled human videos into a physically grounded latent action space rather than reconstructing appearance. Building on the aligned tokens, we train CLAP-NTP as an autoregressive VLA using robot demonstrations and pseudo-labeled human videos, preserving instruction following and object generalization. For deployment and target-domain adaptation, we further introduce a post-training strategy that combines CLAP-RF, a Rectified Flow action head for low-latency continuous action chunk prediction, with Knowledge Matching regularization to preserve pretrained semantic knowledge during fine-tuning. Extensive experiments show that CLAP achieves strong performance against competitive baselines while enabling effective skill transfer from human videos to robotic execution.

14.
arXiv (CS.LG) 2026-06-17

Learning Credal Ensembles via Distributionally Robust Optimization

arXiv:2602.08470v3 Announce Type: replace Abstract: Credal predictors are models that are aware of epistemic uncertainty and produce a convex set of probabilistic predictions. They offer a principled way to quantify predictive epistemic uncertainty (EU) and have been shown to improve model robustness in various settings. However, most state-of-the-art methods mainly define EU as disagreement caused by random training initializations, which mostly reflects sensitivity to optimization randomness rather than uncertainty from deeper sources. To address this, we define EU as disagreement among models trained with varying relaxations of the i.i.d. assumption between training and test data. Based on this idea, we propose CreDRO, which learns an ensemble of plausible models through distributionally robust optimization. As a result, CreDRO captures EU not only from training randomness but also from meaningful disagreement due to potential distribution shifts between training and test data. Empirical results show that CreDRO consistently outperforms existing credal methods on tasks such as out-of-distribution detection across multiple benchmarks and selective classification in medical applications.

15.
medRxiv (Medicine) 2026-06-16

Care Delivery Gap framework: a proof-of-concept patient-reported measure of guideline-referenced care-process omissions in sickle cell disease

Background: Sickle cell disease (SCD) is concentrated in sub-Saharan Africa, where delivery of guideline-referenced care remains challenging. Current evaluation approaches rely largely on access indicators and clinical outcomes, which do not directly measure care delivery. We developed the Care Delivery Gap (CDG) framework, a patient-reported approach for identifying care-process omissions, and conducted a proof-of-concept study to assess feasibility and explore variation across income strata. Methods: We conducted a cross-sectional framework-development study involving a proof-of-concept sample of 52 individuals with SCD or caregivers recruited through clinics and moderated SCD communities across Africa, North America, and Europe between June 2025 and March 2026. The CDG framework assessed patient-reported omissions in specialist involvement, follow-up continuity, cardiovascular screening, and biochemical surveillance. Analyses were descriptive. Results: Substantial multi-domain care-process omissions were identified despite high reported healthcare engagement. Across geographic income strata, cardiovascular screening was reported by 4/35 (11%) LMIC versus 16/17 (94%) HIC participants, and regular follow-up within the preceding 12 months by 14/35 (40%) versus 16/17 (94%), respectively. High CDG scores, representing ≥1 omissions across three or four domains, occurred in 20/35 (57%) LMIC compared with 1/17 (6%) HIC participants. Similar disparities were observed across specialist review and vitamin B12 surveillance domains. Conclusion: A structured patient-reported framework identified multi-domain omissions in guideline-referenced SCD care, including among individuals reporting healthcare access. The divergence between access indicators and reported care delivery suggests that service contact alone may not reflect care quality. The framework provides a feasible foundation for future process-level quality measurement in high-burden settings.

16.
arXiv (CS.LG) 2026-06-15

Concatenated Matrix SVD: Compression Bounds, Incremental Approximation, and Error-Constrained Clustering

arXiv:2601.11626v2 Announce Type: replace-cross Abstract: Large collections of matrices arise throughout modern machine learning, signal processing, and scientific computing, where they are commonly compressed by concatenation followed by truncated singular value decomposition (SVD). This strategy enables parameter sharing and efficient reconstruction and has been widely adopted across domains ranging from multi-view learning and signal processing to neural network compression. However, it leaves a fundamental question unanswered: which matrices can be safely concatenated and compressed together under explicit reconstruction error constraints? Existing approaches rely on heuristic or architecture-specific grouping and provide no principled guarantees on the resulting SVD approximation error. In the present work, we introduce a theory-driven framework for compression-aware clustering of matrices under SVD compression constraints. Our analysis establishes new spectral bounds for horizontally concatenated matrices, deriving global upper bounds on the optimal rank-$r$ SVD reconstruction error from lower bounds on singular value growth. The first bound follows from Weyl-type monotonicity under blockwise extensions, while the second leverages singular values of incremental residuals to yield tighter, per-block guarantees. We further develop an efficient approximate estimator based on incremental truncated SVD that tracks dominant singular values without forming the full concatenated matrix. Therefore, we propose three clustering algorithms that merge matrices only when their predicted joint SVD compression error remains below a user-specified threshold. The algorithms span a trade-off between speed, provable accuracy, and scalability, enabling compression-aware clustering with explicit error control.
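
The merge criterion described above can be illustrated with a brute-force baseline. This sketch forms the concatenated matrix explicitly and measures its optimal rank-r error via the Eckart-Young theorem; it is not the paper's bound-based or incremental estimator, which is designed to avoid forming that matrix. The Frobenius norm, the function names, and the synthetic blocks are assumptions, and NumPy is required.

```python
import numpy as np


def rank_r_error(matrix: np.ndarray, r: int) -> float:
    """Frobenius error of the best rank-r approximation (Eckart-Young)."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    # The optimal error is the norm of the discarded singular values.
    return float(np.sqrt(np.sum(singular_values[r:] ** 2)))


def can_merge(blocks, r: int, threshold: float) -> bool:
    """True if the horizontally concatenated blocks compress within threshold."""
    joint = np.hstack(blocks)
    return rank_r_error(joint, r) <= threshold


rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 2))
a = basis @ rng.standard_normal((2, 8))  # rank-2 block
b = basis @ rng.standard_normal((2, 8))  # shares a's column space
c = rng.standard_normal((20, 8))         # unrelated full-rank block

merge_ab = can_merge([a, b], r=2, threshold=1e-8)
merge_ac = can_merge([a, c], r=2, threshold=1e-8)
```

Blocks a and b share a two-dimensional column space, so their concatenation is still rank 2 and compresses with negligible error. Concatenating a with the unrelated block c leaves substantial energy outside any rank-2 subspace, so the merge is rejected.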

17.
arXiv (quant-ph) 2026-06-12

Diffusive Dynamics of Nonstabilizerness

arXiv:2606.13606v1 Announce Type: new Abstract: Symmetries shape the quantum-information dynamics of many-body systems, but their effect on nonstabilizerness, the resource complementary to entanglement, is less understood. We compute the stabilizer Rényi entropy, a measure of nonstabilizerness, in $\mathrm{U}(1)$-symmetric one-dimensional random circuits. The disorder-averaged dynamics is captured by a four-replica tensor network, which we evaluate by $S_4$-adapted infinite time-evolving block decimation (iTEBD) directly in the thermodynamic limit. Together with a hydrodynamic argument, our results identify a diffusive universality class for the late-time approach of nonstabilizerness to its random-state value, with the stabilizer Rényi entropy gap closing as $1/t$. The same scaling is verified in an energy-conserving nonintegrable Ising chain. More broadly, our framework provides a hydrodynamic perspective on nonstabilizerness generation and offers insight into the design of approximate Haar-random states in Hamiltonian dynamics.

18.
arXiv (CS.LG) 2026-06-18

A Streaming Sparse Cholesky Method for Derivative-Informed Gaussian Process Surrogates Within Digital Twin Applications

arXiv:2511.00366v2 Announce Type: replace-cross Abstract: Digital twins are developed to model the behavior of a specific physical asset (or twin), and they can consist of high-fidelity physics-based models or surrogates. A highly accurate surrogate is often preferred over multi-physics models because it enables forecasting the physical twin's future state in real time. To adapt to a specific physical twin, the digital twin model must be updated using in-service data from that physical twin. In this paper, we combine and extend several previous surrogate-related advancements with the goal of demonstrating an end-to-end digital twin (DT) solution for predicting performance of an aircraft structure (the physical asset). To this end, we extend Gaussian process (GP) models to include derivative data, for improved accuracy, with dynamic updating to ingest physical twin data during service. Including derivative data, however, comes at a prohibitive cost of increased covariance matrix dimension. We circumvent this issue through our modified dynamic sparse Cholesky linear system solver. Numerical experiments demonstrate that the derivative-enhanced sparse Cholesky GP method yields more accurate models as data are dynamically added. Lastly, we demonstrate the developed algorithm within a DT framework to model fatigue crack growth in an aerospace vehicle, thereby exhibiting through our assembled engineered system how digital twin technologies can be combined in practice.

19.
arXiv (CS.AI) 2026-06-12

The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

arXiv:2604.16689v2 Announce Type: replace Abstract: Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model under randomized perturbations. This paper formulates this procedure as communication over a query channel, where the latent explanation acts as a message and each masked evaluation is a channel use. Within this framework, the complexity of the explanation is captured by the entropy of the hypothesis class, while the query interface supplies information at a rate determined by an identification capacity per query. We derive a strong converse showing that, if the explanation rate exceeds this capacity, the probability of error in exact recovery necessarily converges to one for any sequence of explainers and decoders. We also prove an achievability result establishing that a sparse maximum-likelihood decoder attains reliable recovery when the rate lies below capacity. A Monte Carlo estimator of mutual information yields a non-asymptotic query benchmark that we use to compare optimal decoding with Lasso- and OLS-based procedures that mirror LIME and KernelSHAP. Experiments reveal a range of query budgets where information theory permits reliable explanations but standard convex surrogates still fail. Finally, we interpret super-pixel resolution and tokenization for neural language models as a source-coding choice that sets the entropy of the explanation and show how Gaussian noise and nonlinear curvature degrade the query channel, induce waterfall and error-floor behavior, and render high-resolution explanations unattainable.

20.
arXiv (CS.CL) 2026-06-17

Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond

Large language models (LLMs) are widely used to fulfill users' information needs; users ask LLMs about the weather, pose educational questions, and consult them for legal assistance. One particularly understudied area is digital security and privacy (S&P), where users may seek LLMs' help on how to secure their online accounts or protect their computers from cyber attacks. To the best of our knowledge, no prior study has collected or analyzed the S&P questions users ask LLMs; prior research on LLM response quality relied on expert-authored S&P misconceptions or FAQs rather than user queries. Drawing from WildChat, a dataset of 3.2M user-LLM conversations collected in the wild, our study identifies 14,727 S&P prompts and categorizes them into nine categories covering a wide range of S&P topics. From the S&P prompts, we sampled 450 and performed a thematic analysis to characterize the S&P questions users ask LLMs. Separate from the thematic analysis, we curated 270 advice-seeking S&P prompts, where users ask for recommendations, guidance, or specific S&P information. We measured LLM response quality and consistency when posing the prompt to LLMs 10 times. We found that commercial LLMs outperform open-weight models (GPT 5.5 provided "good enough" responses on 98% of prompts; Llama 4 on 47%). However, among prompts that received high-quality responses on average, commercial models sometimes produce contradictory responses across runs, risking confusing or misleading users.

22.
medRxiv (Medicine) 2026-06-10

Assessment of the accuracy of lung lesions diagnosis in adolescents with osteosarcoma using artificial intelligence

Background. Lung metastases are the main cause of death in osteosarcoma (OS). Accurate detection of lung nodules on computed tomography (CT) is critical for identifying disseminated disease and planning surgical treatment. Using artificial intelligence (AI) to search for lung nodules increases diagnostic accuracy and reduces the chance of missing metastases. Objective. To evaluate the accuracy of lung nodule diagnosis in adolescents with OS using AI. Methods. CT scans of adolescents with OS were assessed retrospectively. A pathological nodule with an average size of ≥4 mm was considered a target finding. The diagnostic accuracy of an AI algorithm previously trained on an adult dataset was evaluated, and the numbers of false positives (FP) and false negatives (FN) were determined. Sensitivity, specificity, accuracy, area under the ROC curve (AUC), positive predictive value, negative predictive value, and F1-measure were calculated, and the effectiveness of the algorithm was assessed from these results. Results. A total of 248 CT scans of adolescents with OS were evaluated. The AI algorithm gave an FP result in 5 cases (2.02%), an FN result in 34 cases (13.71%), and a correct result (true positive or true negative) in 209 cases (84.27%). The diagnostic accuracy of the algorithm was 0.843 (95% CI 0.794-0.887). Applying the AI algorithm in a radiologist's practice for this specific clinical task would increase sensitivity from 0.805 to 0.891, with an absolute decrease in FN results of 8.59% and a relative decrease of 44%. Conclusion. The results confirm the practical value of the AI algorithm and justify implementing AI-assisted systems in diagnostic protocols for lung metastases in adolescents with OS.
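
The percentages in the Results can be checked from the reported counts. The Python sketch below is a reader's arithmetic check, not the authors' analysis code. It assumes that the "decrease in FN results" refers to the miss rate (1 - sensitivity) among positive cases; the abstract does not state this explicitly. The function and constant names are illustrative.

```python
# Counts reported in the abstract.
TOTAL_SCANS = 248
FALSE_POSITIVES = 5
FALSE_NEGATIVES = 34
CORRECT = 209  # true positives and true negatives combined


def share(count, total=TOTAL_SCANS):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


def fn_rate_reduction(sens_before, sens_after):
    """Absolute and relative drop in miss rate (1 - sensitivity).

    Assumption: the reported FN decrease is a change in the
    false-negative rate among positive cases, not in raw counts.
    """
    miss_before = 1 - sens_before
    miss_after = 1 - sens_after
    absolute = miss_before - miss_after
    relative = absolute / miss_before
    return absolute, relative


accuracy = CORRECT / TOTAL_SCANS
absolute_drop, relative_drop = fn_rate_reduction(0.805, 0.891)

print(f"FP share: {share(FALSE_POSITIVES)}%")
print(f"FN share: {share(FALSE_NEGATIVES)}%")
print(f"accuracy: {accuracy:.3f}")
print(f"absolute miss-rate drop: {absolute_drop:.3f}")
print(f"relative miss-rate drop: {relative_drop:.0%}")
```

With the rounded sensitivities, the absolute drop comes out at about 0.086 and the relative drop at about 44%. Both agree with the reported 8.59% and 44%, to within the rounding of the published inputs.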

23.
arXiv (CS.CV) 2026-06-16

Mutual Distillation of Dual-Foundation Models for Semi-Supervised PET/CT Segmentation

Organ segmentation from PET/CT is critical for quantitative analysis and radiotherapy planning in oncology. To ease the high annotation cost of PET/CT segmentation, semi-supervised learning (SSL) provides a practical and effective solution for developing deep models with limited labeled data. Recent developments in visual foundation models have demonstrated remarkable adaptability with improved efficiency. In this work, we propose a mutual distillation framework that seamlessly exploits both structural and functional foundation models, which act as modality-specific generalists for distilling knowledge from structural CT and metabolic PET imaging. By bridging the gap between the task-specific precision of student models and the segmentation priors of generalist foundation models, we propose MuDuo, a mutual distillation framework that synergistically leverages SAM-Med3D for CT and SegAnyPET for PET to distill their knowledge into a lightweight student network. Our approach eliminates the need for manual prompts while maximizing the utility of unlabeled data for automatic segmentation, achieving state-of-the-art performance on the AutoPET dataset with only 5 labeled cases. Our source code is available at https://github.com/Wu-beining/MuDuo.

24.
arXiv (quant-ph) 2026-06-15

Dissipation-induced superradiance in matter coupled to a self-interacting cavity

arXiv:2606.14526v1 Announce Type: new Abstract: Light-matter interactions are often modeled via the Dicke model, namely, by two-level systems coupled to a cavity mode. Alas, the threshold for superradiance is often experimentally inaccessible or hindered by light's diamagnetic term. Here, within the Dicke setting, we consider self-interacting light in a cavity, modeled by a photonic Kerr nonlinearity. We show that negative Kerr nonlinearity gives rise to a low-threshold superradiant phase with spin inversion. While unstable in a closed system, cavity dissipation stabilizes this lit phase, opening avenues for lasing and bath-engineered phases.
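
For orientation, a Dicke model with a photonic Kerr term and cavity loss is commonly written as below. This is the textbook form under assumed notation, not necessarily the paper's exact Hamiltonian. The symbols \omega_c, \omega_0, \lambda, \chi and \kappa are illustrative names for the cavity frequency, two-level splitting, light-matter coupling, Kerr strength and cavity decay rate. In this convention, negative Kerr nonlinearity corresponds to \chi < 0.

```latex
% Assumed notation, with \hbar = 1: a is the cavity mode and
% J_z, J_\pm are collective spin operators for N two-level systems.
H = \omega_c\, a^\dagger a + \omega_0\, J_z
    + \frac{\lambda}{\sqrt{N}}\,(a + a^\dagger)(J_+ + J_-)
    + \chi\, a^\dagger a^\dagger a\, a,
\qquad
\dot{\rho} = -i[H, \rho]
    + \kappa \left( 2 a \rho a^\dagger - a^\dagger a \rho - \rho a^\dagger a \right)
```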

25.
arXiv (CS.LG) 2026-06-16

How Post-Training Shapes Biological Reasoning Models

arXiv:2606.16517v1 Announce Type: new Abstract: Scientific reasoning models for biology combine language models with foundation models trained on multimodal biological data, including DNA, RNA, and proteins. These models are built through post-training, yet how each stage shapes reasoning and generalization remains poorly understood. We study when post-training improves performance and when it induces over-specialization. Across genomics, transcriptomics, and proteins, we train and evaluate more than 100 biological reasoning models under controlled variation in backbone, continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL), measuring both in-domain (ID) and out-of-domain (OOD) performance. We find that each post-training stage reshapes generalization in a distinct way rather than contributing uniform gains. CPT improves downstream performance by aligning models with biological language. SFT consistently increases ID performance but causes OOD performance to peak early and decline as models fit the training distribution. RL, when applied to strong SFT checkpoints with aligned rewards, improves OOD performance and partially recovers generalization. These results show that biological reasoning does not improve monotonically with additional supervision or compute. Instead, performance depends on how training stages are composed. Under fixed post-training budgets, the strongest ID-OOD trade-off comes from brief SFT, larger RL allocations, and asymmetric adaptation capacity across stages.