Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
medRxiv (Medicine) 2026-06-16

Presurgical immune biomarkers associated with pain intensity and pain interference recovery after total knee arthroplasty: findings from the PRIME-KNEE study

Chronic postsurgical pain (CPSP) prevalence after total knee arthroplasty (TKA) is >20%. Circulating immune biomarkers are known factors of musculoskeletal pain but poorly understood as CPSP predictors. This prospective, longitudinal study of 203 patients s/p TKA tested presurgical plasma biomarkers associated with 6-month CPSP, using promising approaches from geriatrics biomarker research: expected recovery differential (ERD; resilience outcome) and penalized, machine-learning regularization modeling (elastic net and LASSO regression). Forty-nine presurgical candidate biomarkers were considered. CPSP was operationalized using ERDs built around PROMIS pain intensity and pain interference, which quantified the difference between observed and expected recovery after accounting for demographic, comorbidity, reserve, and perioperative factors. Plasma/ERDs from ~130 patients revealed 13 biomarkers with the highest selection stability criteria, and either positive or negative (+/-) associations with ERDs. Interleukin (IL) 5 (-) and Lipopolysaccharide-Binding Protein (LBP; +) were associated with both ERDs. Unique associations with pain intensity ERD included Cytomegalovirus-Specific IgG Negative (CMV IGg-; -), Macrophage Inflammatory Protein-1 Beta (MIP1b; -), IL12p70 (-, Cluster of Differentiation 30 (sCD30;-), Interferon alpha 2a (IFN2a;+), and Leukemia Inhibitory Factor (LIF;+). Unique associations with pain interference ERD included Lipopolysaccharide (LPS;-), Activin A (-), IL8 (-), Serum Amyloid A (SAA;-), and IL7 (+). Protein-protein interaction analyses and topology motifs suggest a centralized network with higher-than-expected connectivity, involving IL5, IL7, IL8, MIP1{beta}, and IFN2a, among others. This study proposes rigorous yet feasible approaches to expedite pain biomarker research, and introduces presurgical biomarkers t0 consider in future TKA-CPSP biosignature derivation.

02.
arXiv (CS.CV) 2026-06-11

Echoes of the Prior: A Computational Phenomenology of Forgetting

Memory is not merely the storage of data; it is the scaffolding of reality. When biological memory fades, the world does not simply turn black; it regresses into an unrecognizable chaos. Echoes of the Prior is an interactive installation that attempts to visualize this subjective phenomenology of forgetting. By inducing controlled synaptic decay within a Feed-Forward 3D Reconstruction model, we create an artistic analogy for the erosion of the brain's predictive priors. We position the Neural Network not as a tool for engineering, but as a cognitive proxy - a silicon brain whose structural degeneration evokes the disorienting, poetic, and terrifying experience of losing one's grip on the world. Ultimately, we offer this framework as a catalyst, inviting the wider community to explore the uncharted potential of neuromorphic aesthetics in visualizing the fragility of intelligence. Interactive demo see https://decart-4d.github.io/.

03.
arXiv (math.PR) 2026-06-17

Analysis of the asymmetric shelf shuffle

arXiv:2606.18047v1 Announce Type: new Abstract: In an asymmetric shelf shuffle, a deck of $n$ cards is dealt sequentially from the bottom and assigned one of the $m$ shelves uniformly at random. The card is placed at the top of the assigned shelf with probability $p$, and at the bottom of the assigned shelf with probability $(1-p)$. Analysis of the shelf shuffle has gained much attention recently, and the case $p=1/2$ was first treated by Diaconis–Fulman–Holmes [Ann. Appl. Prob. 23 (2013), no. 4, 1692–1720]. In this paper, we extend the analysis of the shelf shuffle to general $p\in (0, 1)$. In particular, we study the distribution of cycles, cycle lengths, number of descents, number of valleys, number of inversions, and the RSK shape of a permutation obtained from an asymmetric shelf shuffle. Our results extend the analysis of Diaconis–Fulman–Holmes to arbitrary $p$. Furthermore, our analysis of the distribution of descents and inversions is new even for $p=1/2$.

04.
arXiv (CS.CV) 2026-06-15

SA4Depth: Consistent Pose-Depth Scale Alignment for Self-Supervised Monocular Depth Estimation

Self-supervised depth estimation from monocular sequences relies on the joint learning of a depth and a pose network. Despite abundant research done to improve the depth network, efforts on the pose remain limited. In this context, even when depth is estimated up to scale, we highlight the importance of the alignment between the scene scales estimated by the pose and depth nets. Then, we introduce SA4Depth, an approach to improve this alignment and boost the depth predictions while keeping the inference time unchanged. Our proposed method uses the depth estimated during training to reproject learnable visual features across consecutive frames and refine the pose estimates by reducing feature alignment residuals. With our method, the estimated scene scales by the separate depth and pose networks are aligned, and the prediction scale consistency is improved across different sequences. Our differentiable refinement integrates seamlessly into existing self-supervised pipelines and substantially improves their depth estimates. We demonstrate this with extensive experiments both outdoors and indoors on KITTI, Cityscapes, and NYUv2. Additionally, results on KITTI Odometry confirm the effectiveness of our pose refinement. Our code is available at https://github.com/Runningchauncey/SA4Depth .

05.
arXiv (quant-ph) 2026-06-16

Optical Creation of Synthetic Microgravity for Quantum Degenerate Gases

arXiv:2606.14985v1 Announce Type: cross Abstract: Microgravity environments provide unique opportunities for ultracold-atom experiments by enabling long interrogation times and reduced acceleration-induced dynamics. However, their realization has largely been restricted to specialized facilities such as drop towers, sounding rockets, and space-based laboratories. Here we realize synthetic microgravity for quantum degenerate gases using optically engineered force landscapes that compensate Earth's gravity to the milli-g level while maintaining continuous confinement of the atomic ensemble. These force landscapes are generated by dynamically painted optical dipole potentials and calibrated in situ through Bloch oscillations in a vertical optical lattice, enabling precise control of the residual acceleration. We use this capability to demonstrate matter-wave beam splitting with arm separations of several hundred microns. We further implement a Bloch-band atom interferometer in which interaction-induced dephasing is strongly suppressed through controlled three-dimensional expansion in the synthetic microgravity potential. This reduction of mean-field effects restores near-$\sqrt{N}$ scaling of interferometric sensitivity for large quantum degenerate ensembles. Our results establish a versatile platform for realizing synthetic microgravity with trapped quantum gases in terrestrial laboratories, bringing the advantages of microgravity experiments to continuously operating systems and opening new opportunities for quantum sensing, matter-wave interferometry, and precision measurements.

06.
arXiv (CS.LG) 2026-06-16

Diversity-Driven Offline Multi-Objective Optimization via Nested Pareto Set Learning

arXiv:2606.15115v1 Announce Type: new Abstract: Multi-objective optimization (MOO) has emerged as a powerful approach to solving complex optimization problems involving multiple objectives. In many practical scenarios, function evaluations are unavailable or prohibitively expensive, necessitating optimization solely based on a fixed offline dataset. In this setting, known as offline MOO, the goal is to find out the Pareto set without access to the true objective functions. This setting suffers from the out-of-distribution (OOD) issue, where the surrogate model is not accurate for unseen designs. Due to the OOD issue, surrogate errors may cause the optimizer to select solutions that do not lie on the true Pareto front and are biased toward its extremes. To address this, this paper proposes Diversity-driven Offline Multi-Objective Optimization (DOMOO), which aims to find out a diverse and high-quality set of solutions. First, DOMOO incorporates an accumulative risk control module that estimates the potential risk of candidate solutions and alleviates the OOD issue between the training data and the generated solutions. In addition, a nested Pareto set learning (PSL) strategy is proposed to jointly learn preference and PSL parameters, then optimize them, enabling adaptation to diverse Pareto front geometries. To further enhance solution quality, we design a diversity-driven selection strategy that extracts a representative and well-distributed set of final solutions. To achieve this diversity-driven selection strategy, we propose $IGD_offline$, a tailored indicator for the offline setting that considers both diversity and convergence, and avoids the bias of hypervolume indicator. Extensive experiments on synthetic and real-world benchmarks show that DOMOO achieves the best average rank across tasks in both convergence and diversity among the compared methods.

07.
arXiv (CS.AI) 2026-06-19

PrefSQA: Pairwise Preference Prediction for Speech Quality Assessment and the Critical Role of High Quality Datasets

arXiv:2606.19597v1 Announce Type: cross Abstract: Mean opinion scores (MOS) are widely used for speech quality assessment, yet scalar labels are sensitive to rater variability and listening test differences. This introduces labeling noise, which limits the reliability of MOS prediction. Preference prediction reduces this variability as listeners compare signals directly, producing cleaner labels. We study MOS-free preference prediction and propose PrefSQA, which incorporates uncertainty-aware logits, an impairment attention head, and a module based on non-matching-reference comparisons. We use and refine five datasets, including MOS-derived and low-noise simulated sets with matching and non-matching content, experiment with human preference sets, and test on unseen data. Experiments show small improvements on MOS-derived data, while other sets reveal clear improvement over the baselines, highlighting the value of high-quality preference data and demonstrating the effectiveness of the proposed method.

08.
arXiv (quant-ph) 2026-06-16

Dressed Floquet scars from protected zero modes in a Rydberg chain

arXiv:2606.15605v1 Announce Type: cross Abstract: In this Letter, we present an approximate analytic construction of two zero quasienergy quantum many-body scars in a periodically driven model of Rydberg atoms on a ring, which persist over a range of driving amplitudes and frequencies for finite sizes. An index theorem protects an exponentially large number (in system size) of exact zero energy modes of the Floquet Hamiltonian in this setting. Unlike most of these zero modes which continuously change with drive parameters, these two quantum many-body scars retain the memory of particular states. They can be expressed as {\it dressed versions} of two contrasting states, the Rydberg vacuum and a unitarily rotated variant of a volume-law scar [Ivanov and Motrunich, Phys. Rev. Lett. {\bf 134}, 050403 (2025)], respectively. We provide an analytic understanding of their existence using a Floquet perturbation theory and show their resilience beyond the perturbative regime using exact diagonalization in finite systems. Our study provides insight into the structure of protected zero modes in interacting Floquet settings.

09.
arXiv (CS.CV) 2026-06-18

A Multi-Domain Benchmark for Detecting AI-Generated Text-Rich Images from GPT-Image-2

Text-rich images often contain privacy-sensitive, transactional, or decision-relevant information. As recent multimodal image generation models become increasingly capable of synthesizing realistic textual content and structured visual designs, detecting AI-generated text-rich images has become an important challenge for digital trust and content authenticity. Existing benchmarks, however, largely focus on object-centric images and provide limited coverage of scenarios where textual semantics and layout organization are central. In this paper, we introduce a multi-domain benchmark for detecting text-rich images generated by OpenAI's GPT Image 2. The benchmark contains 8,602 images across six representative categories: commercial posters, infographics, academic posters, receipts, tables, and UI screenshots. Using this benchmark, we evaluate five representative AI-generated image detectors in a zero-shot setting and analyze their overall, category-wise, and post-processing robustness. Our results show that detector performance is highly domain-dependent: methods that perform well in some categories often fail on others, and even the strongest conventional detector exhibits severe sensitivity to JPEG compression. We further conduct an exploratory evaluation with a multimodal vision-language model, revealing both its promise and its limitations on structured formats. These findings highlight the need for text- and layout-aware detection methods for modern AI-generated images. Our dataset is released at XXX.

10.
arXiv (CS.AI) 2026-06-15

Silent Failures in Federated Personalization of Foundation Models

arXiv:2606.00947v2 Announce Type: replace-cross Abstract: Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale under growing regulatory requirements for post-market monitoring. We argue that this convergence creates a distinct and under-recognized class of trustworthiness failures, which we term "Silent Failures." These include amplified bias, fairness collapse, and alignment erosion that may remain difficult to detect because federated learning's privacy constraints limit visibility into model behavior. A landscape analysis of existing benchmarks reveals a structural divide. Federated benchmarks evaluate system performance but provide limited insight into model behavior, whereas centralized trustworthiness benchmarks assess behavior but require model access incompatible with federated privacy. We introduce a taxonomy of six silent failure modes arising from the interaction of foundation model personalization, dataset shift, and core federated constraints. Our analysis shows that privacy-preserving training alone is insufficient for trustworthy deployment. We conclude with a research agenda for privacy-preserving behavioral evaluation and propose that silent failures become a standard diagnostic category for trustworthy federated artificial intelligence.

11.
arXiv (CS.CL) 2026-06-16

Simplifying the Modeling of Arbitrary Conditionals in Natural Language

Causal Transformers model sequences through an autoregressive factorization of the joint distribution, which enables efficient left-to-right decoding and conditional likelihood computation. However, they cannot tractably sample from or evaluate arbitrary conditionals – e.g., a block of text conditioned on past and future tokens. Recent work aims to solve this problem through novel architectures, but they often lead to sub-optimal modeling of such conditionals and degraded generations. We propose Arbitrary Conditionals GPT (AC-GPT) which introduces a simple modification to standard causal Transformers to enable evaluating and sampling from arbitrary conditionals – including past, future, and mixed contexts – within a single forward pass. Unlike prior approaches, our method preserves the standard left-to-right ordering and next-token prediction objective essential for both strong performance and efficient training on natural language. Crucially, this compatibility allows existing LLMs to be fine-tuned for arbitrary conditioning. Our empirical results indicate that our method outperforms baselines on modeling arbitrary conditionals, without degrading standard left-to-right performance.

12.
arXiv (CS.CV) 2026-06-12

EvTexture++: Event-Driven Texture Enhancement for Video Super-Resolution

Event-based vision has drawn increasing attention owing to its distinctive properties, including ultra-high temporal resolution and extreme dynamic range. Recent works have introduced it to video super-resolution (VSR) to enhance flow estimation and temporal alignment. In contrast, this paper shifts the focus of event signals from motion refinement to texture enhancement in VSR. We propose EvTexture++, the first event-driven framework dedicated to texture enhancement in VSR. It leverages high-frequency spatiotemporal details from events to improve texture recovery. EvTexture++ incorporates a customized texture enhancement branch, along with an iterative texture enhancement module that progressively exploits high-temporal-resolution event information for texture restoration. This enables gradual refinement of texture regions across iterations, yielding more accurate and detailed high-resolution outputs. Besides intra-frame texture recovery, large motions could degrade inter-frame temporal consistency, particularly in texture regions, leading to texture flickering. To mitigate this, we further exploit the continuous-time motion cues of events to enhance temporal consistency, introducing a temporal texture alignment module that estimates event-guided texture-aware flow for precise inter-frame texture alignment. Moreover, EvTexture++ is designed as a plug-and-play tool to flexibly boost the performance of existing VSR models. Experiments on five datasets demonstrate that EvTexture++ achieves state-of-the-art performance. When integrated into recent VSR models, it yields significant improvements, with gains of up to 1.55 dB in PSNR on the texture-rich Vid4 dataset. Code: https://github.com/DachunKai/EvTexture.

13.
arXiv (math.PR) 2026-06-19

The t-Split Two-Periodic Aztec Diamond Model

Authors:

arXiv:2606.19507v1 Announce Type: new Abstract: In this work we consider an Aztec diamond model split into two unequal regions which are asymptotically fixed in size. Each region is weighted with a distinct two-periodic weighting. We refer to this model as the t-split two-periodic Aztec diamond, to signify its difference from the previous work title Split Two-Periodic Aztec Diamond, where the model was split into two equal regions. We derive an integral expression for the correlation kernel of the model and give a partial description of the scaling limit behavior, along with a conjecture for the remainder. We refer to the larger and smaller sides of the model as the dominant and non-dominant sides, and to the location of the weight change as the interface. The dominant side exhibits a limit shape that depends only on its own weighting and is identical to that of the two-periodic Aztec diamond, while the non-dominant side appears to have a novel limit shape that depends on both weightings and the location of the interface. Lastly, we consider the complete limit shape in the case where the dominant side two-periodic parameter goes to 0.

14.
arXiv (CS.CL) 2026-06-12

LLMs Can Better Capture Human Judgments–With the Right Prompts

Are large language models (LLMs) bad at capturing human judgment? Two commonly stated limitations are that LLMs fail to capture full distributions of responses, and that their judgments are unstable across wording variations. We demonstrate simple prompting strategies that mitigate these limitations. Across two datasets–a U.S.-representative set of 144 moral scenarios and 38 moral beliefs from the International Social Survey Programme's Family and Changing Gender Roles module covering 32 countries–we show how simple elicitation techniques help improve AI-human alignment. First, prompting models to report standard deviations and response proportions recovers the full range of human responses better than common strategies. Second, ensuring scenarios are clear to human participants–as reflected in human confusion ratings–boosts model alignment, and LLMs can track human confusion ratings. At the same time, we find that LLMs' estimates of their own error are poorly calibrated, though they can predict human variability relatively well. These results suggest that asking better questions to LLMs can yield better answers.

15.
arXiv (CS.LG) 2026-06-18

Decomposing Prediction Mechanisms for In-Context Recall

arXiv:2507.01414v2 Announce Type: replace Abstract: We introduce a new family of toy problems that combine features of linear-regression-style continuous in-context learning (ICL) with discrete associative recall. We pretrain transformer models on sample traces from this toy, specifically symbolically-labeled interleaved state observations from randomly drawn linear deterministic dynamical systems. We study if the transformer models can recall the state of a sequence previously seen in its context when prompted to do so with the corresponding in-context label. Taking a closer look at this task, it becomes clear that the model must perform two functions: (1) identify which system's state should be recalled and apply that system to its last seen state, and (2) continuing to apply the correct system to predict the subsequent states. Training dynamics reveal that the first capability emerges well into a model's training. Surprisingly, the second capability, of continuing the prediction of a resumed sequence, develops much earlier. Via out-of-distribution experiments, and a mechanistic analysis on model weights via edge pruning, we find that next-token prediction for this toy problem involves at least two separate mechanisms. One mechanism uses the discrete symbolic labels to do the associative recall required to predict the start of a resumption of a previously seen sequence. The second mechanism, which is largely agnostic to the discrete symbolic labels, performs a "Bayesian-style" prediction based on the previous token and the context. These two mechanisms have different learning dynamics. To confirm that this multi-mechanism (manifesting as separate phase transitions) phenomenon is not just an artifact of our toy setting, we used OLMo training checkpoints on an ICL translation task to see a similar phenomenon: a decisive gap in the emergence of first-task-token performance vs second-task-token performance.

16.
arXiv (CS.CV) 2026-06-18

DART: A design-aware microfluidic chip paradigm for real-time live-cell image analysis

High-throughput microfluidic live-cell imaging generates rich single-cell data. Yet semi-automated procedures for locating regions of interest (RoIs), each containing one cell population, and removing surrounding microfluidic structures from recorded images, scale with the number of RoIs. This prevents real-time image analysis and delays time-to-insight by hours to days. We introduce the Design-Aware and Real-Time capable (DART) paradigm for microfluidic cultivation chips, which aligns the CAD blueprint with the physical chip and thereby enables throughput-independent localization of all RoIs and fully automated image processing across diverse RoI geometries and chip layouts. DART establishes this alignment through embedded fiducial markers and deep-learning-based marker detection. We validate DART using the Swiss Army Knife chip, which combines eight structurally distinct RoI designs across 1164 RoI locations. DART localizes all RoIs in five minutes, removes microfluidic structures from raw microscopy images in 40 ms, and performs fully automated image analysis, including cell segmentation, in under 1.1 s per image. Together, these capabilities establish DART as an end-to-end hardware-software paradigm with real-time-capable analysis that paves the way toward closed-loop and outcome-driven smart microscopy.

17.
arXiv (CS.LG) 2026-06-16

Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

arXiv:2606.15219v1 Announce Type: new Abstract: In this work, we tackle the following question: Can neural networks trained with gradient-based methods achieve the optimal computational-statistical tradeoff in learning Gaussian single-index models? Prior research has shown that any polynomial-time algorithm under the statistical query (SQ) framework requires $\Omega(d^{s^\star/2}\lor d)$ samples, where $s^\star$ is the generative exponent representing the intrinsic difficulty of learning the underlying model. However, it remains unknown whether neural networks can achieve this sample complexity. Inspired by prior techniques such as label transformation and landscape smoothing for learning single-index models, we propose a unified gradient-based algorithm for training a two-layer neural network in polynomial time. Our method is adaptable to a variety of loss and activation functions, covering a broad class of existing approaches. We show that our algorithm learns a feature representation that strongly aligns with the unknown signal $\theta^\star$, with sample complexity $\widetilde{O} (d^{s^\star/2} \lor d)$, matching the SQ lower bound up to a polylogarithmic factor for all generative exponents $s^\star\geq 1$. Furthermore, we extend our approach to the setting where $\theta^\star$ is $k$-sparse for $k = o(\sqrt{d})$ by introducing a novel weight perturbation technique that leverages the sparsity structure. We derive a corresponding SQ lower bound of order $\widetilde{\Omega}(k^{s^\star})$, matched by our method up to a polylogarithmic factor. Our framework, especially the weight perturbation technique, is of independent interest, and suggests potential gradient-based solutions to other problems such as sparse tensor PCA.

18.
arXiv (CS.CV) 2026-06-11

Spatially Selective Self-Training for Unsupervised Building Change Detection

Unsupervised building change detection aims to learn building-change masks from unlabeled bi-temporal remote sensing images. Existing label-free methods often follow a discrepancy-to-mask paradigm, directly using temporal differences, frozen foundation-model responses, prompt-based outputs, or post-processing results as final change maps. Although these strategies provide annotation-free cues, they do not learn a task-specific building-change detector and remain vulnerable to the gap between generic temporal discrepancies and building-defined structural changes. In practice, such discrepancies are often noisy and task-irrelevant, as appearance shifts, registration errors, and non-building modifications can produce strong but misleading responses. To address this problem, we propose SST-CD, a spatially selective self-training framework that reformulates fully label-free building change detection as end-to-end detector learning under noisy pseudo supervision. SST-CD uses temporal discrepancies as candidate pseudo labels and trains the detector only on spatially reliable pixels, whose reliability is estimated by a local consistency criterion that filters inconsistent regions from supervision. To further stabilize noisy self-training, a lightweight feature adapter recalibrates bi-temporal features, while a prototype-based decoder produces compact change and no-change representations. Experiments on LEVIR-CD, WHU-CD, and DSIFN-CD show that SST-CD achieves F1 scores of 83.08%, 91.69%, and 86.60%, respectively, outperforming existing unsupervised and label-free baselines.

19.
medRxiv (Medicine) 2026-06-12

Microbial etiology, antibiotic susceptibility profiles, and multidrug resistance of urinary tract infections at a secondary healthcare facility in Ghana

Background: Rising antibiotic resistance challenges empirical therapies for urinary tract infections (UTIs). This study evaluated the microbial etiology, susceptibility profiles, and multidrug resistance (MDR) patterns of uropathogens among outpatients at the Berekum Holy Family Hospital, Ghana. Methods: This cross-sectional study (February to August 2021) screened 263 symptomatic outpatients. Mid-stream urine samples underwent quantitative culture, biochemical identification, and antimicrobial susceptibility testing via the Kirby-Bauer disc diffusion method following the 2021 CLSI guidelines. Results: Significant bacteriuria prevalence was 22.8% (60/263). UTIs predominated in females (78.3%, 47/60; p = 0.1501) and individuals [≥]45 years (33.3%, 20/60). Gram-negative rods accounted for 90.0% of isolates, primarily Escherichia coli (26.7%), Citrobacter spp. (25.0%), and Enterobacter spp. (21.7%); Staphylococcus aureus (10.0%) was the only Gram-positive pathogen. Extreme phenotypic resistance was observed against piperacillin/tazobactam (98.3%), cefotaxime (93.3%), tetracycline (88.3%), and cefoperazone (85.0%). Conversely, highest therapeutic susceptibilities were retained by amikacin (78.3%), levofloxacin (61.7%), and gentamicin (58.3%). Conclusion: The high prevalence of MDR uropathogens against advanced beta-lactamase inhibitor combinations and cephalosporins necessitates an immediate re-evaluation of regional empirical protocols. Amikacin, levofloxacin, and gentamicin remain viable options prior to culture confirmation. These findings establish a crucial phenotypic baseline to guide localized prescribing policies and regional antimicrobial resistance tracking strategies.

20.
medRxiv (Medicine) 2026-06-15

Long-read sequencing enables high-accuracy mitochondrial heteroplasmy detection in Parkinson's disease

Background: Low-frequency heteroplasmic mitochondrial DNA (mtDNA) variants are associated with aging and neurological diseases, including Parkinson's disease (PD). Targeted deep mtDNA sequencing using PacBio HiFi long reads has the potential to resolve heteroplasmy across the full mitochondrial genome with high accuracy. Methods: To validate Vega PacBio sequencing for detecting mtDNA heteroplasmy, we analyzed four predefined mixtures of two mtDNA haplotypes. We generated a single long-range PCR amplicon covering the entire mitochondrial genome. These amplicons were mixed at predefined ratios (minor mixture haplotype component: 5%, 2%, 1%, and 0.1%). Variant calling was performed using Mutserve2, and accuracy was assessed by calculating the F1 score from comparisons between expected and detected variants. Full-length mtDNA PacBio sequencing was applied to investigate heteroplasmy across fibroblast passages derived from five LRRK2 p.Gly2019Ser variant carriers (n=3 affected with PD and n=2 unaffected carriers). Changes in mtDNA heteroplasmy level and variant load were assessed longitudinally using a linear mixed model. Results: The single-amplicon approach enabled full-length haplotype resolution without amplification bias associated with overlapping PCR strategies. The F1 score of the predefined mixtures was 1.0 for heteroplasmy levels between 5% and 1% and remained high (0.91) at 0.1%. We detected n=10/62 variants discordant with the Illumina reference at the 0.1% mixture, but sensitivity remained very high at 1.00 in that mixture. Detected minor variants closely matched expected heteroplasmy levels, with average variant levels of 0.057 (5%), 0.022 (2%), 0.011 (1%), and 0.001 (0.1%). Across twelve fibroblast passages, we observed fewer mtDNA heteroplasmic variants ({beta}=-3.2, p=0.026). Increased heteroplasmic variant load over time was also associated with older age ({beta}=1.50, p=0.001) and PD affection status ({beta}=5.0, p=1.0 x 10-4) in LRRK2 variant carriers. Notably, we observed distinct patterns of heteroplasmic variants that either increased or decreased in heteroplasmy level across passages. Conclusion: PacBio HiFi sequencing, combined with a single-amplicon strategy, enables accurate full-length mtDNA heteroplasmy detection and longitudinal analysis, providing a valuable tool for studying mitochondrial variation and dynamics in disease.

21.
arXiv (CS.AI) 2026-06-16

Beyond Models: Reflections on Engineering AI-enabled Systems in a Project-Based Course

arXiv:2606.16842v1 Announce Type: cross Abstract: Teaching Software Engineering for AI-enabled systems entails addressing the integration of AI components within full-scale software architectures under realistic constraints. While machine learning courses emphasize model development, students often lack experience in architectural design, deployment, and monitoring of AI-enabled systems. Empirical evaluations of such system-oriented AI courses remain limited. This paper reflects on the design and implementation of a project-based master's-level course titled AI Algorithms: Theory and Engineering, at the University of Bremen, in which students developed a movie recommendation system while making architectural design decisions to address challenges related to scalability, deployment, and evolving requirements. We conducted a mixed-methods study combining analyses of student submissions and questionnaire responses to investigate integration challenges, learning outcomes, and opportunities for improvement. Our results indicate persistent difficulties in early architectural decisions, heterogeneous ML integration, evolving requirements, and data management, largely due to uneven ML and software engineering expertise. From the educator's perspective, the course fostered system-level reasoning and strengthened awareness of data-centric ML practices in AI-enabled systems.

22.
arXiv (CS.LG) 2026-06-15

Time Series Causal Discovery via Context-Conditioned and Causality-Augmented Pretraining

arXiv:2605.26759v2 Announce Type: replace Abstract: Causal discovery from time series is critical for many real-world applications, such as tracing the root causes of anomalies. Existing approaches typically rely on dataset-specific optimization, making it difficult to transfer their causal discovery capabilities to new time series governed by diverse causal mechanisms. In this paper, we propose PTCD, a novel Pretraining framework for Time-series Causal Discovery, which improves cross-task generalization through context-conditioned modeling and transferable causal augmentation. To model complex temporal causal dependencies, PTCD employs a dual-scale iterative attention mechanism to capture window-level causal relationships, and a Gaussian mixture with a context-level routing mechanism to handle heterogeneous exogenous distributions. To further address distribution shifts across causal graphs, PTCD adopts a pretraining paradigm on synthetic datasets that integrates intervention-based learning and a causal mixup strategy, promoting stable causal discovery and stronger generalization. Extensive experiments on multiple real-world out-of-distribution (OOD) datasets demonstrate that PTCD excels in both causal discovery and root cause identification.

23.
medRxiv (Medicine) 2026-06-15

Iron deficiency testing among people with incident heart failure in primary care

Background: Given around 50% of people with heart failure have a degree of iron deficiency, guidelines recommend screening. It is uncertain to what extent this is done in primary care and whether testing is equitable. Aim: To report the proportion of people with incident heart failure who undergo a ferritin test within 12 months. Design and setting: Retrospective primary care cohort study using Clinical Practice Research Datalink Aurum data, between 2016 and 2021. Methods: We report the proportion of adults with an incident diagnosis of heart failure who received a ferritin test within 12 months. Multivariable logistic regression was used to examine the odds of testing based on key demographic covariates and co-morbidities. Results: Among 105,749 individuals with an incident diagnosis of heart failure (mean age 71.6 years, SD 14.3), only 35,688 (33.7%) received a ferritin test within the subsequent year. Increasing age (odds ratio 1.25 per 10-year increase, 95% CI: 1.24-1.27), female sex (male sex OR 0.86, 0.84-0.89) and Asian ethnicity (OR 1.70, 1.59-1.80) were all associated with increased odds of testing as were diagnoses of coeliac disease (OR 1.86, 1.58-2.21), type 1 diabetes (OR 1.82, 1.51-2.19) and cirrhosis (OR 1.64, 1.43-1.87). There was geographic variation in testing, even in adjusted analyses. Conclusion: In a large primary care dataset, two thirds of people with incident heart failure did not receive a ferritin test for iron deficiency within a year of diagnosis demonstrating a gap in current practice and an opportunity for improvements in service delivery.

24.
arXiv (CS.AI) 2026-06-11

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

arXiv:2606.11042v2 Announce Type: replace Abstract: Recent years have witnessed the rapid evolution of AI agents toward handling increasingly complex, real-world tasks. However, existing benchmarks rarely evaluate whether agents can operate graphical user interfaces to complete long-horizon, high-value professional workflows across diverse domains. Current GUI benchmarks still predominantly focus on general-purpose software, relatively simple applications, and short-horizon tasks, leaving it largely unknown whether modern agents can follow user instructions to autonomously operate domain-specific professional software and accomplish economically valuable work in an end-to-end manner. To bridge this gap, we introduce Workflow-GYM, a benchmark for long-horizon GUI tasks centered on professional domains and specialized software environments. Through extensive experiments on state-of-the-art models, we find that even the strongest models achieve only slightly above 30% success rates, highlighting that professional long-horizon GUI workflows remain highly challenging for current GUI agents. Further analysis reveals that current agents struggle to maintain long-horizon workflow consistency, frequently exhibiting workflow stage omission, error propagation, objective drift, and insufficient understanding of professional software environments. Our findings provide important insights into the limitations of current agent systems and suggest key directions for the next generation of GUI-agent research.

25.
arXiv (quant-ph) 2026-06-11

Quantum thermodynamics, quantum correlations and quantum coherence in accelerating Unruh-DeWitt detectors in both steady and dynamical state

arXiv:2512.18123v2 Announce Type: replace Abstract: We investigate the interplay between quantum thermodynamics, quantum correlations, and quantum coherence within the framework of the Unruh-DeWitt (UdW) detector model. By analyzing both the steady and dynamical states of various quantum resources (including steerability, entanglement, quantum discord, and coherence), we study how these resources evolve under Markovian and non-Markovian environments. Furthermore, we investigate the impact of both the Unruh temperature and the energy levels on three key quantum phenomena: thermodynamic evolution, quantum correlations, and quantum coherence, considering different initial state preparations. The hierarchical structure relating quantum correlations and quantum coherence is determined. We further examine the thermodynamic performance of a quantum heat engine, highlighting the influence of memory effects and classical correlations on heat exchange, work extraction, and efficiency. Our results reveal that non-Markovian dynamics can enhance the preservation of quantum correlations and improve the engine's efficiency compared to purely Markovian regime. These findings provide insights into the role of quantum correlations and quantum coherence in quantum thermodynamic processes and open avenues for optimizing quantum devices operating in relativistic or open-system settings.