Paper Plaza - AcademicHub

01.

medRxiv (Medicine) 2026-06-22 DOI: HASH:0c945555e86570bf0eb8b4e82f1a0a96

Repeat expansions in Parkinson's disease and parkinsonism across ancestries: insights from a global genetic cohort

Authors:

Lange, L. M.; Cerquera-Cleves; Tan, A.-H.; Lim, S.-Y.; Okubadejo, N. U.; Lin, C.-H.; Chen; …

Expanded short tandem repeats contribute to a broad spectrum of neurodegenerative diseases, yet their roles in Parkinson's disease (PD) and parkinsonism remain incompletely characterized, especially across diverse ancestries. We analyzed short-read whole-genome sequencing (WGS) and clinical exome sequencing (CES) data from 38,365 individuals (28,861 WGS; 9,504 CES), encompassing 23,242 patients with PD, 4,729 patients with atypical parkinsonism and 10,394 healthy controls from 11 genetic ancestries. To determine carrier frequencies and characterize repeat structures across diverse ancestries, we genotyped 12 established pathogenic loci where normal, intermediate, and pathogenic alleles can be reliably differentiated using short-read sequencing data. Additionally, we conducted threshold-based associations to determine the minimum threshold associated with increased PD risk in 15,995 individuals (8,591 PD, 7,404 controls) of European ancestry. Pathogenic repeat expansions were detected in 62 patients (56 PD and 6 atypical parkinsonism) and 5 controls across seven loci (AR, ATXN1, ATXN2, ATXN3, CACNA1A, HTT and THAP11), spanning seven ancestries. Among these, ATXN2 expansions were the most frequently observed in PD and were present in African, East Asian, European and Middle Eastern ancestries. Additionally, intermediate ATXN2 repeat expansions exhibited a strong, length-dependent association with PD risk in the European population, with individuals with ≥32 repeats having a more than four-fold increased risk (odds ratio 4.25, 95% confidence interval 1.80-12.05). Overall, >92% of expanded alleles harbor CAA interruptions within the CAG tract. Pathogenic expansions at other loci, such as ATXN3 and THAP11, showed more ancestry-specific distributions. Clinically, individuals with pathogenic ATXN2 and ATXN3 expansions most often presented with typical PD features but frequently showed earlier disease onset and a strong family history of PD.
This large-scale, multi-ancestry study comprehensively maps the genetic landscape of pathogenic and intermediate repeat expansions in PD. Our findings confirm a length- and structure-dependent risk association for ATXN2 with PD in the European population, and highlight the pleiotropic effects of repeat expansions across the parkinsonian spectrum.


02.

arXiv (CS.LG) 2026-06-16 DOI: arXiv:2606.17014

Filtered Conformal Ellipsoids for Graph-Native Time Series

Authors:

Yannick Limmer

Joint prediction sets for multivariate time series should control a single event while adapting to cross-coordinate dependence. We study filtered conformal ellipsoids: a frozen state-space filter emits a one-step predictive mean and covariance, and split-conformal calibration is applied to the resulting Mahalanobis scores. The filter is used to choose the ellipsoid shape; conformal calibration chooses the scalar radius, so the construction benefits from a learned predictive covariance without relying on Gaussian tail probabilities for coverage. The main difficulty is that filtered scores are dependent and learned recurrent filters need not contract in their raw hidden state; we therefore analyse contraction in an observable predictive-law quotient that identifies hidden states producing the same future sequence of emitted Gaussian laws. Under a stable Bayes Gaussian-projection filter, covariance bounds, and a finite-horizon observability Fisher condition, small excess Gaussian negative log-likelihood implies contraction of the learned emitted laws. Combined with a threshold-autocovariance envelope this yields a Chebyshev-type approximate coverage bound for filtered split-conformal prediction under dependence; a sharper Bernstein-type bound requires an additional geometric-mixing concentration assumption. Under Gaussian oracle realisability we also obtain a near-oracle log-volume comparison within the class of conditionally valid Gaussian ellipsoid rules. We instantiate the framework with a GCN-GRU filter with diagonal-plus-low-rank covariance. On moderate-size graph-native traffic benchmarks (METRLA-$20$ and PEMSBAY-$50$), the learned filter gives sharper at-target ellipsoids than static-covariance and non-filter baselines; at full-graph scale and on non-graph-native datasets, factor and copula baselines can be stronger.
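The split between "shape from the filter" and "radius from calibration" can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the predictive means and covariances are assumed to be supplied by some already-trained filter, and the quantile rule is the standard split-conformal one for exchangeable scores rather than the dependence-aware analysis the abstract describes.

```python
import numpy as np

def mahalanobis_scores(y, mean, cov):
    """Squared Mahalanobis distance of each y[i] from mean[i] under cov[i].

    y, mean: arrays of shape (n, d); cov: array of shape (n, d, d).
    """
    diff = y - mean
    # Solve cov[i] @ x = diff[i] for every i instead of inverting cov[i].
    solved = np.linalg.solve(cov, diff[..., None])[..., 0]
    return np.einsum("nd,nd->n", diff, solved)

def conformal_radius(cal_scores, alpha):
    """Standard split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: only the trivial set is valid.
        return np.inf
    return np.sort(cal_scores)[k - 1]

def in_ellipsoid(y, mean, cov, radius):
    """True where y[i] lies in {z : (z - mean[i])^T cov[i]^{-1} (z - mean[i]) <= radius}."""
    return mahalanobis_scores(y, mean, cov) <= radius
```

In use, one would compute `mahalanobis_scores` on a held-out calibration window, pass them to `conformal_radius` with the target miscoverage `alpha`, and then test new observations with `in_ellipsoid`. The covariance only fixes the ellipsoid's orientation and relative axis lengths; the scalar radius comes from the data, not from a chi-squared quantile.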


03.

arXiv (CS.CL) 2026-06-18 DOI: arXiv:2505.23851

ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark

Authors:

Michael Shalyt, Rotem Elimelech, Ido Kaminer

Large language models (LLMs) are increasingly applied to symbolic mathematics, yet existing evaluations often conflate pattern memorization with genuine reasoning. To address this gap, we present ASyMOB, a high-resolution dataset of 35,368 validated symbolic math problems spanning integration, limits, differential equations, series, and hypergeometrics. Unlike prior benchmarks, ASyMOB systematically perturbs each seed problem using symbolic, numeric, and equivalence-preserving transformations, enabling a fine-grained assessment of generalization. Our evaluation reveals three key findings: (1) most models' performance collapses under minor perturbations, while top systems exhibit an apparent regime shift in robustness; (2) integrated code tools stabilize performance, particularly for weaker models; and (3) we identify examples where Computer Algebra Systems (CAS) fail while LLMs succeed, as well as problems solved only via a hybrid LLM-CAS approach, highlighting a promising integration frontier. ASyMOB serves as a principled diagnostic tool for measuring and accelerating progress toward building verifiable, trustworthy AI for scientific discovery.


04.

arXiv (CS.CV) 2026-06-18 DOI: arXiv:2606.18582

Technical Report for ICRA 2026 GOOSE 2D Fine-Grained Semantic Segmentation Challenge: Leveraging DINOv3 for Robust Outdoor Scene Understanding in Field Robotics

Authors:

Jaeil Park, Hyobin Choi, Sangjin Lee, Hyungtae Lim, Sung-Hoon Yoon

The GOOSE 2D Fine-Grained Semantic Segmentation Challenge at the ICRA 2026 Workshop on Field Robotics evaluates dense semantic segmentation of off-road imagery over a fine-grained taxonomy of 64 classes and 11 evaluated non-void coarse categories. We present the first-place solution to this challenge. Our solution comprises two complementary improvements: (a) a network-level design that combines a self-supervised DINOv3 ViT-L/16 backbone, a ViT-Adapter, and a Mask2Former mask-classification decoder, together with a coarse-category auxiliary loss on the global [CLS] token; and (b) an inference-time aggregation strategy based on multi-scale and horizontal-flip test-time augmentation and an ensemble of the top three checkpoints selected using Codabench scores. Our method achieves an official composite score of 76.57%, consisting of 69.32% fine-class mIoU and 83.81% category-level mIoU, and ranks first on the final phase leaderboard: www.codabench.org/competitions/14257/#/results-tab.


05.

arXiv (CS.LG) 2026-06-12 DOI: arXiv:2606.13252

To GAN or Not To GAN: Segmentation Analysis on Mars DEM

Authors:

Douglas Dziedzorm Agbeve, Aditya V. Handrale, Salim Fares, Seif E. Idani

To better understand the Martian surface, which is needed to enable rovers to navigate Mars with ease, it is necessary to be able to determine the location of mounds. Detecting and studying these morphologies can also help us find evidence of extraterrestrial life, in this case, more specifically, water or signs of life-conducive environments. Previously, detection of mounds was done by manually mapping morphological parameters onto Digital Elevation Models. This paper solves the problem by automatically detecting and/or predicting mounds on Mars using neural-network-based semantic segmentation methodologies. This is done by using a supervised semantic segmentation model and a generative adversarial approach. A comparison of the approaches shows that adding extra artificially generated data did not improve the result.


06.

arXiv (quant-ph) 2026-06-16 DOI: arXiv:2605.25398

Boson Sampling as a Probe of Chaotic and Integrable Quantum Dynamics in a Photonic Chip

Authors:

Yuancheng Zhan, Khen Cohen, Norman T. W. Koo, Kian Hwee Lim, Hui Zhang, Lingxiao Wan, Sanghoon Chae, Ai Qun Liu, Victor M Bastidas, Yaron Oz, Leong-Chuan Kwek

Quantum chaos plays a key role in understanding complex quantum dynamics, while integrated photonics offers unique advantages for quantum applications, including high-speed operation, scalability, and programmable unitary transformations. However, integrated photonic approaches to probing quantum chaos remain largely unexplored, owing to the absence of a clear connection between programmable photonic dynamics and established chaos diagnostics. In this work, we establish Fock-state boson sampling as a practical probe of quantum chaos by exploiting the sensitivity of multiphoton interference to the random-matrix properties of underlying single-particle unitary dynamics. More importantly, we design and fabricate a programmable quantum photonic chip to experimentally implement this framework, achieving the first integrated-photonic demonstration of quantum-chaos probes based on boson sampling. Experimental results show that the three complementary probes proposed in this work, namely the distance to Porter–Thomas statistics, Shannon entropy, and Out-of-Time-Ordered-Correlator-equivalent observables, exhibit close agreement with theoretical predictions and consistently distinguish chaotic and integrable dynamics. Our work provides a scalable route for investigating complex quantum dynamics on programmable photonic platforms while leveraging the intrinsic advantages of boson sampling through multiphoton interference and complex output statistics.


07.

arXiv (CS.LG) 2026-06-16 DOI: arXiv:2606.15682

ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training

Authors:

Janghwan Lee, Sihwa Lee, Jinseok Kim, Yongjik Kim, Jieun Lim, Jinwook Oh, Jungwook Choi

Large Reasoning Models (LRMs) achieve strong problem-solving through long chain-of-thought, but their deployment is constrained by the high cost of full-precision inference and growing KV cache footprints. Microscaled FP4 formats enable efficient FP4 deployment; however, fully quantizing weights, activations, and KV caches (W4A4KV4) causes severe reasoning degradation that existing PTQ and QAT fail to recover. We identify that FP4 failures concentrate on low-entropy tokens (precise symbolic commitments such as digits and operators), where quantization noise inflates sampling errors that cascade through reasoning traces. Based on this insight, we propose ReQAT, a reasoning-centric FP4 training framework with three components: (i) Trace-Aligned QAT (TAQ), which revisits identical reasoning traces to focus updates on critical low-entropy decisions; (ii) Selective Entropy Minimization (SEM), which reinforces confidence at low-entropy positions; and (iii) Q-FIT, a quantization-friendly initialization that jointly calibrates RoPE-consistent KV cache transformations to stabilize QAT. Under the same training budget, ReQAT not only recovers but surpasses BF16 fine-tuning accuracy, while delivering up to 3.9x throughput speedup on NVIDIA DGX Spark and 3.1x on B200.


08.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2606.16368

Evaluating LLM Personalization via Semantic Constraint Verification

Authors:

Xuran Li, Guanqin Zhang, Imran Razzak, Hakim Hacid, Eleanna Kafeza, Hao Xue, Flora D. Salim

Current evaluation paradigms for Large Language Model (LLM) personalization rely heavily on brittle surface-matching metrics or computationally expensive LLM-as-a-judge protocols, both of which lack interpretability. To address these limitations, we introduce Natural Language Inference Constraint Verification (NLICV), a scalable, semantically invariant framework that maps sentence meanings to truth-condition sets to verify personalization constraints via a Natural Language Inference (NLI) model. Moving beyond binary scoring, NLICV categorizes LLM behaviors into four distinct modes: personalization, generalization, sycophancy, and failure. Extensive experiments demonstrate that NLICV aligns closely with human annotations while drastically reducing the latency and token costs associated with LLM judges (up to 2100× inference speedup). Finally, through an ablation-based procedure, NLICV pinpoints the exact sentences driving the constraint verification, yielding faithful, understandable evidence for its evaluations.


09.

arXiv (CS.AI) 2026-06-17 DOI: arXiv:2603.22372

Rethinking Multimodal Fusion for Time Series: Text Modalities Need Constrained Fusion

Authors:

Seunghan Lee, Jun Seo, Jaehoon Lee, Sungdong Yoo, Minjae Kim, Tae Yoon Lim, Dongwan Kang, Hwanil Choi, SoonYoung Lee, Wonbin Ahn

Recent advances in multimodal learning have motivated the integration of auxiliary modalities such as text or vision into time series (TS) forecasting. However, most existing methods provide limited gains, often improving performance only in specific datasets or relying on architecture-specific designs that limit generalization. In this paper, we show that multimodal models with naive fusion strategies (e.g., simple addition or concatenation) often underperform unimodal TS models, which we attribute to the uncontrolled integration of auxiliary modalities, which may introduce irrelevant information. Motivated by this observation, we explore various constrained fusion methods designed to control such integration and find that they consistently outperform naive fusion methods. Furthermore, we propose Controlled Fusion Adapter (CFA), a simple plug-in method that enables controlled cross-modal interactions without modifying the TS backbone, integrating only relevant textual information aligned with TS dynamics. CFA employs low-rank adapters to filter irrelevant textual information before fusing it into temporal representations. We conduct over 20K experiments across various datasets and TS/text models, demonstrating the effectiveness of the constrained fusion methods. Code is available at: https://github.com/seunghan96/cfa.


10.

arXiv (quant-ph) 2026-06-19 DOI: arXiv:2606.20380

Discrimination of genuinely nonlocal sets without entanglement in multipartite systems

Authors:

Ziying Hou, Huaqi Zhou, Limin Gao

Genuine nonlocality arises when a set of multipartite orthogonal states is locally indistinguishable under any bipartition of the subsystems. The entanglement-assisted discrimination of such genuinely nonlocal orthogonal product sets has attracted significant attention in quantum information. Based on the criterion of local irreducibility, genuine nonlocality is classified into Type I (reducible) and Type II (irreducible). We present entanglement-assisted discrimination schemes for both types of genuinely nonlocal sets that use minimal resources. For low-dimensional cases, Type I sets require only a single EPR pair, whereas Type II sets necessitate only one GHZ state. We extend these protocols to higher-dimensional systems: the discrimination of Type I sets requires only one maximally entangled state in a two-qutrit system, while that of Type II sets similarly demands a single maximally entangled state in a three-qutrit system. For $n$-partite ($n > 3$) systems, Type I sets continue to require only one maximally entangled state, whereas Type II sets necessitate just one additional EPR pair compared to their Type I counterparts. These results provide a robust framework for the efficient discrimination of genuinely nonlocal sets using minimal quantum resources.


11.

arXiv (math.PR) 2026-06-24 DOI: arXiv:2508.12103

Sub-Poisson distributions: Concentration inequalities, optimal variance proxies, and closure properties

Authors:

Lasse Leskelä, Ian Välimaa

We introduce a nonasymptotic framework for sub-Poisson distributions with moment generating function dominated by that of a Poisson distribution. At its core is a new notion of optimal sub-Poisson variance proxy, analogous to the variance parameter in the sub-Gaussian setting. This framework allows us to derive a Bennett-type concentration inequality without boundedness assumptions and to show that the sub-Poisson property is closed under key operations including independent sums and convex combinations, but not under all linear operations such as scalar multiplication. We derive bounds relating the sub-Poisson variance proxy to sub-Gaussian and sub-exponential Orlicz norms. Taken together, these results unify the treatment of Bernoulli and Poisson random variables and their signed versions in their natural tail regime.
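To make the link between Poisson-dominated moment generating functions and a Bennett-type bound concrete, the following is a sketch of the standard Chernoff argument. The centred, upper-tail form of the definition and the symbol $\sigma^2$ for the variance proxy are assumptions made here for illustration; the paper's exact definition and constants may differ.

```latex
% Assumed definition (upper tail): X is sub-Poisson with variance proxy \sigma^2 if
\mathbb{E}\, e^{\lambda (X - \mathbb{E}X)}
  \le \exp\!\big( \sigma^2 ( e^{\lambda} - 1 - \lambda ) \big)
  \qquad \text{for all } \lambda \ge 0 .

% Markov's inequality applied to e^{\lambda (X - \mathbb{E}X)} gives, for t \ge 0,
\mathbb{P}( X - \mathbb{E}X \ge t )
  \le \inf_{\lambda \ge 0} \exp\!\big( \sigma^2 ( e^{\lambda} - 1 - \lambda ) - \lambda t \big) .

% The exponent is minimised at \lambda = \log( 1 + t / \sigma^2 ), which yields
\mathbb{P}( X - \mathbb{E}X \ge t )
  \le \exp\!\Big( - \sigma^2 \, h\big( t / \sigma^2 \big) \Big),
  \qquad h(u) = (1 + u) \log(1 + u) - u .
```

The right-hand side of the first line is the moment generating function of a centred Poisson variable with mean $\sigma^2$, which is why no boundedness assumption on $X$ is needed in this argument.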


12.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2605.01101

Virtual Speech Therapist: A Clinician-in-the-Loop AI Speech Therapy Agent for Personalized and Supervised Therapy

Authors:

Shakeel Sheikh, Patrick Marmaroli, MD Sahidullah, Slim Ouni, Fabrice Hirsch, Goncalo Leal, Bjorn W Schuller

This paper develops Virtual Speech Therapist (VST), an intelligent agent-based platform that streamlines stuttering assessment and delivers customized therapy planning through automated and adaptive AI-driven workflows. VST integrates state-of-the-art deep learning-based stuttering classification and multi-agent large language model (LLM) reasoning to support evidence-based clinical decision-making. VST begins with the acquisition and feature extraction of patient speech samples, followed by robust classification of stuttering types. Building on these outputs, VST initiates an agentic reasoning process in which specialized LLM agents autonomously generate, critique, and iteratively refine individualized therapy plans. A dedicated critic agent evaluates all generated therapy plans to ensure clinical safety, methodological soundness, and alignment with peer-reviewed evidence and established professional guidelines. The resulting output is a comprehensive, patient-specific therapy draft intended for clinician review. Incorporating clinician feedback, the system then produces a finalized therapy plan suitable for patient delivery, thereby maintaining a clinician-in-the-loop paradigm. Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations. These findings demonstrate the system's potential to augment clinical workflows, reduce clinician burden, and improve therapeutic outcomes for individuals with speech impairments. An interactive user interface for the proposed system is available online at https://vocametrix.com/ai/stuttering-therapy-planning-agent, facilitating real-time stuttering assessment and personalized therapy planning.


13.

arXiv (CS.AI) 2026-06-19 DOI: arXiv:2606.19887

FinRED: An Expert-Guided Benchmark Generation and Evaluation Framework for Financial LLM Red-Teaming

Authors:

Chaeyun Kim, Daeyoung Park, Junghwan Kim, Jinyoung Jeong, Eunji Song, Yongtaek Lim, Minwoo Kim

Existing safety benchmarks target general adversarial scenarios but miss finance-specific risks. Financial LLMs face regulatory compliance violations, fraud facilitation, and systemic trust erosion that require targeted evaluation. We introduce FinRED, an expert-guided red-teaming framework for financial LLM safety evaluation developed with financial experts. FinRED uses a novel two-level taxonomy mapping global standards (e.g., FATF and EU DORA) to threats ranging from regulatory evasion to complex fraud, integrated with a scalable pipeline that converts real financial documents into context-rich red-teaming Behavioral Prompts (seeds) through an expert-defined schema. Rigorous expert validation confirms seed plausibility and realism for meaningful LLM safety evaluation. We also provide an expert-validated, finance-specific rubric that goes beyond disclaimer checks, aligns more closely with human experts than static one-size-fits-all rubrics, and reduces critical false negatives from 28 to 12. Aligned with internationally adopted risk-management and information-security standards (e.g., ISO/IEC 27001), FinRED is deployed in South Korea's Financial Security Institute (FSI) regulatory sandbox for generative AI security evaluation in real financial services. To mitigate dual-use risks, the dataset, generation pipeline, prompt template, and evaluation framework are gated for qualified researchers at https://github.com/selectstar-ai/FinRED-paper and https://huggingface.co/datasets/datumo/FinRED.


14.

arXiv (math.PR) 2026-06-24 DOI: arXiv:2510.13648

Near-critical Ornstein–Zernike theory for the planar random-cluster model

Authors:

Lucas D'Alimonte, Ioan Manolescu

We develop an Ornstein–Zernike theory for the two-dimensional random-cluster model with $1 \leq q


15.

arXiv (CS.AI) 2026-06-12 DOI: arXiv:2605.03460

FinSTaR: Towards Financial Reasoning with Time Series Reasoning Models

Authors:

Seunghan Lee, Jun Seo, Jaehoon Lee, Sungdong Yoo, Minjae Kim, Tae Yoon Lim, Dongwan Kang, Hwanil Choi, Soonyoung Lee, Wonbin Ahn

Time series (TS) reasoning models (TSRMs) have shown promising capabilities in general domains, yet they consistently fail in the financial domain, which exhibits unique characteristics. We propose a general 2 x 2 capability taxonomy for TSRMs by crossing 1) single-entity vs. multi-entity analysis with 2) assessment of the current state vs. prediction of future behavior. We instantiate this taxonomy in the financial domain, where the distinction between deterministic assessment and stochastic prediction is particularly critical, as ten financial reasoning tasks, forming the FinTSR-Bench benchmark based on S&P stocks. To this end, we propose FinSTaR (Financial Time Series Thinking and Reasoning), trained on FinTSR-Bench with distinct chain-of-thought (CoT) strategies tailored to each category. For assessment, which is deterministic (i.e., computable from observable data), we employ Compute-in-CoT, a programmatic CoT that enables models to derive answers directly from raw prices. For prediction, which is inherently stochastic (i.e., subject to unobservable factors), we adopt Scenario-Aware CoT, which generates diverse scenarios before making a judgment, mirroring how financial analysts reason under uncertainty. The proposed method achieves 78.9% average accuracy on FinTSR-Bench, substantially outperforming LLM and TSRM baselines. Furthermore, we show that the four capability categories are complementary and mutually reinforcing through joint training, and that Scenario-Aware CoT consistently improves prediction accuracy over standard CoT. Code is available at https://github.com/seunghan96/FinSTaR.


16.

arXiv (CS.CV) 2026-06-24 DOI: arXiv:2511.18037

Hybrid Event Frame Sensors: Modeling, Calibration, and Simulation

Authors:

Yunfan Lu, Nico Messikommer, Xiaogang Xu, Liming Chen, Yuhan Chen, Nikola Zubic, Davide Scaramuzza, Hui Xiong

Hybrid event-frame sensors integrate an Event Vision Sensor (EVS) and an Active Pixel Sensor (APS) within a single chip, combining the high dynamic range and low latency of the EVS with the rich spatial intensity information from the APS. While this tight integration offers compact and temporally precise imaging, the complex circuit architecture introduces nontrivial noise patterns that remain poorly understood and unmodeled. In this work, we present the first unified statistics-based imaging noise model that jointly describes the noise behavior of APS and EVS pixels. Our formulation explicitly incorporates photon shot noise, dark current noise, fixed-pattern noise, and quantization noise, and links EVS noise to illumination level and dark current. Based on this formulation, we further develop a calibration pipeline to estimate noise parameters from real data and provide a detailed analysis of both APS and EVS noise behaviors. Finally, we propose H-ESIM, a statistically grounded simulator that generates RAW frames and events under realistic jointly calibrated noise statistics. Experiments on two hybrid sensors validate our model across multiple imaging tasks, including video frame interpolation and deblurring, demonstrating strong transfer from simulation to real data.


17.

bioRxiv (Bioinfo) 2026-06-24 DOI: HASH:bf9d395dd21927d72978ee2ce96aa893

Systematic benchmarking of multi-modal approaches for tumor-naive ctDNA detection and quantification

Authors:

Qi; Odinokov; Lakshmanan, L. N.; Grachet, N. G.; Lou; Saelee; Garcia-Montoya; Mun, W. P.; Rahman; …

Longitudinal monitoring of circulating tumor DNA (ctDNA) has emerged as a promising framework for characterizing treatment response dynamics in cancer. Scalable tumor-naive approaches for quantifying ctDNA often involve whole-genome sequencing (WGS) or DNA methylation profiling, but their comparative performance and capacity for complementary integration remain poorly understood. Here we systematically benchmarked tumor-naive WGS- and methylation-based ctDNA quantification methods using plasma from 150 patients with colorectal, lung and breast cancer. Using paired high-depth WGS and EM-seq data, we generated 40,000 in silico samples and evaluated detection accuracy, limits of detection (LoD) and quantification (LoQ) across cancer types and sequencing depths (0.1x-30x). We further assessed single- and multimodal method combinations, identifying conditions under which integrated approaches enhance analytical performance for detection and quantification relative to single modalities. This benchmark delineates key performance trade-offs and provides a practical framework to support method development and guide future research applications in ctDNA-based biomarker studies.


18.

arXiv (CS.CL) 2026-06-25 DOI: arXiv:2606.26015

The Tatoxa System for Text Detoxification in Low-Resource Languages: The Case of Tatar

Authors:

Ilseyar Alimova, Bogdan Monogov, Artyom Mazur, Daniil Antonov, Vsevolod Karimov, Vitaliy Egorov, Bulat Khakimov, Alexander Panchenko

Text detoxification, the automated detection and mitigation of abusive and harmful content, is essential for ensuring the safety of online communities and protecting users. However, low-resource languages such as Tatar have received little research attention. In this paper, we present Tatoxa, a novel state-of-the-art system for text detoxification in the Tatar language. Comparative experiments show that the proposed approach outperforms existing open-source and proprietary commercial LLMs on key quality metrics. We also introduce a new dataset for text detoxification in Tatar, designed for fine-tuning and evaluation in low-resource settings. Finally, cross-lingual transfer experiments indicate that transfer from other languages, including the culturally close Russian, performs significantly worse than training on native Tatar data, even when a large Russian corpus is available.


19.

arXiv (CS.AI) 2026-06-17 DOI: arXiv:2606.17507

LLM-as-Judge in Education: A Curriculum-Grounded Marking Pipeline

Authors:

Xiwei Xu, Chen Wang, Jacky Jiang, Phil Yang, Qian Fu, Mohan Dhall, Wenjie Zhang, Liming Zhu

Generative AI and large language models (LLMs) are increasingly applied to question generation and automated assessment. However, deploying LLMs in preparation for high-stakes exams requires more than prompt engineering; it demands software pipelines that systematically ground model outputs in authorised curriculum artefacts and marking guidelines issued by education authorities. This paper presents a curriculum-grounded, configurable LLM-as-Judge pipeline for question-level marking, co-developed with an industrial partner, to support exam preparation for university admission. The pipeline identifies the relevant topics, subtopics, and cognitive demand of a question, and assembles verifiable and authorised context to support LLM judgement. Curriculum intent is operationalised through concrete syllabus artefacts, including prescribed verbs and outcomes, performance band descriptors, glossary definitions, and marking-guideline principles. A staged LLM workflow is employed to first generate question-specific rubrics, capturing structured expectations of performance, and then derive and evaluate marking criteria used to allocate marks to student responses. This design improves consistency, transparency, and alignment with official marking practices. Preliminary evaluation shows that the proposed LLM-as-Judge pipeline delivers marking outcomes comparable to human tutors, while yielding justifications that are more traceable to authorised curriculum artefacts and marking standards. The pipeline has also been integrated into an online study platform, where early deployment data provide initial insights into operational usage and manual overrides.


20.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2606.16497

daVinci-kernel: Co-Evolving Skill Selection, Summarization, and Utilization via RL for GPU Kernel Optimization

Authors:

Dayuan Fu, Mohan Jiang, Tongyu Wang, Dian Yang, Jiarui Hu, Liming Liu, Jinlong Hou, Pengfei Li

GPU kernel optimization represents a paradigm where functional correctness is assumed and execution efficiency is the objective. We present daVinci-kernel, a reinforcement learning framework that couples skill discovery with skill exploitation through a dynamically evolving skill library. daVinci-kernel jointly trains three agents sharing one LLM backbone: a Skill Selection Agent that retrieves relevant techniques via BM25 and LLM reranking, a Policy Agent that generates multi-turn CUDA/Triton kernels conditioned on selected skills, and a Skill Summary Agent that distills successful rollouts into reusable skills. Candidate skills are added only after execution-based verification confirms reproducible speedups. All three agents share a single LLM backbone, are initialized via a structured SFT cold start on diversity-filtered data, and are then jointly optimized end-to-end with multi-turn REINFORCE and per-agent advantage estimation. On KernelBench, daVinci-kernel-14B achieves 37.2%, 70.6%, and 32.2% on Level 1, Level 2, and Level 3 under the Fast$_1$ threshold, outperforming the strongest prior RL-trained model, Dr.Kernel-14B.
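The Fast$_1$ figures above can be read with a small sketch of how such a threshold metric is typically computed. This assumes the KernelBench-style definition, in which Fast$_p$ is the fraction of tasks whose generated kernel is both functionally correct and more than $p$ times faster than the reference; the `KernelResult` record and its field names are hypothetical and not taken from the paper or the benchmark code.

```python
from dataclasses import dataclass

@dataclass
class KernelResult:
    """Outcome of one benchmark task (hypothetical record, for illustration only)."""
    correct: bool        # generated kernel matches the reference output
    baseline_ms: float   # runtime of the reference implementation
    kernel_ms: float     # runtime of the generated kernel

def fast_p(results, p=1.0):
    """Fraction of tasks that are correct AND have speedup strictly greater than p."""
    if not results:
        return 0.0
    hits = sum(
        1
        for r in results
        if r.correct and r.baseline_ms / r.kernel_ms > p
    )
    return hits / len(results)
```

Under this reading, an incorrect kernel never counts, however fast it is, and a correct kernel that merely matches the baseline does not count toward Fast$_1$.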


21.

medRxiv (Medicine) 2026-06-15 DOI: HASH:b985305732506e9d030554556de629a8

Specialty Choice Attitudes Among Medical Interns: Evidence from Hormozgan University of Medical Sciences

Authors:

Kashefi Sis, Shendabadi, Alimi, Boushehri

Background: Choosing a medical specialty is a critical career decision that affects both physicians' future professional lives and the composition of the healthcare workforce. Specialty preferences are shaped by multiple personal, educational, and socioeconomic factors, yet evidence from senior medical students in southern Iran remains limited. This study aimed to assess willingness to pursue specialty training among medical interns at Hormozgan University of Medical Sciences, identify their preferred specialties, and examine factors associated with their decisions. Methods: This descriptive-analytical cross-sectional study was conducted in 2023 among medical interns at Hormozgan University of Medical Sciences in Bandar Abbas, Iran. Using a convenience census approach, all eligible interns were invited to participate, and 83 students completed an online questionnaire. The instrument collected demographic, academic, and occupational data, as well as reasons for willingness or unwillingness to pursue specialty training and specialty preferences. Content and face validity were assessed by faculty members and students, and internal consistency reliability in the present study was acceptable (Cronbach's alpha = 0.82). Data were analyzed using descriptive statistics and logistic regression in SPSS version 27. Results: Of the 83 participants, 50 (60.2%) reported willingness to pursue specialty training, while 33 (39.8%) did not. Among students willing to continue, the most frequently cited reasons were achieving a better economic position, broader job opportunities, and higher social status. Among those unwilling to continue, the most common reasons were fatigue from prolonged studying, financial problems, and the desire to start working after graduation. Radiology was the most common first-choice specialty, followed by otorhinolaryngology, dermatology, and cardiology.
In regression analyses, no demographic or academic variable remained independently associated with willingness to pursue specialty training in the final multivariable model. Conclusions: A majority of medical interns were interested in pursuing specialty training, with preferences concentrated in a limited number of specialties perceived as offering favorable financial prospects, prestige, and lifestyle. Economic concerns and educational fatigue were the dominant factors influencing willingness and unwillingness to continue specialty education. These findings highlight the need for structured career counseling, broader exposure to different specialties, and policy measures to address financial and structural barriers to residency training. Keywords: medical specialty choice; medical interns; residency training; medical education; Hormozgan university of medical sciences


22.

arXiv (quant-ph) 2026-06-25 DOI: arXiv:2606.25053

A Contactless Heat Engine Driven by Nonreciprocal Fluctuation-Induced Torques

Authors:

Dhruv Shah, Kiryl Asheichyk, David Gelbwaser-Klimovsky, Noah Graham, Mehran Kardar, Matthias Krüger

We describe a contactless heat engine in which quantum and thermal electromagnetic fluctuations act as the working medium. The setup consists of two concentric cylinders held at different temperatures. The inner cylinder stably levitates within the outer one due to repulsive nonequilibrium Casimir forces. The chirality of the setup is broken by using nonreciprocal dielectric materials, akin to application of a magnetic field along the common cylinder axis. Using Rytov fluctuational electrodynamics, we show that heat transfer and torque can be expressed in terms of an angular-momentum-resolved heat flux density, $\Phi_n(\omega)$: each exchanged photon carries energy $\hbar \omega$ and angular momentum $\hbar n$. In reciprocal media, contributions from modes $n$ and $-n$ cancel and there is no net torque; nonreciprocity breaks this symmetry and powers rotation of the inner cylinder. Even in the absence of contact, electromagnetic fluctuations produce a frictional torque opposing rotation that we compute. This enables computation of characteristic steady-state rotations, and estimation of the engine efficiency (which remains bounded by the Carnot limit). The cylindrical setup provides a natural realization of fluctuation-induced angular-momentum transfer and a possible route toward nanoscale contactless engines.
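
The statement that each exchanged photon carries energy $\hbar \omega$ and angular momentum $\hbar n$ can be read schematically as follows. This is only a sketch: it assumes $\Phi_n(\omega)$ is normalized as an energy flux per unit frequency in mode $n$, and the prefactors, sign conventions, and integration range are assumptions rather than the paper's definitions.

```latex
\begin{align}
  H    &= \sum_{n=-\infty}^{\infty} \int_0^{\infty} d\omega \, \Phi_n(\omega), \\
  \tau &= \sum_{n=-\infty}^{\infty} \int_0^{\infty} d\omega \, \frac{n}{\omega} \, \Phi_n(\omega).
\end{align}
```

The factor $n/\omega$ is the ratio of angular momentum $\hbar n$ to energy $\hbar \omega$ per photon. Under this reading, if $\Phi_n(\omega) = \Phi_{-n}(\omega)$ the terms $n$ and $-n$ cancel in $\tau$ but add in $H$, which matches the abstract's statement that reciprocal media transfer heat without a net torque.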


23.

arXiv (CS.CV) 2026-06-18 DOI: arXiv:2204.14224

Investigation of Neural Network Methods for Reconstruction and Classification of Texture Images Under Conditions of Incomplete Information

Authors:

Galymzhan Abdimanap, Kairat Bostanbekov, Abdelrahman Abdallah, Anel Alimova, Darkhan Kurmangaliyev, Daniyar Nurseitov, Tatyana Dedova, Larissa Balakay, Serik Nurakynov

The automated analysis of heterogeneous natural textures is frequently hindered by physical damage and data loss, presenting a significant challenge to computer vision. While deep learning has shown success in controlled environments, its application to complex geological materials under conditions of incomplete information remains underexplored. This study presents an integrated framework for the inpainting and classification of high-resolution core sample images. We propose an end-to-end pipeline that utilizes object detection for sample segmentation, followed by image inpainting using Generative Adversarial Networks (GANs) with Contextual Residual Aggregation (CRA) to reconstruct missing high-frequency details. Subsequently, we evaluate the performance of modern Transformer-based (Swin, ViT) and CNN architectures on the reconstructed data. Our experiments revealed a critical divergence between reconstruction quality and downstream utility: despite high structural fidelity (PSNR 28.7 dB, FID 74.01), classification accuracy plateaued at 53%. To improve minority-class detection, we propose a confidence-based hybrid ensemble that raises MCA from 48% to 58%. These results highlight the limitations of current state-of-the-art generative models, which may produce visually plausible but semantically ambiguous features ("hallucinations") that confound classifiers. This work provides insights into the dependencies between image reconstruction quality and classification performance, offering a reproducible baseline for future research in non-destructive testing and material science. Given that cross-well accuracy remains in the 49–53% range, we position the resulting system as a decision-support and screening tool for lithofacies interpretation rather than as a fully autonomous classifier. The code is available at https://github.com/GalymzhanAbdimanap/Lithology_recognition


24.

arXiv (CS.CV) 2026-06-11 DOI: arXiv:2606.12195

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

Authors:

Ziang Yan, Sheng Xia, Jiashuo Yu, Yue Wu, Tianxiang Jiang, Songze Li, Kanghui Tian, Yicheng Xu, Yinan He, Kai Chen, Limin Wang, Yu Qiao, …

Recent progress in foundation models has shifted toward agentic behavior involving multi-step reasoning and tool use. However, open-source efforts largely focus on text-dominant settings, leaving long-horizon multimodal tasks underexplored. This gap is evident in video tasks requiring sustained temporal understanding and iterative interaction. We present InternVideo3, a framework enhancing these capabilities via Multimodal Contextual Reasoning (MCR). MCR treats understanding as a closed-loop process over a shared, evolving context containing observations, instructions, reasoning, tool actions, and memory. This frames long-video understanding as evidence accumulation and verification. To ensure efficiency, we introduce Multimodal Multi-head Latent Attention (M^2LA), a token-preserving reparameterization compressing KV-cache states while retaining the full token stream. Our staged training includes continued pretraining, short-to-long supervised fine-tuning, rule-based reinforcement learning, and on-policy distillation. Experiments show InternVideo3 achieves strong performance on benchmarks like Video-MME, MLVU, and EgoSchema. We further instantiate the model as a video agent with retrieval tools, demonstrating robust evidence-grounded behavior. Our results suggest that efficient context handling and closed-loop reasoning are vital for adapting open multimodal models toward long-horizon visually grounded agency.


25.

arXiv (CS.AI) 2026-06-16 DOI: arXiv:2606.16160

A comparative and critical study of EEGNet for fNIRS-driven cognitive load classification

Authors:

Mehshan Ahmed Khan, Houshyar Asadi, Li Zhang, Mohammad Reza Chalak Qazani, Ghazal Bargshady, Stefanos Gkikas, Christian Arzate, Sam Oladazimi, Zoran Najdovsk, Lei Wei, Chee Peng Lim

Accurately classifying cognitive load from functional near-infrared spectroscopy (fNIRS) signals remains a significant challenge due to temporal variability, inter-subject differences, and sensitivity to preprocessing choices. This study provides a comprehensive evaluation of EEGNet for fNIRS-based cognitive load classification by systematically examining the effects of temporal segmentation strategies (overlapping vs. non-overlapping), window lengths (10 s, 20 s, 30 s), feature extraction methods (Analysis of Variance (ANOVA), Principal Component Analysis (PCA), Fast Independent Component Analysis (FastICA)), learning rate configurations (fixed and adaptive), and evaluation protocols (random split vs. subject-independent (SI)). Results from random-split experiments show that overlapping segmentation, combined with smaller fixed learning rates (0.01 to 0.001), yields the highest accuracies, due to temporal redundancy and dense sampling of hemodynamic transitions. However, SI evaluation reveals a substantial drop in accuracy, demonstrating limited generalization to unseen participants. Under SI evaluation, non-overlapping segmentation outperformed overlapping windows, with the best accuracy of 56.11% achieved using PCA features with a 20-second window and a 0.1 learning rate. These findings indicate that eliminating temporal redundancy helps the model learn more robust and generalizable representations of cognitive load across individuals. Although the adaptive learning rate strategy improved training stability, it did not surpass the performance of optimally selected fixed learning rates. The study highlights the critical role of segmentation strategy and learning rate selection in improving model generalization and identifies methodological considerations essential for developing reliable, real-time, and SI cognitive load classification systems using fNIRS.
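
The contrast between overlapping and non-overlapping segmentation, and between random-split and subject-independent evaluation, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `segment` and `subject_independent_split`, the toy signal, and the window and step sizes are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def segment(samples: Sequence[float], window: int, step: int) -> List[Sequence[float]]:
    """Cut a 1-D signal into fixed-length windows.

    step == window gives non-overlapping windows; step < window gives overlap.
    Trailing samples that do not fill a whole window are dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, step)]


def subject_independent_split(subject_ids: Sequence[str],
                              held_out: str) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) so the held-out subject appears only in test."""
    train = [i for i, s in enumerate(subject_ids) if s != held_out]
    test = [i for i, s in enumerate(subject_ids) if s == held_out]
    return train, test


# Toy stand-in for one fNIRS channel (hypothetical data, 10 samples).
signal = list(range(10))
non_overlapping = segment(signal, window=4, step=4)  # windows share no samples
overlapping = segment(signal, window=4, step=2)      # neighbours share 2 samples

# One subject label per window; subject "s3" is kept out of training entirely.
window_subjects = ["s1", "s1", "s2", "s3"]
train_idx, test_idx = subject_independent_split(window_subjects, held_out="s3")
```

With overlap, neighbouring windows share samples, so a random split can place near-duplicate windows on both sides of the train/test boundary. That is one plausible reading of why the random-split accuracies reported above are higher than the subject-independent ones.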

