Academic Intelligence · Curated Daily

Explore the Frontiers of Global Research

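AcademicHub aggregates the latest literature from top journals and preprint platforms in real time. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefs.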

01.
arXiv (CS.CV) 2026-06-17

SPATIA: Multimodal Generation and Prediction of Spatial Cell Phenotypes

Understanding how cellular morphology, gene expression, and spatial context jointly shape tissue function is a central challenge in biology. Image-based spatial transcriptomics technologies now provide high-resolution measurements of cell images and gene expression profiles, but existing methods typically analyze these modalities in isolation or at limited resolution. We address the problem by introducing SPATIA, a multi-level generative and predictive model that learns unified, spatially aware representations by fusing morphology, gene expression, and spatial context from the cell to the tissue level. SPATIA also incorporates a spatially conditioned generative framework with confidence-aware OT reweighting and morphology-profile alignment for modeling target-state morphology distributions. Specifically, we propose a confidence-aware flow matching objective that reweights weak optimal-transport pairs based on uncertainty. We further apply morphology-profile alignment to encourage biologically meaningful image generation, enabling the modeling of microenvironment-dependent phenotypic transitions. We assembled a multi-scale dataset consisting of 25.9 million cell-gene pairs across 17 tissues. We benchmark SPATIA against 18 models across 12 tasks, spanning categories such as phenotype generation, annotation, clustering, gene imputation, and cross-modal prediction. SPATIA achieves improved performance over state-of-the-art models, improving generative fidelity by 8% and predictive accuracy by up to 3%.

02.
medRxiv (Medicine) 2026-06-12

Integrative Mechanisms of Early Clinical and Research Training (ECART) in Orthopaedic Medical Education: A Qualitative Single-Case Study

Background: Early clinical exposure and student participation in research are important components of medical training. They may support learning motivation, evidence literacy, and self-directed learning. In many programmes, however, clinical training and research training remain separated. Few studies have explained, within a real teaching team, how learners turn clinical phenomena into researchable questions and how research participation can reshape their clinical understanding. Early Clinical and Research Training (ECART) is a clinical-research integration approach developed by an orthopaedic team at the Second Hospital of Shandong University. Methods: We conducted a theory-informed, interpretivist qualitative single-case study. The case was an orthopaedic clinical-research team at the Second Hospital of Shandong University. Participants included medical undergraduates, academic degree graduate students, professional degree graduate students, clinical teachers, and research platform leads. We used purposive sampling with maximum variation. Data were collected through semi-structured interviews and de-identified teaching documents. Data were analysed using the framework method and were interpreted with a Context-Activity-Mechanism-Outcome (CAMO) logic. Results: The analysis showed that ECART was not simply early entry into the clinic or early entry into the laboratory. It was a team-based learning process centred on real medical problems. Four themes were identified. First, early clinical exposure helped learners make real problems visible and nameable, rather than merely increasing exposure. Second, clinical-research connection followed different pathways. Professional degree graduate students often started from clinical uncertainties in residency training and case management, and moved toward evidence-informed small projects. 
Academic degree graduate students often started from literature gaps, experimental findings, and mechanistic hypotheses, and then used clinical feedback to calibrate meaning. Third, research training, through literature reading, group meetings, experimental design, data review, and mentor questioning, helped learners move from completing tasks to explaining problems. Fourth, sustained ECART depended on a tiered team ecology formed by clinical teachers, research mentors, research platforms, and senior peers. Based on these findings, we refined the ECART programme theory: real medical problems are translated through explanation, searching, experimentalisation, and feedback-based reinterpretation into research questions that learners can understand, discuss, and test. This process supports problem formation, evidence awareness, mechanistic reasoning, translational judgement, and career clarification. Conclusion: ECART is best understood as a clinical-research integrated learning ecology that emerges from real team practice, rather than as a fixed standardised course. Its educational value lies in a recurring cycle of real problems, research translation, multi-source feedback, and clinical reinterpretation. This framework may inform the design, evaluation, and contextual adaptation of clinical-research integration pathways in medical education.

03.
medRxiv (Medicine) 2026-06-15

Scalable estimation of temporal clustering in accelerometry: a kernel-independent dispersion index grounded in the Hawkes process

Background. Self-exciting (Hawkes) point processes are a natural model for the temporal clustering of human physical activity (PA) recorded by accelerometers, yet they have seldom been used in this setting, in part because the usual maximum-likelihood fitting is challenging due to potential estimation bias and convergence failures on these data. A moment-based alternative, estimating the Hawkes branching ratio from the dispersion index (the variance-to-mean ratio of event counts), is kernel-independent and computationally trivial, but it has not been evaluated for accelerometry or adapted to the intensity-marked recordings accelerometers provide. Methods. Treating each minute above a sedentary threshold as an event, we estimated the Hawkes branching ratio $n$ by maximum likelihood and, as a kernel-independent and far cheaper alternative, from the dispersion index. We compared four dispersion-based estimators (event-count-based, intensity-mark-weighted using the mark-moment ratio, and time-of-day (TOD) adjusted variants of each) against the marked and unmarked maximum-likelihood estimates. Estimators were evaluated for mutual agreement, goodness of fit, and finite-window results in two National Health and Nutrition Examination Survey (NHANES) accelerometry cohorts (hip-worn, $n=2{,}560$; wrist-worn, $n=3{,}132$). We related the resulting temporal clustering measures to all-cause mortality using survey-weighted Cox models, adjusting for PA frequency, Peak30 (the average of the 30 highest PA values), and demographic covariates. Results. Event-count-based dispersion estimates agreed strongly with maximum-likelihood branching ratios ($r \approx 0.74$ in both cohorts); the intensity-marked variant incorporating PA intensity variability agreed less well. Marked and unmarked Hawkes models yielded similar excitation and decay parameters, suggesting PA intensity added little clustering information beyond event timing.
In the survival analysis, temporal clustering was associated with all-cause mortality independently of PA frequency and Peak30; the direction of association differed between the hip- and wrist-worn cohorts. Conclusions. A scalable dispersion-index estimator recovers the Hawkes branching ratio and matches maximum-likelihood estimates without requiring kernel specification or iterative optimization. It offers a practical tool for quantifying temporal clustering in accelerometry, enabling decomposition of temporal PA patterns into their exogenous initiation and endogenous persistence. Such temporal patterns carry health-relevant information beyond PA intensity and volume. Keywords: dispersion index; Hawkes process; branching ratio; temporal clustering; point process estimation; accelerometry; mortality
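The moment-based idea in this abstract can be illustrated with a minimal sketch. It assumes the standard long-window result for a stationary linear Hawkes process, in which the variance-to-mean ratio D of event counts approaches 1/(1 - n)^2, so that n is approximately 1 - 1/sqrt(D). The function names and the windowing scheme are hypothetical, and the paper's mark-weighted and time-of-day-adjusted variants are not reproduced here.

```python
import math
from statistics import mean, pvariance


def window_counts(event_times, window, horizon):
    """Count events falling in consecutive windows of equal length.

    event_times: event timestamps (e.g. active minutes), same unit as window.
    horizon: total observed duration; a trailing partial window is dropped.
    """
    n_windows = int(horizon // window)
    counts = [0] * n_windows
    for t in event_times:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def branching_ratio_from_dispersion(counts):
    """Moment estimate of the Hawkes branching ratio from window counts.

    Uses the asymptotic relation D = 1 / (1 - n)^2, i.e. n = 1 - 1/sqrt(D),
    where D is the variance-to-mean ratio of the counts. Under-dispersed or
    empty data (D <= 1) are mapped to n = 0, meaning no self-excitation.
    """
    m = mean(counts)
    if m == 0:
        return 0.0
    dispersion = pvariance(counts) / m
    if dispersion <= 1.0:
        return 0.0
    return 1.0 - 1.0 / math.sqrt(dispersion)
```

In practice the window must be long relative to the excitation time scale for the asymptotic relation to hold, which is presumably why the abstract reports finite-window behavior separately.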

04.
arXiv (CS.AI) 2026-06-11

AI4Land: Scalable Deep Learning for Global High-Resolution Land Use Reconstruction

arXiv:2606.11793v1 Announce Type: cross Abstract: Uncertainty in the terrestrial carbon cycle remains a major constraint in climate projections, partly driven by the uncertainties affecting the land surface representation and variability in Earth system models. To address this limitation, we present a data-driven framework AI4Land, for generating high-resolution historical reconstructions and future projections of key land surface variables. The framework follows a two-phase approach using a U-Net architecture. In the first phase, which is the focus of this work, it reconstructs annual land use and land cover by integrating coarse-resolution scenario data with static geophysical features. In a planned second phase, the resulting high-resolution maps will be used to predict dynamic biophysical variables, particularly leaf area index, at finer temporal scales. Trained on Earth observation data, the models learn to reproduce spatially explicit and physically consistent land surface patterns, extending temporal coverage to periods lacking direct observations. AI4Land was developed and trained on MareNostrum5, demonstrating how GPU-accelerated HPC infrastructure enables global-scale climate AI pipelines. The final product is a suite of open-source emulators designed for real-time coupling with digital twin platforms, such as those developed under the Destination Earth initiative. By delivering realistic and evolving land surface conditions on demand, this work aims to reduce critical uncertainties and improve the predictive power of next-generation climate simulations.

05.
arXiv (quant-ph) 2026-06-11

Towards the implementation of a quantum classifier

arXiv:2606.10150v2 Announce Type: replace Abstract: In this work, we investigate the use of a quantum circuit as a binary classification model in the context of quantum machine learning. We call this model a binary quantum classifier. First, we describe fundamental concepts of quantum computing and introduce the computational tool used: Qibo, an open-source framework for efficient quantum simulations and quantum hardware control. Then, we describe how to design a binary quantum classifier for the classification of images and small arrays of variables by showing how to input data in the circuit, defining a quantum circuit model Ansatz with trainable parameters and a loss function, and implementing multiple minimizers. We test our quantum classifier with two data sets. The first one is the MNIST data set, which is composed of handwritten digits (reduced to only handwritten zeros and handwritten ones for binary classification). We study the behavior of different minimizers by increasing the number of layers of the Ansatz. The second data set represents two different high-energy collisions that can occur at colliders such as the LHC (CERN). Due to in-time proton-proton interactions known as pile-up, we distinguish two different data sets: "without pile-up" and "with pile-up". These collisions can be represented by images of size 32x32 or by six high-level variables that we call features. By increasing the size of the training data set and the number of layers of the Ansatz, we search for the best minimizer. Splitting the data set into a training set and a test set, we compute the ROC curve, AUC score, confusion matrices, and test-set accuracy. For "with pile-up" images, we compare the results obtained with the quantum classifier against those of a small convolutional neural network. We conclude that it is possible to build a binary quantum classifier with a quantum circuit, and we highlight its performance and limitations in comparison with classical technologies.

06.
arXiv (CS.CV) 2026-06-15

Stream3D: Sequential Multi-View 3D Generation via Evidential Memory

View-conditioned 3D generators such as SAM 3D, TRELLIS, and Hunyuan3D produce high-quality object reconstructions from a single view, but real-world visual observation often arrives as long monocular streams. Naively applying these generators to each streaming frame independently leads to severe temporal inconsistency in the generated results. To address this problem, we propose Stream3D, the first training-free streaming mechanism that turns a frozen view-conditioned 3D generator into a streaming generator with constant cross-chunk memory. Stream3D achieves this by maintaining a compact evidential memory, which selectively caches the most informative historical frames based on a proposed evidence score mechanism. As the stream progresses, the memory dynamically updates to retain a fixed number of informative frames, preventing the memory footprint from growing linearly with sequence length. This also prevents degradation over long sequences and keeps the underlying generator completely unchanged without retraining, architectural modifications, or auxiliary losses. Evaluated on both realistic and synthetic streaming benchmarks, Stream3D outperforms latent-transport baselines, including KV-cache reuse and flow-based feature editing, across both photometric and geometric metrics. More details can be found at: https://stream-3d.github.io/stream3d.github.io/.
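The constant-memory property described in this abstract can be sketched as a fixed-capacity buffer that keeps only the highest-scoring frames. This is an illustrative sketch, not the Stream3D implementation: the class name, the `offer` method, and the scalar score are all hypothetical, and the abstract does not say how the evidence score is computed.

```python
import heapq
from itertools import count


class FixedSizeMemory:
    """Retain the `capacity` highest-scoring frames seen so far.

    The score passed to `offer` stands in for an evidence score; how such
    a score would be derived from a frame is outside this sketch.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._heap = []            # min-heap of (score, arrival index, frame id)
        self._arrival = count()    # unique tie-breaker, also records order

    def offer(self, frame_id, score):
        """Consider a new frame; evict the weakest entry if it scores lower."""
        entry = (score, next(self._arrival), frame_id)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif score > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)

    def frames(self):
        """Return the retained frame ids in arrival order."""
        return [fid for _, _, fid in sorted(self._heap, key=lambda e: e[1])]
```

Each update costs O(log capacity) and the buffer never grows beyond `capacity`, so the footprint is independent of stream length.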

07.
medRxiv (Medicine) 2026-06-22

Clinical-grade Cuffless Blood Pressure Monitoring via Deep-tissue Diffuse Speckle Pulsatile Flowmetry

Blood pressure (BP) is a vital sign which is measured to diagnose and manage hypertension. However, current methods to measure BP use inflatable cuffs, which cause discomfort and limit the frequency at which measurements can be made, or intra-arterial catheters, which are invasive and pose infection risks. Here, we propose and evaluate the use of Diffuse Speckle Pulsatile Flowmetry (DSPF) as a cuffless BP measurement method to address these limitations. DSPF is a laser speckle-based technique which simultaneously records blood flow rate and blood volume (i.e., photoplethysmography or PPG) signals from relatively deep vascular tissue. Using information from these signals, we studied DSPF's effectiveness in measuring systolic BP (SBP) and diastolic BP (DBP) through an outpatient study in which 133 patients were recruited, and in measuring beat-to-beat BP waveforms through an inpatient study in which two patients were recruited. In the outpatient study, the DSPF method achieved mean absolute errors (MAEs) of 4.17 mmHg and 2.42 mmHg for SBP and DBP, respectively, compared to conventional cuff-based methods. It also fulfilled the requirements of the AAMI/ESH/ISO 81060-2:2018 standard for BP measurement devices and attained an "A" grade according to the British Hypertension Society grading scheme. For the inpatient study, it produced BP waveforms which had MAEs of 2.35 mmHg and 3.06 mmHg compared to arterial-line measurements for the two patients, respectively. Compared to PPG, which has been studied more extensively as a cuffless BP measurement method, we found through ablation studies that DSPF reached significantly lower MAEs and hence better accuracy. DSPF augments the performance of PPG-only methods by leveraging additional information from the blood flow rate signal, and we therefore find it to be a superior cuffless BP measurement method which can potentially be used in outpatient, inpatient, and remote settings.
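The two accuracy measures named in this abstract can be made concrete with a short sketch. It assumes the commonly cited British Hypertension Society thresholds, under which grade A requires at least 60%, 85%, and 95% of readings to fall within 5, 10, and 15 mmHg of the reference. The function names are hypothetical, and the full protocol has further requirements not modeled here.

```python
def mean_absolute_error(estimates, references):
    """Mean absolute difference between paired readings, in mmHg."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


# (grade, minimum % of readings within 5, 10, and 15 mmHg of the reference)
BHS_THRESHOLDS = [("A", 60, 85, 95), ("B", 50, 75, 90), ("C", 40, 65, 85)]


def bhs_grade(estimates, references):
    """Return the best BHS grade whose three cumulative thresholds are met."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(e - r) for e, r in zip(estimates, references)]

    def percent_within(limit):
        return 100.0 * sum(err <= limit for err in errors) / len(errors)

    within = (percent_within(5), percent_within(10), percent_within(15))
    for grade, *minimums in BHS_THRESHOLDS:
        if all(w >= m for w, m in zip(within, minimums)):
            return grade
    return "D"
```

A low MAE does not by itself guarantee a good grade, since the grading depends on the whole error distribution rather than its mean.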

08.
arXiv (quant-ph) 2026-06-19

Extracting the physical content of Liouvillian eigenmodes: Semiclassical quantization

arXiv:2606.20271v1 Announce Type: new Abstract: Unlike in closed quantum systems where individual energy eigenstates are understood as physical excitations, open quantum systems have distinct right and left eigenstates of the Liouvillian that decay with time and are difficult to interpret. Here we introduce a physically motivated quasiprobability measure combining the two types of eigenstates that interprets a Liouville eigenmode as a set of coherences. This coherence measure is intimately connected to the return probability and allows one to visualize the modes as quasiprobability distributions in a "doubled" phase space. Using this measure we show that, remarkably, an oscillator retains its quantized "orbits" in phase space for a large class of linear and nonlinear damping, thus providing a formulation of semiclassical quantization for open systems. The orbits have measurable dynamical signatures and are broadened in the presence of a thermal bath, similar to energy levels. For quadratic systems, our results yield an extension of the concept of invariant tori, which play a central role in Hamiltonian systems.

09.
arXiv (CS.AI) 2026-06-24

HOLMES: Evaluating Higher-Order Logical Reasoning in LLMs

arXiv:2606.23238v2 Announce Type: replace Abstract: Logical reasoning is essential for reliable AI, yet existing benchmarks are largely first-order-logic-centric, focusing on object-level deduction over fixed predicates. This misses many realistic scenarios where models must reason over rules, predicates, functions, constraints, and decision procedures themselves. We introduce HOLMES (Higher-Order Logic Meets real-world Explainable Symbolic reasoning), the first real-world benchmark for higher-order symbolic reasoning in LLMs, containing 1379 instances. Built on higher-order logic, HOLMES pairs natural-language problems with HOL formalizations, ground-truth answers, verifiable reasoning traces, and fine-grained controllable reasoning factors across law and finance. Experiments show that current LLMs still struggle on HOLMES, with an average accuracy of only 50.64% and the best model reaching 59.54%. Our analyses further reveal that high final-answer accuracy can mask shortcut reasoning in conflict-resolution settings, while performance drops sharply under scope-conditioned and compositional reasoning. These findings identify higher-order symbolic reasoning as a key bottleneck for building reliable and verifiable LLMs. The project code and dataset are publicly available at https://github.com/wuyucheng2002/HOLMES.

10.
arXiv (CS.CV) 2026-06-11

Continual Learning with Support Boundary Experience Blending

Continual learning (CL) seeks to mitigate catastrophic forgetting when models are trained on sequential tasks. A common approach, experience replay (ER), stores past exemplars but only sparsely approximates the data distribution, yielding fragile and oversimplified decision boundaries. We address this limitation by introducing Support Boundary Data (SBD), generated by injecting differential-privacy-inspired noise into latent features to create boundary-adjacent representations that implicitly regularize decision boundaries. Building on this idea, we propose Experience Blending (EB), a framework that jointly trains on exemplars and SBD through a dual-model aggregation strategy. EB has two components: (1) latent-space noise injection to generate support boundary data, and (2) end-to-end training that jointly leverages exemplars and SBD. Unlike standard experience replay, SBD enriches the feature space near decision boundaries, leading to more stable and robust continual learning. Extensive experiments on CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet1K demonstrate consistent accuracy improvements of 10%, 6%, 13%, and 2%, respectively.

11.
arXiv (CS.LG) 2026-06-16

Forecasting Bacterial Antimicrobial Resistance Trends Using Machine Learning on WHO GLASS Surveillance Data: A Retrieval-Augmented Generation Approach for Policy Decision Support

arXiv:2602.22673v2 Announce Type: replace Abstract: Background: Antimicrobial resistance (AMR) is a global health threat. While the WHO Global Antimicrobial Resistance and Use Surveillance System (GLASS) provides standardized data, population-level machine learning forecasting of resistance trends remains limited. Translating computational forecasts into policy requires transparent interpretation mechanisms. Methods: Surveillance data (2021-2023) comprising 5,909 observations across 44 countries and five WHO regions were processed. A rigorous temporal split prevented data leakage. Six models (Naive, Linear, Ridge, XGBoost, LightGBM, LSTM) were benchmarked to forecast one-year-ahead resistance rates using features including prior-year resistance and antibiotic consumption. Evaluation metrics (MAE, RMSE, sMAPE) were computed, with 95% bootstrap confidence intervals for MAE. A local Retrieval-Augmented Generation (RAG) system utilizing Gemma 4 was implemented to translate forecast findings into policy guidance grounded in retrieved WHO documents. Results: XGBoost achieved the best performance (test MAE = 6.13% [95% CI: 5.83-6.44]), an 85.3% error reduction versus the naive baseline (MAE = 41.79%). SHAP analysis identified prior-year resistance as the dominant predictor (50.5% gain), confirming strong autoregressive behavior. Regional forecast error tracked closely with surveillance coverage, ranging from 3.65% in the European Region to 8.61% in South-East Asia. The RAG pipeline generated accurate, source-attributed policy responses without fabricated citations. Conclusion: Short-term AMR resistance rates exhibit strong temporal autocorrelation that can be accurately forecasted using gradient boosting. Coupling these forecasts with a hallucination-resistant RAG system provides a scalable, evidence-based decision-support framework for AMR governance.
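The headline error reduction in this abstract follows directly from the two reported MAE values. The sketch below reproduces that arithmetic; the function name is hypothetical and only the two figures quoted in the abstract are used.

```python
def relative_error_reduction(baseline_mae, model_mae):
    """Fraction by which the model's MAE is lower than the baseline's."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return 1.0 - model_mae / baseline_mae


# Values quoted in the abstract: naive baseline 41.79, XGBoost 6.13.
reduction = relative_error_reduction(41.79, 6.13)
print(f"{reduction:.1%}")  # prints 85.3%
```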

12.
arXiv (quant-ph) 2026-06-25

Detection of patterns in a discrete-outcome sensor network

arXiv:2606.25100v1 Announce Type: new Abstract: A discrete outcome quantum sensor network is one in which we are only interested in which detectors are activated. This can be studied in either the strong or weak interaction regime. If the detectors interact strongly with the environment, it is possible to definitely find which ones were activated. If the interaction is weaker, there is a possibility of making an error, and the object is to minimize the probability of this happening. Here we will be interested in this weaker interaction regime. We will also assume that only certain patterns of detectors will be activated, different patterns being translated versions of a fundamental one. Our object will be to find which pattern has been activated. We will look at both one and two-dimensional detector arrays and make use of techniques from minimum-error state discrimination.
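As background for the minimum-error framing, the simplest case is discrimination between two pure states, where the optimal error probability is given by the Helstrom bound. The sketch below covers only that two-state case; the multi-pattern, translation-related setting of the paper is not modeled, and the function name is hypothetical.

```python
import math


def helstrom_error(overlap, prior0=0.5):
    """Minimum error probability for discriminating two pure states.

    overlap: magnitude of the inner product |<psi0|psi1>|, between 0 and 1.
    prior0: prior probability of the first state.
    Implements P_err = (1 - sqrt(1 - 4 * p0 * p1 * overlap^2)) / 2.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie in [0, 1]")
    if not 0.0 <= prior0 <= 1.0:
        raise ValueError("prior0 must lie in [0, 1]")
    prior1 = 1.0 - prior0
    # Guard against a tiny negative value caused by floating-point rounding.
    radicand = max(0.0, 1.0 - 4.0 * prior0 * prior1 * overlap ** 2)
    return 0.5 * (1.0 - math.sqrt(radicand))
```

Orthogonal states (overlap 0) give zero error, which corresponds to the strong-interaction regime in the abstract; larger overlaps model weaker interaction and a nonzero error floor.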

13.
bioRxiv (Bioinfo) 2026-06-23

Model-based inference of gene expression noise from single-cell RNA-sequencing data

The heterogeneity of expression levels among genetically identical cells, termed gene expression noise, is a property of the gene expression process whose importance in the biology of organisms and their evolution is increasingly recognized. Measuring gene expression noise requires single-cell expression data, as obtained from single-cell RNA sequencing (scRNASeq). Its estimation, however, is challenging owing to (i) the presence of technical noise in addition to biological noise, and (ii) the heterogeneity of cell types in the sampled population. We propose a maximum-likelihood framework to infer biological noise from scRNASeq data, while accounting for technical noise, dropout probabilities, and distinct cell sequencing depths. Using simulations, we demonstrate parameter identifiability and show that the resulting noise estimates are uncorrelated with mean gene expression and therefore need no extra correction in downstream analyses, easing intra- and inter-genome comparisons. Using two technical replicates of scRNASeq data from the wild yeast *Saccharomyces paradoxus*, we show that expression noise can be inferred in a reproducible manner.

14.
PLOS Medicine 2026-06-23

Prevalence and epidemiological patterns of Neisseria gonorrhoeae infection in sub-Saharan Africa, 1964–2025: Systematic review, meta-analyses, and meta-regressions

Authors: Aisha Osman, Hina Akram, Bayan Alemrayat, Sumaya Al-Maraghi, Manale Harfouche, Laith J. Abu-Raddad

Background: Neisseria gonorrhoeae (NG) infection is a global health concern because of its morbidity and increasing antimicrobial resistance. Sub-Saharan Africa is believed to carry a disproportionately high burden of NG infection, but the epidemiology of NG infection in this region has not been comprehensively synthesized. This study systematically reviewed and analyzed NG prevalence in sub-Saharan Africa to characterize prevalence patterns and identify populations at risk. Methods and findings: A systematic review was conducted and reported following PRISMA guidelines. Embase, PubMed, Scopus, and Web of Science were searched from inception to June 4, 2025. Eligible studies reported NG prevalence in sub-Saharan Africa. Random-effects meta-analyses generated pooled prevalence estimates, and random-effects meta-regression analyses identified associations and sources of heterogeneity. Nine hundred fifty publications contributed 1,604 prevalence measures spanning 1964–2025. In the general population, pooled urogenital prevalence was 3.2% (95% confidence interval (CI): 2.9–3.5), with substantial between-study heterogeneity and a wide prediction interval, indicating considerable variation in prevalence across settings. Prevalence was high in key populations: among female sex workers, 11.5% (95% CI: 9.9–13.2) for urogenital and 2.0% (95% CI: 0.4–4.5) for anorectal infection; and among men who have sex with men, 2.8% (95% CI: 2.4–3.3) for urogenital, 8.3% (95% CI: 5.8–11.0) for anorectal, and 5.7% (95% CI: 3.6–8.3) for oropharyngeal infection. Symptomatic men exhibited high urogenital prevalence (51.5%; 95% CI: 47.5–55.5), and symptomatic women showed 9.0% (95% CI: 7.7–10.4). Among women with adverse pregnancy or birth outcomes, urogenital prevalence was 8.6% (95% CI: 5.3–12.6).
Meta-regression analyses explained over half of the variability in prevalence, showing a long-term decline of 1% per year, a clear population type gradient, subregional differences, and decreasing prevalence with increasing age, but no variation by sex. These findings may be affected by variability in data availability across countries, anatomical sites, and population groups, as well as heterogeneity across included studies. Conclusions: NG prevalence remains markedly high in this region but has declined over time. These findings highlight the need for strengthened surveillance, expanded prevention and diagnostic strategies, and continued monitoring of gonococcal antimicrobial resistance to support effective control efforts in sub-Saharan Africa.

15.
arXiv (CS.AI) 2026-06-16

VGPT-RSI for RH-Adjacent Formal Progress: Boundary Certificates, Verified Finite Lagarias Inequalities, and Explicit Failure Localization

arXiv:2606.15096v1 Announce Type: new Abstract: The Riemann Hypothesis remains one of the central unsolved problems in mathematics. Rather than claiming a proof, we investigate whether a verifiable AI-assisted reasoning system can produce reliable, formally checked partial progress while explicitly identifying the remaining mathematical obstructions. We apply the Verifiable Growing Physical Transformer with Recursive Self-Improvement (VGPT-RSI) to two RH-adjacent certification tasks. First, we construct and verify a finite RH-boundary certificate for inequality on a parameterized safe lower curve over a region. The numerical boundary curve is converted into a certificate-backed lower curve, audited using outward-rounded interval arithmetic and Arb/FLINT ball arithmetic, and then checked in Rocq/CoqInterval for the parameterized theorem. Second, we initiate a formal Lagarias-route certificate. Lagarias's criterion states that RH is equivalent to the global inequality. We formalize the finite quantity and produce a Coq-checked finite certificate. The final system identifies the exact unresolved mathematical bottlenecks: formalizing the Lagarias equivalence, proving the global tail theorem beyond any finite cutoff, and potentially reducing counterexamples to colossally abundant or related extremal integers. These results demonstrate that VGPT-RSI can produce certified RH-adjacent formal progress, organize proof dependencies, and avoid overclaiming when the remaining obstruction is genuinely mathematical.
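For orientation, Lagarias's criterion in its usual published form says that RH holds if and only if sigma(n) <= H_n + exp(H_n) * log(H_n) for every n >= 1, where sigma is the sum-of-divisors function and H_n is the n-th harmonic number. The sketch below checks that inequality over a finite range in ordinary floating point. It is only an illustration: it is not a certificate, it uses none of the interval or ball arithmetic mentioned in the abstract, and which finite quantity the paper actually formalizes is not stated there. The function names are hypothetical.

```python
import math


def divisor_sums(limit):
    """Return a list where entry n holds sigma(n), the sum of divisors of n."""
    sigma = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            sigma[multiple] += d
    return sigma


def lagarias_violations(limit):
    """List every n <= limit where sigma(n) exceeds the Lagarias bound.

    The bound is H_n + exp(H_n) * log(H_n). Floating point is used, so a
    near-tie could in principle be misjudged; a rigorous check would need
    outward-rounded interval arithmetic instead.
    """
    sigma = divisor_sums(limit)
    harmonic = 0.0
    violations = []
    for n in range(1, limit + 1):
        harmonic += 1.0 / n
        bound = harmonic + math.exp(harmonic) * math.log(harmonic)
        if sigma[n] > bound:
            violations.append(n)
    return violations
```

A finite check like this says nothing about the tail beyond the cutoff, which is exactly the open obstruction the abstract identifies.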

16.
arXiv (CS.CV) 2026-06-25

A Benchmark for Heterogeneous Stereo Deblurring with Physically- and Epipolar-constrained Cross Attention

Modern stereo-capable smartphones enable immersive XR content capture. However, hardware heterogeneity across camera modules often causes severe asymmetric blur artifacts. Existing methods and benchmarks largely assume homogeneous stereo setups and therefore do not explicitly address such asymmetric degradation. To bridge this gap, we present a dedicated framework for heterogeneous stereo deblurring. First, we introduce the heterogeneous stereo deblurring (HSD) dataset, constructed from real smartphone stereo captures via multi-frame integration. Second, we propose physically- and epipolar-constrained cross attention (PECA), a lightweight module that restricts cross-view matching to an epipolar search window bounded by an optics-derived disparity upper bound. By enforcing physically valid disparity constraints, PECA enables efficient and reliable cross-view feature fusion. Moreover, our confidence-weighted attention with residual fusion emphasizes cross-guided deblurring when correspondences are reliable, while naturally falling back to self-deblurring in occluded or unreliable regions. PECA is architecture-agnostic and consistently improves CNN-, Transformer-, and NAFNet-based baselines. Extensive experiments on HSD show that PECA-enhanced models achieve improved restoration performance with favorable efficiency.

17.
arXiv (CS.CL) 2026-06-16

Koshur Diacritizer: A Byte-Level Sequence-to-Sequence Model for Kashmiri Diacritic Restoration

Kashmiri, an Indo-Aryan language written in a modified Perso-Arabic script, frequently omits diacritic marks in digital text, creating ambiguity and challenging downstream NLP applications. We present Koshur Diacritizer, a ByT5-small byte-level sequence-to-sequence model for restoring diacritics in Kashmiri text. To support this task, we release a publicly available dataset of 23.7k aligned undiacritized-diacritized Kashmiri sentence pairs. The proposed framework combines script-aware normalization, alignment validation, and skeleton-preserving inference to ensure reliable restoration while maintaining the original base-letter sequence. On a held-out test set, the model achieves a DERm of 0.2012 and a WER of 0.2159. Additionally, evaluation by a native Kashmiri linguistic expert yields a mean accuracy of 77.5%. The dataset, model, and source code are publicly released to provide a reproducible baseline for Kashmiri diacritic restoration and future low-resource language research.

18.
arXiv (quant-ph) 2026-06-15

Tamed Feynman-Kac diffusion processes: Killing-branching intertwine

arXiv:2605.07824v2 Announce Type: replace-cross Abstract: Relaxation to equilibrium of a drifted Brownian motion is quantified by a transition probability density function, whose main (multiplicative) entry is an inferred Feynman-Kac kernel of the Schrödinger semigroup operator. Although seemingly devoid of a natural probabilistic significance (except for its explicit path integral definition), the pertinent kernel relaxes to equilibrium as well. The implicit Feynman-Kac potential ${\cal{V}}(x)$, continuous, confining and bounded from below, may take negative values. If positive, ${\cal{V}}(x)$ can be interpreted as the killing rate of the decaying diffusion process. In the case of relaxing Feynman-Kac kernels, the killing effects are tamed (often overcompensated). The taming unavoidably appears in conjunction with the existence of the negativity subdomains of ${\cal{V}}(x)$ in $R$. If locally ${\cal{V}}(x) < 0$, its sign inversion $- {\cal{V}}(x)$ can be interpreted as the branching (cloning, alternatively bifurcation) rate in the course of the otherwise free random motion. We interpret the arising killed diffusion processes with branching as the possible path-wise background of tamed (relaxing) Feynman-Kac diffusions. We present computer-assisted path-wise arguments towards the consistency of the killing/branching taming scenario for a number of nonlinear model systems in one space dimension. Special attention is paid to Feynman-Kac potential shapes in the double-well form, where analytic access to eigenvalues and eigenfunctions is scarce. Throughout the paper the dynamics refers to positive real time. Since the Newton-type equations of motion for admissible classical trajectories have a Euclidean form (due to the sign-inverted force term), we give a brief summary of a couple of their explicit solutions, without recourse to Euclidean-time intuitions or the instanton lore of related quantum model systems.

19.
arXiv (CS.AI) 2026-06-15

COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

arXiv:2512.02318v4. This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface where an adversary can cheaply automate CAPTCHA solving using off-the-shelf models. We evaluate 7 representative MLLMs on 18 real-world CAPTCHA task types, measuring single-shot accuracy, success under limited retries, end-to-end latency, and per-solve cost. We further validate our findings through a supplemental external dataset and an adaptive-attacker setting with session memory, while also analyzing the impact of task-specific prompt engineering and few-shot demonstrations on solver effectiveness. We reveal that MLLMs can reliably solve recognition-oriented and low-interaction CAPTCHA tasks at human-like cost and latency, whereas tasks requiring fine-grained localization, multi-step spatial reasoning, or cross-frame consistency remain significantly harder for current models. By examining the reasoning traces of such MLLMs, we investigate the underlying mechanisms of why models succeed or fail on specific CAPTCHA puzzles and use these insights to derive defense-oriented guidelines for selecting and strengthening CAPTCHA tasks. To validate these principles, we present a proof-of-concept by hardening a vulnerable CAPTCHA type using our guidelines. We demonstrate that incorporating fine-grained localization and implicit counting reduces the success rate of state-of-the-art MLLMs from over 95% to 0%, confirming that structural changes can effectively mitigate the threat. We conclude by emphasizing the urgent need for CAPTCHA redesign as MLLM capabilities increasingly threaten existing defenses. Code availability: https://doi.org/10.5281/zenodo.20406852.
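To make the relation between the metrics listed in this abstract concrete, the Python sketch below links single-shot accuracy to success under limited retries and to expected per-solve cost. It assumes that attempts are independent with a fixed success probability, which is a simplification; the abstract does not describe the paper's own measurement protocol, and the function names are hypothetical.

```python
def success_within(p_single, max_attempts):
    """Probability of at least one success in max_attempts independent tries."""
    return 1.0 - (1.0 - p_single) ** max_attempts


def expected_cost_per_solve(p_single, cost_per_attempt):
    """Expected spend per successful solve when retries are unlimited.

    The number of attempts until the first success is geometric with
    mean 1 / p_single, so the expected cost is cost_per_attempt / p_single.
    """
    if p_single <= 0.0:
        return float("inf")
    return cost_per_attempt / p_single
```

Under these assumptions a solver with 50% single-shot accuracy succeeds within three attempts 87.5% of the time, which is why retry limits alone are a weak defense against moderately accurate solvers.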

20.
arXiv (quant-ph) 2026-06-16

Against probability: A quantum state is more than a list of probability distributions

arXiv:2601.18872v2. The state of a quantum system can be represented by listing the outcome probabilities for a tomographically complete set of measurements. Such representations appear throughout physics, for example, in quantum field theory via correlation functions and in quantum foundations within generalized probabilistic frameworks. In this paper, we show a no-go result: To enable useful statements, the probability representation must be topologically robust, preserving the notion of closeness between states. Yet, a topologically robust probability representation cannot simultaneously retain other essential structure, such as the subsystem structure.

21.
arXiv (CS.LG) 2026-06-15

Conformal calibration and look-elsewhere effect in anomaly detection for new-physics searches

arXiv:2606.13780v1. Machine-learned anomaly detection is reshaping searches for new physics, but it has outrun the statistics used to interpret it. A raw anomaly score has no calibrated meaning, a model that scans many regions inflates the look-elsewhere effect, and the asymptotic significances the field relies on are blind to the background mismodelling that anomaly detectors are especially prone to. We propose a calibration layer, built on conformal prediction, that turns any anomaly score into a defensible significance with distribution-free, finite-sample guarantees. Conformal prediction converts scores into valid local p-values, weighted and Mondrian variants repair the sideband-to-signal-region exchangeability failures that resonant searches suffer, and a Gross-Vitells step carries the result through to a look-elsewhere-aware global significance. The layer does two things at once. It exposes miscalibration that the standard pipeline cannot see, and it corrects it without retraining the detector. On public LHC Olympics data, a classifier develops a substructure-mass correlation that makes sideband-calibrated background p-values anti-conservative. Taken at face value, this manufactures a $\sim 46\sigma$ excess from background sculpting alone, which the label-free weighted correction removes, restoring an honest null. When run as a blind wide-mass bump hunt, the standard asymptotic and unweighted procedures fabricate $\gtrsim10\sigma$ excesses and $\approx5\sigma$ excesses even in signal-free windows, while the conformal layer raises no false alarms and its global false-positive rate is verified on background-only pseudoexperiments. The result is an auditable, detector-agnostic path from an uncalibrated score to a trials-factor-aware significance, ready to be folded into experimental anomaly searches.
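The step from anomaly scores to local p-values mentioned in this abstract can be sketched with the standard unweighted conformal p-value. The Python snippet below assumes that higher scores mean "more anomalous" and that the calibration scores come from background-only events exchangeable with the test event. The weighted and Mondrian variants and the Gross-Vitells step are not shown, and the function name is hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Unweighted conformal p-value for a single test score.

    Counts the calibration scores at least as extreme as the test score and
    applies the +1 correction in numerator and denominator, which gives
    finite-sample validity under exchangeability.
    """
    n = len(calibration_scores)
    at_least_as_extreme = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least_as_extreme) / (n + 1)
```

With n calibration events the smallest attainable p-value is 1 / (n + 1), so the size of the calibration sample bounds the significance that can be claimed locally.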

22.
arXiv (CS.CV) 2026-06-11

Cross-Domain Multi-Person Human Activity Recognition via Near-Field Wi-Fi Sensing

Wi-Fi-based human activity recognition (HAR) provides substantial convenience and has emerged as a thriving research field, yet the coarse spatial resolution inherent to Wi-Fi significantly hinders its ability to distinguish multiple subjects. By exploiting the near-field domination effect, establishing a dedicated sensing link for each subject through their personal Wi-Fi device offers a promising solution for multi-person HAR under native traffic. However, due to the subject-specific characteristics and irregular patterns of near-field signals, HAR neural network models require fine-tuning (FT) for cross-domain adaptation, which becomes particularly challenging when certain categories are unavailable. In this paper, we propose WiAnchor, a novel training framework for efficient cross-domain adaptation in the presence of incomplete activity categories. This framework processes Wi-Fi signals embedded with irregular time information in three steps: during pre-training, we enlarge inter-class feature margins to enhance the separability of activities; in the FT stage, we introduce an anchor matching mechanism for cross-domain adaptation, filtering subject-specific interference informed by incomplete activity categories, rather than attempting to extract complete features from them; finally, the recognition of input samples is further improved based on their feature-level similarity with anchors. We construct a comprehensive dataset to thoroughly evaluate WiAnchor, achieving over 90% cross-domain accuracy with absent activity categories.
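The final step described in this abstract, recognition based on feature-level similarity with anchors, can be illustrated by a nearest-anchor rule. The Python sketch below assumes one anchor vector per activity and cosine similarity as the measure. Both are assumptions made for illustration; the abstract does not specify how WiAnchor builds or matches its anchors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify_by_anchor(feature, anchors):
    """Return the label whose anchor is most similar to the input feature.

    anchors maps an activity label to its (hypothetical) anchor vector.
    """
    return max(anchors, key=lambda label: cosine_similarity(feature, anchors[label]))
```

For example, with anchors `{"walk": [1.0, 0.0], "sit": [0.0, 1.0]}`, the feature `[0.9, 0.2]` is assigned to `"walk"`.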

23.
arXiv (CS.AI) 2026-06-24

RetiSEM: Generalising Causal Models for Fragmented Biomedical Data

arXiv:2606.24488v1. Learning causal models from fragmented biomedical data is challenging because clinical, molecular, and imaging variables are often incomplete or not jointly observed. We propose RetiSEM, a domain-constrained structural equation modelling (SEM) framework for causal graph recovery and mediation analysis under limited multimodal resources. The framework organises variables into biologically informed blocks, applies forbidden-edge constraints, and decomposes pathway-level effects into TE, NDE, and NIE components. We evaluate RetiSEM across ten synthetic benchmark scenarios that vary in dimensionality, nonlinearity, causal depth, and pathway structure, together with a fragmented real-world setting that combines NHANES clinical variables with externally derived retinal representations. RetiSEM achieves lower structural error and higher causal accuracy than unconstrained baselines across the synthetic benchmarks. In the real-data analysis, retinal variables behave mainly as downstream biomarker-like indicators, with smaller but detectable indirect effects. These findings support our strategy as an interpretable framework for testing structured causal hypotheses in limited-resource biomedical AI. The code and resources for this work are publicly available at: https://github.com/Inamullah-Colab/ReitSEM.
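The effect decomposition named in this abstract can be illustrated in the simplest setting. The sketch reads TE, NDE, and NIE as total, natural direct, and natural indirect effects, and assumes a linear SEM in which an exposure X affects an outcome Y directly and through parallel mediators, with no interactions. The Python function below is a hypothetical illustration of that textbook case, not RetiSEM's estimator.

```python
def decompose_effects(direct, mediator_paths):
    """Decompose the effect of X on Y in a linear SEM without interactions.

    direct          coefficient of X in the equation for Y (the direct path)
    mediator_paths  list of (a, b) pairs, where a is the X -> M coefficient
                    and b is the M -> Y coefficient for one mediator M
    """
    nde = direct
    # Each indirect path contributes the product of its two coefficients.
    nie = sum(a * b for a, b in mediator_paths)
    return {"TE": nde + nie, "NDE": nde, "NIE": nie}
```

In this linear case the total effect is the sum of the direct and indirect parts, so a small NIE relative to TE indicates that little of the effect flows through the mediators.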

24.
arXiv (CS.CV) 2026-06-16

MolSight: Molecular Property Prediction with Images

Every molecule ever synthesised can be drawn as a 2D skeletal diagram, yet in modern property prediction this universally available representation has received less focus in favour of molecular graphs, 3D conformers, or billion-parameter language models, each imposing its own computational and data-engineering overhead. We present MolSight, the first systematic large-scale study of vision-based Molecular Property Prediction (MPP). Using 10 vision architectures, 7 pre-training strategies, and 2M molecule images, we evaluate performance across 10 downstream tasks spanning physical-property regression, drug-discovery classification, and quantum-chemistry prediction. To account for the wide variation in structural complexity across pre-training molecules, we further propose a chemistry-informed curriculum: five structural complexity descriptors partition the corpus into five tiers of increasing chemical difficulty, consistently outperforming non-curriculum baselines. We show that a single rendered bond-line image, processed by a vision encoder, is sufficient for competitive molecular property prediction, i.e. chemical insight from sight alone. The best curriculum-trained configuration achieves the top result on 5 of 10 benchmarks and top two on all 10, at 80× lower FLOPs than the nearest multi-modal competitor.
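The curriculum described in this abstract partitions the pre-training corpus into five tiers of increasing difficulty. A minimal Python sketch of such a partition is given below, assuming each molecule has already been reduced to a single complexity score and that tiers are equal-sized rank bands. How MolSight combines its five descriptors is not stated in the abstract, so the scoring and the function name are assumptions.

```python
def partition_into_tiers(scores, n_tiers=5):
    """Split item indices into n_tiers groups of increasing complexity score.

    Returns a list of n_tiers lists; tier 0 holds the indices of the
    simplest items and the last tier holds the most complex ones.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    tiers = [[] for _ in range(n_tiers)]
    for rank, idx in enumerate(order):
        # Map the rank to an equal-sized band, clamped to the last tier.
        tier = min(rank * n_tiers // len(scores), n_tiers - 1)
        tiers[tier].append(idx)
    return tiers
```

A curriculum would then present tier 0 first and add later tiers as training progresses.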

25.
arXiv (CS.CL) 2026-06-24

A Pāninian Foundation for Indic Language Processing

More than a billion people communicate in Indic languages, yet the natural language processing infrastructure serving them remains fragmented and underdeveloped. The cause is structural: the field organizes its tools and benchmarks around individual languages or small subsets of genealogical language families, building separate analyzers, parsers, and datasets for each language and starting over for the next. This overlooks a deep regularity. Through more than two millennia of convergence around Sanskrit, Indic languages came to share a morphosyntactic architecture formalized in Pānini's grammar, the Astādhyāyī. This cuts across genealogical lines, uniting languages through a common framework. We argue that this Pāninian framework supplies a unifying computational architecture the field has lacked, and that benchmarks grounded explicitly in it would make Indic language systems more accurate, more data-efficient, and more transferable, effectively merging many apparently disparate and sparse Indic language resources into a single high-resource metalanguage bedrock. We propose a four-part benchmark suite to render this shared architecture explicit, measurable, and ready to be leveraged for practical applications. Moreover, we underscore the question it raises for interpretability research: whether neural models trained on these languages come to represent Pānini's categories on their own.