Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-11

Point-Identification of a Robust Predictor Under Latent Shift with Imperfect Proxies

arXiv:2603.15158v2 Announce Type: replace Abstract: Addressing the domain adaptation problem becomes more challenging when distribution shifts across domains stem from latent confounders that affect both covariates and outcomes. Existing proxy-based approaches that address latent shift rely on a strong completeness assumption to uniquely determine (point-identify) a robust predictor. Completeness requires that proxies have sufficient information about variations in latent confounders. For imperfect proxies the mapping from confounders to the space of proxy distributions is non-injective, and multiple latent confounder values can generate the same proxy distribution. This breaks the completeness assumption and observed data are consistent with multiple potential predictors (set-identified). To address this, we introduce latent equivalent classes (LECs). LECs are defined as groups of latent confounders that induce the same conditional proxy distribution. We show that point-identification for the robust predictor remains achievable as long as multiple domains differ sufficiently in how they mix proxy-induced LECs to form the robust predictor. This domain diversity condition is formalized as a cross-domain rank condition on the mixture weights, which is substantially weaker assumption than completeness. We introduce the Proximal Quasi-Bayesian Active learning (PQAL) framework, which actively queries a small, targeted set of diverse domains that satisfy this rank condition. PQAL can recover the point-identified predictor, demonstrates robustness to varying degrees of shift and outperforms previous methods on synthetic data and semi-synthetic dSprites, IHDP, ACS Folktables datasets.

02.
arXiv (CS.LG) 2026-06-17

Sign-Rank, Index, and List Replicability: Connections and Separations

arXiv:2606.18236v1 Announce Type: new Abstract: In learning theory, the sign rank of a binary concept class captures the smallest dimension in which it can be represented by points and halfspaces. Despite tremendous interest, lower bounds on sign rank are notoriously difficult to come by. Two recent approaches to the problem establish lower bounds on sign rank by measures that are easier to analyze: the $\mathbb{Z}_2$-index and the list replicability number. We order these measures, showing that the $\mathbb{Z}_2$-index is upper-bounded by a linear function of the list replicability number. As a main consequence, we obtain a strong separation between sign rank and $\mathbb{Z}_2$-index, thereby resolving a question of Frick, Hosseini, and Vasileuski. This motivates a thorough study of list replicability, the stronger of the two lower-bounding measures. We establish upper bounds on the list replicability number by two combinatorial measures: height and minimum star number. We also prove a fundamental composition result, showing that the product of two concept classes has list replicability number bounded by the sum of the list replicability numbers of the two classes.

03.
medRxiv (Medicine) 2026-06-15

Neural Correlates of Human Food Memory link to Microbial, Homeostatic, and Hedonic Signals: Evidence from a Prebiotic Randomized Clinical Trial

Background Homeostatic and hedonic brain circuits regulate eating behavior but also shape how food memories are encoded and retrieved. Objective We examined neural correlates during food memory encoding and retrieval during functional MRI before and after a 14-day prebiotic intervention in a preregistered, double-blind crossover trial (NCT03829189). Design 55 healthy adults with overweight (19 females, age 28{+/-}6.5, BMI 25-30 kg/m2) underwent 3 Tesla task-based functional MRI before and after dietary intervention of prebiotic (30g inulin/day) or equicaloric placebo for 14 days. Peripheral metabolic, short-chain fatty acids (SCFA), and microbial markers using 16S rRNA analysis were assessed in fasting blood and feces. Results Food memory was enhanced by assigned reward value and engaged brain activity in hedonic regions, including the nucleus accumbens, orbitofrontal cortex, caudate, cingulate, dorsomedial prefrontal cortex, and ventral tegmental area, as well as homeostatic and memory-related such as the hypothalamus and the hippocampus. Higher neural activations during food encoding were related to higher Actinobacteriota abundance, fecal SCFA acetate, and creatinine levels, and lower ghrelin levels. Activations in reward-related and homeostatic brain areas partially correlated with insulin, glucagon-like peptide-1, leptin, and thyroid-stimulating hormone levels. Neural activations related to food memory decreased after prebiotic intervention. The prebiotic supplementation induced decrease of hippocampal activity during food encoding related to changes in gut microbiota Firmicutes abundance. Conclusions This study indicates that neuronal food-related memory processes depend on homeostatic and hedonic brain signals modulated by the gut-brain axis. Our findings raise implications for the treatment of obesity and substance use disorder.

04.
arXiv (quant-ph) 2026-06-19

Space-time duality approach to (inhomogeneous) integrable quenches

arXiv:2606.20445v1 Announce Type: cross Abstract: Characterising the universal aspects of non-equilibrium quantum many-body dynamics is one of the key goals of this century's physics research. Progress, however, is hindered by the lack of general theoretical frameworks for studying interacting quantum matter far from equilibrium. A recent breakthrough has been the realization that several key non-equilibrium quantities, such as the rate of growth of entanglement or the fluctuations of conserved charges within finite subsystems, can be related to equilibrium properties through a space-time duality that effectively exchanges the roles of space and time. This observation effectively enables the study of non-equilibrium phenomena using tools and concepts borrowed from equilibrium statistical mechanics and thermodynamics. A first proof of principle of this framework, dubbed space-time duality approach (SDA), was provided by interacting integrable systems, where thermodynamic properties can often be characterized exactly, while dynamical quantities typically remain beyond analytical reach. Subsequent developments, however, revealed that the SDA suffered from an intrinsic ambiguity, restricting its applicability to homogeneous quenches and to charge fluctuations arising from symmetric initial states. Here we resolve this ambiguity from first principles and derive closed-form predictions for entanglement growth and charge fluctuations after general quantum quenches. We benchmark our results against the exact analytical solution of the Rule 54 quantum cellular automaton and extensive TEBD simulations of the XXZ chain. Moreover we show that, when specialised to the entanglement entropy, our framework naturally reproduces the predictions of the quasiparticle picture.

05.
arXiv (CS.LG) 2026-06-15

Beyond a Single Explanation of the Adam–SGD Gap

arXiv:2606.14259v1 Announce Type: new Abstract: Prior work has identified several factors that can contribute to the performance gap between Adam and SGD, spanning data aspects, architecture design, and optimization properties. Yet these explanations are often studied in isolation, leaving their relative importance unclear. In this work, we revisit these hypotheses through a controlled empirical study across vision, language, genomics, and graph tasks, spanning modern and classical architectures, and carefully designed training setups. Our results suggest that no single factor consistently explains the Adam–SGD gap. For instance, the Adam advantage can (1) persist under a uniform vocabulary distribution yet nearly disappear under a heavy-tailed one; (2) reverse in favor of SGD in softmax-attention models; and (3) become larger under soft architectural modifications, e.g., when ReLU is replaced by a GeLU nonlinearity. This suggests that the gap arises from nontrivial data and architecture interactions, rather than from a single common factor. Yet, we observe a pattern across our settings: a crossover batch size at which the relative advantage shifts from SGD to Adam as the batch size scales. These empirical results are captured by our theoretical gap model, which predicts this batch-size-dependent crossover. Our perspective helps reconcile several existing hypotheses while offering practical insights across domains.

06.
arXiv (CS.LG) 2026-06-17

Continual Self-Improvement with Lightweight Experiential Latent Memories

arXiv:2606.17803v1 Announce Type: new Abstract: Large language models achieve strong reasoning performance by scaling inference-time compute, yet remain fundamentally stateless, discarding the rich, self-produced reasoning traces generated during this process. We investigate whether models can instead learn online from this experience, converting transient computation (reasoning traces) into persistent reusable knowledge, and without external supervision or access to future data. We show that In-Context Learning (ICL) over raw reasoning traces fails to generalize, reflecting a fundamental limitation of token-level reuse: individual traces lack the abstraction needed for transfer, even after refinement (e.g. self-reflection). In contrast, drawing inspiration from recent works on unsupervised reinforcement learning, we find that lightweight per-instance training with self-generated test-time signals (majority voting) as rewards yields substantial gains, often surpassing full-dataset offline training, motivating a shift from raw traces to learned latent representations. Building on this insight, we propose an online method that distills inference-time compute spent on encountered problems into compact modular latent memories capturing the underlying reasoning structure. These memories are stored and retrieved for future inputs, enabling continual improvement while avoiding catastrophic forgetting through modular design. Importantly, our method is highly efficient, parametrized as extremely lightweight soft prompt memories (~0.001% of model parameters) and trained with only a few gradient steps, yet achieving performance competitive with full parametric updates and offline training. Across challenging mathematical reasoning benchmarks, our approach significantly outperforms zero-shot and raw data ICL baselines, while transferring effectively across datasets.

07.
arXiv (quant-ph) 2026-06-12

Efficient certification of intractable quantum states with few Pauli measurements

arXiv:2511.07300v2 Announce Type: replace Abstract: Efficient verification of quantum computational resources is crucial as experiments advance toward fault-tolerance. Universal quantum computation can be achieved by consuming resource states through simple Pauli measurements, yet a significant gap remains between states that are easy to certify and those required for universality. We focus on Clifford-enhanced Product States, a class of resource states obtained by applying Clifford circuits to a product of single-qubit, potentially magic, states. While essential for universal computation, the certification of such states has previously relied on query oracles that are \#P-hard to implement, leaving their efficient, oracle-free verification an open challenge. In this work, we demonstrate that such classically intractable resource states can be efficiently verified using only Pauli measurements. Our protocol achieves sample- and time-efficiency in both i.i.d.\ and adversarial settings. This work fills a gap in Pauli-based certification, providing a new practical pathway to verify resource states that drive universal Pauli-based quantum computation.

08.
arXiv (CS.CV) 2026-06-17

Learning a Maximum Entropy Model for Visual Textures using Diffusion

Visual textures – spatially homogeneous image regions containing repeated elements (e.g. a field of grass, the bark of a tree) – are ubiquitous in visual scenes and provide important cues for recognizing and analyzing materials and objects. A number of existing texture models extract essential statistics from a single texture image, and can then generate high-quality samples that are visually similar to the original by matching these statistics. However, their statistics are either hand-designed or based on a network pretrained for another purpose (e.g., object recognition). Here, we develop the first principled method for unsupervised learning of a set of statistics that are used to constrain a maximum entropy probability model. We leverage methods developed for generative diffusion models to derive training and sampling procedures, and compare these to the traditional method of sampling via matching the statistics. Despite the compactness of our trained model (512 statistics), it generates texture images whose quality is as good as or better than the current state-of-the-art model (~177k statistics). A more direct comparison of the two models, obtained by synthesizing images that are indistinguishable for one model but maximally different for the other, reveals their relative strengths and weaknesses. Finally, we show that unlike previous statistical texture models, a straight trajectory in the representation space of our model generates homogeneous texture samples that interpolate smoothly between the features of the two end points.

09.
arXiv (quant-ph) 2026-06-17

Robust Spin Splitting and Strain-Controlled Optical Response in Monolayer CrC2N4 for Valleytronic and Optoelectronic Applications

arXiv:2606.17329v1 Announce Type: cross Abstract: Monolayer CrC2N4 recently emerged as a promising two-dimensional semiconductor, yet its spin-orbit-coupled (SOC) physics and strain-tunable optical response remained largely unexplored. Here, we investigated the electronic, valley, charge-transfer, and optical properties of pristine and biaxially strained monolayer CrC2N4 using first-principles calculations. The monolayer exhibited a direct band gap at the K/K' valleys. SOC produced valley contrasting out-of-plane spin polarization, yielding a moderate valence band spin splitting of 51.9 meV and a small conduction band spin splitting of 1.7 meV. Orbital-resolved analysis showed that the edge states were mainly governed by Cr-d and N-p hybridization, while Bader analysis indicated polar-covalent bonding through charge transfer toward N atoms. Biaxial strain in the range of -4% to +4% tuned the band gap from 1.987 to 1.421 eV and drove an indirect-to-direct gap transition near -1% strain. Tensile strain enhanced the Berry curvature and red-shifted the optical response toward the visible-near-infrared region. These results suggested monolayer CrC2N4 as a promising platform for strain-engineered valleytronic and optoelectronic device applications.

10.
arXiv (CS.AI) 2026-06-15

Learning High Coverage Discriminative Parsimonious Rulesets

arXiv:2606.14156v1 Announce Type: cross Abstract: Learning systems based on IF-THEN rule representations readily offer interpretability, making them a crucial focus in contemporary AI research. A key objective for such rule sets is to achieve both high discriminative power and interpretability. While existing state-of-the-art algorithms implicitly prioritize predictive accuracy, they often fall short on one or more quality metrics that ensure interpretability, such as coverage and parsimony of rule sets. Motivated by this, this paper propose the development of CDPR, which aims to create highly accurate and interpretable rule sets for classification problems. To the best of our knowledge, this represents the first attempt to establish such an approach. In this study, we introduce two algorithms rooted in submodular maximization, which not only provide provable guarantees on coverage but also yield rule sets that are both discriminative and parsimonious. We empirically demonstrate that rule sets learned through our approaches achieve higher accuracy and interpretability and has more than a 2.5-fold improvement in average coverage rates when compared to the next best algorithm.

11.
Nature (Science) 2026-06-17

These ‘master’ proteins protect us from deadly mutations — and could inspire new drugs

Authors:

Biology has clever ways to mask the effects of potentially harmful gene mutations. Scientists are investigating how this ‘buffering’ works — and how to exploit it. Biology has clever ways to mask the effects of potentially harmful gene mutations. Scientists are investigating how this ‘buffering’ works — and how to exploit it.

12.
arXiv (CS.CV) 2026-06-16

Leptomeningeal Collateral Detection on DSA via Vessel-Graph Neural Networks

Leptomeningeal collaterals (LMCs) are an important prognostic factor in acute ischemic stroke. Existing automated methods rely on CT angiography (CTA), but individual LMCs are often too small to be resolved on CTA, limiting these methods to coarse collateral scoring. Digital subtraction angiography (DSA) visualizes individual collaterals at superior resolution, yet current assessment remains subjective, relying on manual grading scales that suffer from poor inter-rater agreement. We present a framework that formulates collateral detection as the classification of individual vessel segments on a graph derived from DSA. A hybrid graph-pixel architecture combines a topology-aware graph branch with a dense pixel branch, fused in a shared node-probability space. In a five-fold cross-validation setting, the fused model achieves a PR-AUC of 0.434, outperforming the graph-only (0.403) and pixel-only (0.362) baselines. To our knowledge, this is the first method to enable the individualization of LMCs in DSA, allowing for precise per-vessel quantitative assessment. This integration shifts DSA assessment toward objective evaluation, supporting future biomarker and pattern discovery for individual LMCs.

13.
arXiv (CS.CL) 2026-06-11

Geometric Metrics and LLMs: What They Measure and When They Work

We present a systematic stress-test of geometric metrics for LLM evaluation. Rank-based geometric properties of internal representations have shown promise as reference-free quality signals, but the conditions under which they are reliable remain unclear. We evaluate eight commonly-used metrics: intrinsic-dimensionality estimators, spectral norms, and related quantities across six tester models (0.5-8B) and eight generators on contrasting tasks, separating genuine geometric signal from text-length effects and from what standard text statistics already capture. Three findings emerge. First, some metrics (notably Schatten Norm and MOM) mainly reflect output length, and their apparent discriminative power collapses once length is controlled. Second, geometric metrics add modest but real information beyond text statistics: combined with them, a classifier reaches 78% accuracy on 6-way generator identification versus 69% for text statistics alone. Third, rather than tracking a general notion of text quality, the metrics demonstrate only moderate association between the intrinsic-dimensionality and lexical diversity (RTTR). We give use-case-specific recommendations and identify failure detection as the most promising near-term application.

14.
arXiv (CS.LG) 2026-06-16

Priority-Aware Shapley Value

arXiv:2602.09326v2 Announce Type: replace Abstract: Shapley values are widely used for model-agnostic data valuation and feature attribution, yet they implicitly assume contributors are interchangeable. This can be problematic when contributors are dependent (e.g., reused/augmented data or causal feature orderings) or when contributions should be adjusted by factors such as trust or risk. We propose Priority-Aware Shapley Value (PASV), which incorporates both hard precedence constraints and soft, contributor-specific priority weights. PASV is applicable to general precedence structures, recovers precedence-only and weight-only Shapley variants as special cases, and is uniquely characterized by natural axioms. We develop an efficient adjacent-swap Metropolis-Hastings sampler for scalable Monte Carlo estimation and analyze limiting regimes induced by extreme priority weights. Experiments on data valuation (MNIST/CIFAR10) and feature attribution (Census Income) demonstrate more structure-faithful allocations and a practical sensitivity analysis via our proposed "priority sweeping".

15.
arXiv (quant-ph) 2026-06-19

QPU-scale randomized benchmarking via Bell-pair injection

arXiv:2606.20123v1 Announce Type: new Abstract: Mirror randomized benchmarking (MRB) is an established technique that provides a global error metric at the scale of a whole QPU. To expand upon this we introduce Mirror Quantum Awesomeness (MQA), a hybrid protocol that adds a structured entangling layer to MRB circuits. This enables per-edge correlation dynamics to be tracked via mutual information while preserving the MRB infidelity estimate. The resulting analysis of the injected entangled pairs locates a critical circuit depth, beyond which rudimentary error mitigation techniques can be expected to fail. A topological variant, Topological MQA, supplies a second critical depth via a decoder based on the surface-code decoding problem. Both are validated in simulation and demonstrated on the 156-qubit \texttt{ibm\_fez} and \texttt{ibm\_kingston} processors, where MQA closely agrees with MRB on the entanglement infidelity and the critical depth for \texttt{ibm\_fez} is found to be $\sim 50$.

16.
arXiv (CS.CV) 2026-06-17

Bayesian Magnetic Resonance Joint Image Reconstruction and Uncertainty Quantification using Sparsity Prior Models and Markov Chain Monte Carlo Sampling

We propose a novel framework for uncertainty quantification using compressed sensing magnetic resonance image reconstruction. The problem is formulated within a Bayesian framework as a linear inverse problem, with prior distributions assigned to the unknown model parameters. Specifically, the image to be reconstructed is assumed to be sparse in a given basis. We develop a general framework applicable to any basis and as examples, we test the sparsity of the image in its (1) spatial gradients using a total variation prior model, and in its (2) wavelet transform. A Markov chain Monte Carlo (MCMC) method, based on a split-and-augmented Gibbs sampler, is then employed to sample from the posterior distribution of the unknown parameters. The non-differentiable conditional distributions are efficiently sampled using a proximal MCMC method. The proposed algorithms are validated on both single-coil and multi-coil datasets using various k-space sub-sampling patterns and ratios. The results demonstrate the superior performance of each proposed approach in reconstructing images compared to its counterpart optimisation-based method. Moreover, our framework effectively quantifies uncertainty, showing a notable correlation between estimated uncertainty maps and error maps computed using ground truth and reconstructed images, compared with existing deep learning-based methods.

17.
arXiv (CS.CL) 2026-06-18

Trust Region On-Policy Distillation

On-Policy Distillation (OPD) is a fundamental technique for efficient post-training of large language models (LLMs), with broad applications in agent learning, multi-task enhancement, and model compression. However, OPD training becomes unstable when the teacher and student distributions differ substantially, as teacher supervision on student-generated tokens may yield unreliable policy gradients and even cause optimization failure. This work addresses reliable on-policy token-level supervision through credit assignment strategies, and proposes Trust Region On-Policy Distillation, TrOPD. It features the following characteristics: 1) Trust-Region On-Policy Learning: TrOPD performs OPD only in regions where the teacher provides reliable supervision, mitigating the optimization difficulty of the K1 reverse-KL estimator under distribution mismatch. 2) Outlier Estimation: For outlier regions, we explore gradient clipping, masking, and forward-KL estimation to reduce the adverse effects of unreliable supervision. 3) Off-Policy Guidance: The student continues generation from teacher prefixes and uses forward KL to imitate off-policy guidance, encouraging on-policy exploration toward reliable regions. Experiments show that TrOPD consistently outperforms SoTA OPD baselines, including OPD, EOPD, and REOPOLD, across mathematical reasoning, code generation, and general-domain benchmarks.

18.
arXiv (CS.CL) 2026-06-12

Does AI Reviewer See the Full Picture? Attacking and Defending Multimodal Peer Review

The integration of Large Language Models (LLMs) and Multimodal LLMs (MLLMs) into scientific peer-review workflows introduces novel and significant risks for adversarial manipulation, especially given the multimodal nature of scientific papers where figures, not just text, convey core evidence. This creates a significant gap: current robustness studies on AI peer-review are overwhelmingly text-only. Moreover, the problem is distinct from standard jailbreaking, as a peer-review attack seeks to induce a domain-specific, targeted failure (e.g., "inflate this score") rather than a general safety policy violation, for which no practical defenses exist. To address this, we introduce PaperGuard, the first comprehensive benchmark designed to systematically evaluate and defend AI-generated peer-review against these domain-specific, cross-modal attacks. Our framework is built on three pillars: (1) a new multimodal peer-review dataset spanning multiple scientific domains; (2) a unified suite of attacks, including black-box prompt injections and white-box perturbations, specifically designed to target both text (GCG) and figures (PGD); and (3) a practical defense, motivated by the long-context challenge of academic papers, that uses chunk-based embedding search to efficiently localize and mitigate harmful instructions. Our extensive experiments, conducted across state-of-the-art models, confirm that AI reviewers are pervasively vulnerable. PaperGuard establishes the foundational benchmark, protocols, and actionable defense necessary to pioneer trustworthy, attack-resilient AI-assisted scholarly reviewing.

19.
arXiv (CS.CV) 2026-06-18

GUMP-Net: An interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation

Pelvic segmentation is one of the most important and fundamental research problems in precise and intelligent diagnosis and treatment, as well as surgical planning and navigation for pelvic fractures. By combining an improved geodesic active contour model with deep neural networks, we propose GUMP-Net, an interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation, in which three network modules are designed to constitute the overall segmentation framework together: the object detection module for automatic level set initialization, the edge detector module for learning an anatomy-aware edge detector function and the iteration module for deep level set evolution. Leveraging the advantages of level set representation and deep learning, GUMP-Net shows more accurate, robust and consistent segmentation performance, especially in small training data situation, compared to the state-of-the-art methods. Extensive experiments on pelvic datasets demonstrate the rationality and effectiveness of the proposed algorithm. Further experiments extended to ankle dataset indicate broader applications to other anatomies. The proposed algorithm not only provides an efficient segmentation method for complex fracture reduction, but also gives an interpretable geometric perspective for understanding deep learning segmentation.

20.
arXiv (CS.CV) 2026-06-18

Moving Beyond Diversity: Visual Token Pruning as Subspace Reconstruction for Efficient VLMs

Despite their remarkable performance, Vision Language Models (VLMs) incur substantial computational overhead due to the large number of visual tokens. While diversity maximization has become a dominant strategy for token reduction, existing methods rely on cosine-based normalized similarity that discards magnitude information, failing to faithfully approximate the original feature representation and leading to suboptimal performance, particularly on compositional multi-skill reasoning tasks. In this paper, we introduce SPARE, a subspace reconstruction method that reformulates token pruning as a column subset selection problem and explicitly minimizes reconstruction error. By iteratively selecting tokens with large projection residuals, SPARE performs reconstruction-driven pruning beyond angular diversity. Moreover, we reveal a counterintuitive anti-relevance phenomenon: tokens with lower image-text relevance score can better preserve contextual information. Based on this finding, we incorporate anti-relevance into SPARE as an additional selection criterion to promote context-aware token selection. Extensive experiments across multiple VLMs and benchmarks demonstrate that SPARE consistently achieves state-of-the-art performance, with strong gains on compositional tasks. When applied to LLaVA, SPARE removes up to 94% of visual tokens while retaining 95% of the baseline performance, all in a fully training-free manner.

21.
bioRxiv (Bioinfo) 2026-06-11

STITCH links cellular morphology and gene expression in spatial transcriptomics

In situ spatial (ISS) sequencing can uncover co-variation between cellular morphology and gene expression in vivo. However, a principled and interpretable mathematical representation of morphology has not yet been applied in this context. In particular, current deep learning-based representations of cell images confound a cell's shape with its size. We present an interpretable representation of cellular boundary contours, based on tangent principal component analysis (TPCA) in a Kendall shape manifold, that captures size-independent contour shape features. This approach successfully recovers shape-perturbing genes in an RNAi screen than a previous metric geometry-based approach. We build on TPCA to develop STITCH (Shape-TranscriptomIc Correlation and Harmonization), an approach to reveal covariation between cell morphology with gene expression in ISS datasets. In a Xenium dataset, STITCH outperforms a deep learning-based approach in both recovering the layered organization of keratinocytes and a spatial gradient in nuclear eccentricity. Across samples in a melanoma CosMx dataset, STITCH reproducibly associates elongated and triangular fibroblasts with proximity to malignant cells and myofibroblast-like transcriptional program. Finally, STITCH independently recovers a known link between mesenchymal-like malignant cell states and increased cell area in two melanoma cohorts. STITCH can thus yield interpretable morphology-transcriptome relationships across cell types, patients, and spatial transcriptomics platforms.

22.
arXiv (math.PR) 2026-06-18

Metastability for the Curie-Weiss-Potts model with unbounded random interactions

arXiv:2505.11260v2 Announce Type: replace Abstract: We analyse the metastable behaviour of the disordered Curie–Weiss–Potts (DCWP) model subject to a Glauber dynamics. The model is a randomly disordered version of the mean-field $q$-spin Potts model (CWP), where the interaction coefficients between spins are general independent random variables. These random variables are chosen to have fixed mean (for simplicity taken to be $1$) and well defined cumulant generating function, with a fixed distribution not depending on the number of particles. The system evolves as a discrete-time Markov chain with single spin flip Metropolis dynamics at finite inverse temperature $\beta$. We provide a comparison of the metastable behaviour of the CWP and DCWP models, when $N \to \infty$. First, we establish the metastability of the CWP model and, using this result, prove metastability for the DCWP model (with high probability). We then determine the ratio between the metastable transition time for the DCWP model and the corresponding time for the CWP model. Specifically, we derive the asymptotic tail behavior and moments of this ratio. Our proof combines the potential-theoretic approach to metastability with concentration of measure techniques, the latter adapted to our specific context.

23.
arXiv (CS.AI) 2026-06-19

Confidence-Aware Automated Assessment of Student-Drawn Scientific Models

arXiv:2606.20264v1 Announce Type: new Abstract: Student-generated drawings are widely used in science education to assess learners' conceptual understanding in modeling-based tasks aligned with the Next Generation Science Standards (NGSS). However, scoring such drawings requires expert human judgment to interpret complex visual representations, making large-scale assessment costly to implement and sustain in classroom settings. In this work, we study automated scoring of student-generated scientific drawings using a vision-based model. We evaluate a Vision Transformer (ViT) with parameter-efficient adaptation and propose a confidence-aware scoring framework that derives response-level confidence from test-time predictive distributions. This confidence signal enables selective automation by scoring high-confidence responses automatically while deferring uncertain cases for human review. Experiments on six NGSS-aligned middle school assessment items show that the proposed approach improves scoring reliability while supporting a practical trade-off between automated coverage and scoring risk, highlighting the value of confidence-aware methods for trustworthy educational assessment.

24.
Nature (Science) 2026-06-17

Visualizing the impact of quenched disorder on 2D electron Wigner solids

Authors:

Electron Wigner solids (WSs)1–12 provide an ideal system for understanding the competing effects of electron–electron and electron–disorder interactions, a central unsolved problem in condensed matter physics. Progress in this topic has been limited by a lack of single-defect-resolved experimental measurements as well as accurate theoretical tools to enable realistic experiment/theory comparison. Here we overcome these limitations by combining atomically resolved scanning tunnelling microscopy (STM) with neural-quantum-state quantum Monte Carlo (NQS-QMC) simulation of disordered 2D electron WSs to discover new disorder-induced physical regimes of correlated electron behaviour. STM was used to image the electron density (ne)-dependent evolution of electron WSs in gate-tunable bilayer MoSe2 (BL-MoSe2) devices with varying long-range (nLR) and short-range (nSR) disorder densities. These images were compared with NQS-QMC simulations using realistic disorder maps extracted from experiment, thus allowing the roles of different disorder types to be disentangled. We identify two distinct physical regimes for disordered electron WSs that depend on nSR. For nSR ≲ ne, the WS behaviour is dominated by long-range disorder and features extensive mixed solid–liquid phases, a new type of local re-entrant melting/crystallization and prominent Friedel oscillations. By contrast, when nSR ≫ ne, these features are suppressed and a more robust amorphous WS phase emerges that persists to higher ne, highlighting the importance of short-range disorder in this regime. Our work establishes a powerful framework for studying disordered quantum solids through a combined experimental–theoretical approach. A technique combining atomically resolved scanning tunnelling microscopy with neural-quantum-state quantum Monte Carlo simulation of disordered 2D electron Wigner solids establishes a powerful framework to enable the clear identification of two distinct defect-induced disorder regimes.

25.
arXiv (CS.LG) 2026-06-15

DRIVE: Distributional and Retrieval-Augmented Bidding with Value Evaluation

arXiv:2606.14192v1 Announce Type: new Abstract: Auto-bidding is a core component of real-time advertising systems, where decisions must optimize long-term performance under budget and cost constraints, while online exploration is prohibitively risky. Offline reinforcement learning and, more recently, Transformer-based sequence modeling have shown promise for learning bidding policies from logged data, but their unimodal and purely parametric formulations often collapse multiple effective bidding strategies into suboptimal averaged actions and perform unreliably under sparse or long-tail traffic. To mitigate these limitations, we propose DRIVE (Distributional and Retrieval-Augmented Bidding with Value Evaluation), a unified Transformer-based framework that decouples candidate action generation from decision making for offline auto-bidding. DRIVE combines distributional action modeling, retrieval-augmented candidate generation from high-quality historical decisions, and value-based evaluation to select the most promising bid at inference time. Extensive experiments on AuctionNet and additional offline reinforcement learning benchmarks demonstrate that DRIVE consistently improves bidding performance and generalizes well across multiple Transformer-based methods.