Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-11

The $K$-th nearest neighbor random walk on a Poisson point process gets trapped

arXiv:2606.11271v1 Announce Type: new Abstract: The $K$-th nearest neighbor random walk $(X_n)_{n \geq 0}$ on a homogeneous Poisson point process $\chi$ on $\R^d$ ($d\geq 1$) starts at the origin and at each step picks its next Poisson point among its closest neighbors according to i.i.d. labels having the same distribution as $K$. Our main result (Theorem 1) states that the number of Poisson points visited by $(X_n)_{n \geq 0}$ admits an exponential decay whenever the random variable $K$ has bounded support (Assumption (BS)). In particular, the $K$-th nearest neighbor random walk visits finitely many Poisson points whenever $K$ satisfies Assumption (BS). To prove it, we introduce the key notion of pioneer point, which allows us to deal with the region of $\R^d$ already explored by $(X_n)_{n \geq 0}$. Still under Assumption (BS), we also prove an exponential decay for the Euclidean length of the trajectory performed by $(X_n)_{n \geq 0}$ (Theorem 2). Finally, and quite surprisingly, we exhibit an example of label distribution with bounded support for which the $K$-th nearest neighbor random walk discovers new Poisson points after a number of steps whose tail distribution is at least polynomial (Theorem 3).
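The walk described in this abstract can be simulated in a few lines. The following is only a minimal sketch under stated assumptions: the Poisson process is restricted to a finite square in dimension 2 (so boundary effects are ignored), the label law is uniform on {1, 2} (an illustrative choice with bounded support), and the function names `sample_poisson`, `sample_points` and `simulate_walk` are hypothetical, not taken from the paper.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw a Poisson(mean) count with Knuth's product-of-uniforms method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_points(rng, intensity, half_width):
    """Homogeneous Poisson points in the square [-half_width, half_width]^2."""
    area = (2.0 * half_width) ** 2
    n = sample_poisson(rng, intensity * area)
    return [
        (rng.uniform(-half_width, half_width), rng.uniform(-half_width, half_width))
        for _ in range(n)
    ]


def simulate_walk(points, draw_label, steps, rng):
    """Return the index of the Poisson point occupied after each step."""
    position, current = (0.0, 0.0), None  # the walk starts at the origin
    trajectory = []
    for _ in range(steps):
        k = draw_label(rng)  # i.i.d. label distributed as K
        # Rank all Poisson points except the current one by distance.
        ranked = sorted(
            (i for i in range(len(points)) if i != current),
            key=lambda i: math.dist(points[i], position),
        )
        current = ranked[k - 1]  # move to the k-th nearest Poisson point
        position = points[current]
        trajectory.append(current)
    return trajectory


rng = random.Random(0)
points = sample_points(rng, intensity=1.0, half_width=7.0)
# Assumed label law: K uniform on {1, 2} (bounded support).
trajectory = simulate_walk(points, lambda r: r.choice((1, 2)), steps=500, rng=rng)
print(len(set(trajectory)), "distinct points visited in", len(trajectory), "steps")
```

With the constant label K = 1 the walk follows a chain of nearest neighbors with strictly decreasing step lengths until it reaches a mutually nearest pair, after which it oscillates between those two points forever. This is the simplest instance of the trapping phenomenon.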

02.
arXiv (CS.LG) 2026-06-19

MolGraphBench: A Benchmark of GNN Architectures for Molecular Regression Tasks

arXiv:2602.20573v3 Announce Type: replace Abstract: Molecules are often represented as SMILES strings, which can be readily converted to hand-crafted descriptors or fingerprints (FP) for molecular property prediction. Research has demonstrated that SMILES can be converted to molecular graphs $G = (V, E)$, with atoms as nodes $(V)$ and bonds as edges $(E)$. These molecular graphs can subsequently be used to train graph neural network (GNN) models. Despite the recent surge in the application of GNNs (existing and novel architectures) to molecular property prediction, a rigorous benchmark is still lacking. We propose MolGraphBench, a comprehensive benchmark of four commonly used GNN models for molecular property prediction. Benchmarking results demonstrate graph convolutional networks (GCN) and graph isomorphism networks (GIN) as the optimal GNN architectures for molecular graph regression tasks, based on absolute performance, training efficiency, transfer learning and prediction quality. The study also indicates the non-complementary nature of molecular fingerprints in the fusion (GNN-FP) framework. Furthermore, our GNN models achieved superior or comparable performance to current state-of-the-art GNN baselines across three datasets (GCN with RMSE of $0.518$ on B3DB, GIN-FP with RMSE of $1.022$ on FreeSolv and GIN with MAE of $63.783$ on RT datasets). Findings from this study indicate that the type of GNN layer should be treated as a tunable hyperparameter rather than a fixed design choice to achieve superior performance.
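To make the graph representation $G = (V, E)$ concrete, here is a minimal sketch in plain Python. It hand-codes the heavy-atom graph of ethanol (SMILES "CCO") instead of parsing the string, and applies one propagation step of the standard GCN rule ReLU(D^{-1/2} (A + I) D^{-1/2} H W). The one-hot atom features and the identity weight matrix are assumptions chosen for readability; this is not the MolGraphBench implementation.

```python
import math

# Ethanol, SMILES "CCO", heavy atoms only: nodes are atoms, edges are bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]


def one_hot(symbol, vocabulary=("C", "O")):
    """Illustrative node feature: one-hot encoding of the element symbol."""
    return [1.0 if symbol == s else 0.0 for s in vocabulary]


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    degree = [sum(row) for row in a]  # degrees include the self-loop
    return [
        [a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def gcn_layer(a_hat, features, weight):
    """One GCN propagation step: ReLU(a_hat @ features @ weight)."""
    hw = [
        [sum(h[k] * weight[k][c] for k in range(len(weight))) for c in range(len(weight[0]))]
        for h in features
    ]
    mixed = [
        [sum(a_hat[i][j] * hw[j][c] for j in range(len(hw))) for c in range(len(hw[0]))]
        for i in range(len(a_hat))
    ]
    return [[max(0.0, value) for value in row] for row in mixed]


features = [one_hot(symbol) for symbol in atoms]
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumed weights; a trained model learns these
hidden = gcn_layer(normalized_adjacency(len(atoms), bonds), features, identity)
print(hidden)
```

After one step each atom's feature vector mixes in its bonded neighbors, so the terminal carbon and the oxygen, which start with different one-hot features, both pick up a contribution from the central carbon.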

03.
arXiv (math.PR) 2026-06-17

The Loss of Tension in an Infinite Membrane with Holes of Decaying Spatial Density

arXiv:2606.17792v1 Announce Type: new Abstract: What is the effect of randomly removing material from an infinite stretched membrane? Under what conditions can the membrane still sustain tension? This problem was introduced by Robert Connelly in connection with applications of rigidity theory in the natural sciences, and was later studied in M. V. Menshikov, K. A. Rybnikov, and S. E. Volkov, "The loss of tension in an infinite membrane with holes distributed according to a Poisson law" (2002); a discrete version was also considered in Robert Connelly, Konstantin Rybnikov, and Stanislav Volkov, "Percolation and the Loss of Tension in an Infinite Triangular Lattice" (2001). We study a mathematical framework based on a non-homogeneous Poisson point process whose intensity $\lambda$ tends to zero at infinity. The hole shapes are i.i.d. and independent of their locations. We show that if the intensity does not decay too quickly, then tension is still lost throughout the whole plane, as in the homogeneous model studied in 2002. Conversely, we give sufficient conditions under which complete loss of tension does not occur. Thus, both destruction and non-destruction regimes are possible even when the intensity tends to zero, indicating a phase transition in the model. The processes studied here are closely related to bootstrap percolation.

04.
arXiv (CS.AI) 2026-06-18

Controllable Quantum Memory Capacity in Quantum Reservoir Networks with Tunable partial-SWAPs

arXiv:2605.12713v3 Announce Type: replace-cross Abstract: In the field of quantum reservoir computing (QRC), many different computational models and architectures have been proposed. From these models, we identify feedback-based models – which use a feedback mechanism to re-embed classical measurements from the QRC – and recurrent models – which use a multi-register approach with memory and readout qubits – as the two major competing architectures that have been discussed and validated on hardware. In this paper, we advance upon the recurrent architectures, which employ a two-register approach to endow the QRC with a fading memory. While these approaches have been validated on hardware and have demonstrated great real-world performance on noisy intermediate-scale quantum (NISQ) quantum processing units (QPUs), the exact mechanism through which the memory capacity arises is not completely understood or fully controllable. With this, we augment the recurrent approaches and present a hardware-realizable mechanism, which we call a tunable partial-SWAP, that allows for the direct control of the rate of memory dissipation from a QRN implemented on a gate-based QPU. The theory behind this mechanism is discussed in terms of a controlled amplitude-damping channel, and validation experiments using a randomized short-term memory capacity (STMC) recall benchmark and the NARMA-5 dataset are conducted using simulation and IBM QPUs, respectively.

05.
arXiv (CS.CL) 2026-06-17

PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

As Large Language Model based agents enter autonomous scientific research, their ability to resist pseudoscience becomes increasingly important. Otherwise, such systems may rapidly generate plausible yet misleading studies that contaminate academic literature and erode trust in science. We present PseudoBench, an adversarial benchmark for evaluating whether agentic auto-research systems can identify and resist pseudoscientific narratives. PseudoBench contains 200 curated pseudoscientific claim-evidence pairs across five domains and evaluates agents through an end-to-end research pipeline from experiments to writing. Testing seven state-of-the-art agents, we find that current systems readily produce persuasive reports that align with pseudoscientific premises with near-zero refusal rates and the highest resistance of only 27.4%. Stronger agents risk packaging pseudoscience in more sophisticated scientific language, increasing its apparent credibility. These findings reveal an alarming capacity to fuel pseudoscience, calling for scientific alignment before widespread deployment.

06.
arXiv (quant-ph) 2026-06-24

Thermodynamics of quantum processes: An operational framework for free energy and reversible athermality

arXiv:2510.12790v4 Announce Type: replace Abstract: We explore the thermodynamics of quantum processes (quantum channels) by axiomatically introducing the free energy for channels, defined via the quantum relative entropy with an absolutely thermal channel whose fixed output is in equilibrium with a thermal reservoir. This definition finds strong support through its operational interpretations in designated quantum information and thermodynamic tasks. We construct a resource theory of athermality for quantum processes, where free operations are Gibbs-preserving superchannels and golden units are unitary channels with respect to an absolutely thermal channel having a fully degenerate output Hamiltonian. We exactly characterize the one-shot distillation and formation of quantum channels using hypothesis-testing and max-relative entropy with respect to the absolutely thermal channel. These rates converge asymptotically to the channel free energy (up to a multiplicative factor of half the inverse temperature), establishing its operational meaning and proving the asymptotic reversibility of the athermality. We show the direct relation between the resource theory of athermality and quantum information tasks such as private randomness and purity distillation, and thermodynamic tasks of erasure and work extraction. Our work connects the core thermodynamic concepts of free energy, energy, entropy, and maximal extractable work of quantum processes to their information processing capabilities.

07.
arXiv (CS.CL) 2026-06-11

Adaptive Multi-Resolution Procedural Knowledge Compression for Large Language Models

Large language models (LLMs) are widely used to tackle complex tasks with autonomous workflows. Recently, reusable natural language skills have emerged as a popular paradigm to inject procedural knowledge into LLM applications. Since popular skills are often invoked repeatedly, placing their full text in every context significantly increases prefill cost and latency. While text compression techniques have the potential to solve this problem, most existing methods are designed to compress factual knowledge in documents instead of procedural knowledge, making them insufficient for skill compression. In this paper, we argue that an effective skill compression method should: 1) preserve logical dependencies among workflows and tool protocols, 2) enable lightweight, offline compression for frequently updated community skills, and 3) be adaptable to varying complexities across skills. To address this, we present SKIM (SKIll coMpression), an adaptive multi-resolution soft token compression framework for procedural skills. Depending on the complexity of each skill, SKIM creates different numbers of soft tokens that not only improve the efficiency of LLM inference, but also preserve the effectiveness of skill usage. Experiments indicate that SKIM compresses skills to 30 to 60 percent of their original token length while preserving task performance better than existing compression methods. We have released our code at https://github.com/bebr2/SKIM.

08.
arXiv (CS.AI) 2026-06-12

Will AI Agents Free Us From Meaningless Work? A Human-Centered Analysis

arXiv:2606.12430v1 Announce Type: cross Abstract: Some claim that AI agents will free workers from the boring parts of their jobs, yet little is known about how workers themselves identify which tasks should be automated. Prior research focuses on occupations, overlooking that workers experience varying levels of meaning across tasks within the same role. We address this gap with a task-level analysis grounded in Graeber's theory of bullshit jobs. Using ratings from 202 workers on 171 workplace tasks, we (1) validate a five-item scale of perceived bullshitness, (2) show that perceived bullshitness strongly predicts desire for AI delegation, and (3) find that such tasks are also seen as requiring less human oversight. Together, these findings suggest that tasks perceived as bullshit are natural candidates for AI delegation, aligning worker preferences with perceived feasibility.

09.
arXiv (CS.CV) 2026-06-11

Q-Fold: Query-Aware Focus-Context Spatio-Temporal Folding for Long Video Understanding

Long-video understanding remains challenging for multimodal large language models, because temporally extended videos often contain thousands of frames and are therefore expensive to process exhaustively. Existing methods usually construct compact visual inputs from long videos under a limited visual budget. However, most of them still follow a frame-centric paradigm and apply similar representations to retained content regardless of its importance. This makes it difficult to preserve both high-fidelity visual evidence and broad temporal coverage. To address this issue, we propose Q-Fold, a training-free input construction framework for long-video understanding. Instead of treating isolated frames as the basic modeling unit, Q-Fold operates on contiguous temporal segments and constructs a heterogeneous Focus–Context representation under query guidance. Query-relevant segments are preserved as high-fidelity Focus Frames, while less relevant segments are folded into chronology-preserving contextual layouts. In this way, Q-Fold preserves critical visual evidence and broad temporal coverage, while better maintaining local temporal continuity within short segments. Experiments on four long-video benchmarks with multiple Video-MLLMs show that Q-Fold consistently improves performance without increasing the input budget. Notably, it achieves gains of up to 9.1 percentage points on an ultra-long video benchmark. Code will be made publicly available.

10.
arXiv (CS.CL) 2026-06-15

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

Multimodal large language models (MLLMs) have shown strong capabilities in perception, reasoning, and action generation. However, their ability to sustain exploration in dynamic open worlds remains unclear. Existing embodied and game-based benchmarks often compress interaction into short-horizon tasks or entangle success with domain-specific game mechanics. In this paper, we introduce the MineExplorer benchmark for evaluating open-world exploration capabilities of MLLM agents in Minecraft. We first filter out atomic tasks whose solutions rely heavily on Minecraft-specific knowledge to better reflect general open-world reasoning. Then we organize the benchmark around a ReAct-style capability formulation and compose atomic tasks into implicit multi-hop tasks. To further construct reliable instances, MineExplorer uses a multi-agent synthesis workflow that jointly designs task graphs, sandbox scenes, and rule-based milestone evaluators. Human evaluation shows that the multi-agent synthesis workflow produces significantly more reliable instances than a single-agent baseline. Experiments with advanced MLLM agents show that open-world exploration remains challenging, as strong models can handle many single-hop tasks but degrade sharply when hidden prerequisites must be coordinated over longer trajectories. Further analysis finds that task difficulty tracks agent completion, and larger models or thinking modes do not consistently translate into better performance. Code and dataset are available at https://github.com/Jometeorie/MineExplorer.

11.
arXiv (CS.CV) 2026-06-11

Semantically-Aware Diver Activity Recognition Framework for Effective Underwater Multi-Human-Robot Collaboration

Effective multi-human-robot collaboration is essential for expanding human-led operations in the challenging and high-risk underwater environment. For autonomous underwater vehicles (AUVs) to become true teammates, they must be able to comprehend their surroundings and recognize a diver's activities to offer assistance and ensure safety. Towards this goal, we introduce DAR-Net, a novel transformer-based framework that analyzes complex underwater scenes to classify diver activities. Our contribution lies in a semantically guided learning formulation that couples transformer-based temporal reasoning with pixel-level scene supervision. This multi-loss training strategy explicitly aligns global activity recognition with local human-robot interaction semantics, which is particularly critical in low-visibility underwater conditions. To address the significant challenge of data scarcity in this domain, we present the first-ever Underwater Diver Activity (UDA) dataset, a foundational resource containing over 2,600 annotated images with pixel-level masks. Through rigorous experimental evaluations in a controlled environment, we demonstrate that DAR-Net achieves promising accuracy in recognizing six distinct diver activities, outperforming state-of-the-art models. While this dataset provides a crucial baseline, our work serves as a pioneering step, laying the groundwork for future research and facilitating the development of more intelligent, collaborative underwater robotic systems.

12.
arXiv (CS.CV) 2026-06-24

ViTexQA: A Multi-Frame Temporal Perception Dataset for Video Text Question Answering

Despite remarkable progress in multimodal understanding, current MLLMs still exhibit limitations in video text understanding, particularly when semantics emerge through the integration of temporally distributed textual cues across multiple frames. This perception challenge fundamentally differs from static image text understanding, yet existing datasets fail to capture it: the vast majority of questions remain answerable from single frames, inadequately reflecting real-world video text comprehension demands. To address this, we present ViTexQA, a large-scale video-text QA dataset, and FrameThinker for robust multi-frame temporal reasoning. We build ViTexQA via a quality-controlled Chain-of-Thought (CoT) annotation pipeline boosted with temporal constraints; all its QA pairs demand cross-frame text fusion to solve, enforcing true temporal reliance. FrameThinker adopts two-stage training for explicit temporal modeling: CoT-Guided Supervised Fine-Tuning (SFT) generates frame-aware reasoning chains, followed by Temporally-grounded Reinforcement Learning (RL) optimized with multi-frame coherence rewards. Evaluations show our method outperforms SOTA baselines on ViTexQA, lifting ROUGE-L by 6.3%.

13.
arXiv (CS.AI) 2026-06-24

FlowR2A: Learning Reward-to-Action Distribution for Multimodal Driving Planning

arXiv:2606.24231v1 Announce Type: new Abstract: Multimodal driving planning faces a long-standing tension between two paradigms: scoring-based methods benefit from dense reward supervision but are confined to a fixed action vocabulary, while anchor-based methods generate proposals dynamically yet suffer from sparse supervision constrained to a single ground-truth trajectory. In this work, we propose FlowR2A, which resolves this tension by reframing simulation-based rewards from discriminative targets into generative conditions. By learning the reward-conditioned action distribution from dense trajectory-reward pairs with a flow-matching decoder, FlowR2A unifies the dense supervision of scoring-based methods with the proposal generation of anchor-based methods in a single generative model, forcing the model to internalize the correlation between an action and its outcomes in safety, progress, comfort, and rule compliance. To balance hard safety constraints against soft progress objectives, we introduce fine-grained per-timestep reward conditioning and reward noise augmentation. The generative formulation naturally supports controllable test-time sampling via reward guidance and anchored sampling, producing high-quality proposals. FlowR2A achieves state-of-the-art results on the NAVSIM v1 and v2 benchmarks, with multimodal proposals of substantially higher quality than prior methods.

14.
arXiv (CS.LG) 2026-06-17

Geometrical fairness in graph neural networks

arXiv:2606.17684v1 Announce Type: cross Abstract: Graph-based learning methods have become increasingly prominent due to their strong performance across diverse applications. Among these, recent frameworks grounded in diffusion processes provide a unifying perspective that extends traditional graph neural network formulations while addressing limitations of standard message-passing mechanisms. Despite these advances, concerns remain regarding the fairness of such models, as they may propagate or amplify biases present in the data. In this work, we introduce a fairness-aware adaptation of graph-based diffusion by modifying the underlying Laplacian operator. Our approach incorporates multiple complementary transformations, including subspace projections, spectral adjustments, and frequency-based filtering, to mitigate bias-related components. Leveraging the intrinsic smoothing properties of graph diffusion, we provide a principled analysis of the resulting behavior and establish theoretical insights into fairness properties. We evaluate the proposed framework on both synthetic and real-world datasets, demonstrating that it achieves competitive performance while improving fairness metrics with limited additional computational cost.

15.
arXiv (math.PR) 2026-06-16

Sharp One-Dimensional Sub-Gaussian Comparison in Convex Order

arXiv:2604.26819v2 Announce Type: replace Abstract: We prove that any random variable $X$ whose moment generating function is point-wise upper bounded by that of $ G \sim \mathcal{N}(0,1) $ must be dominated by $ G/\mathbb{E}[|G|] $ in convex order, meaning $ \mathbb{E}[f(X)] \le \mathbb{E}[f(G/\mathbb{E}[|G|])] $ for all convex $f$. This is sharp as witnessed by $ X \sim \mathrm{Unif}(\{-1,1\}) $ and $ f(x) = |x| $.
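The sharpness claim can be checked by hand. The short derivation below is a worked verification, not taken from the paper: it confirms that $X \sim \mathrm{Unif}(\{-1,1\})$ satisfies the moment generating function hypothesis (using $(2k)! \ge 2^k\,k!$), that $f(x) = |x|$ gives equality, and that no scaling constant $c$ smaller than $1/\mathbb{E}[|G|] = \sqrt{\pi/2}$ would work.

```latex
\begin{align*}
\mathbb{E}\,e^{tX} &= \cosh t = \sum_{k \ge 0} \frac{t^{2k}}{(2k)!}
  \le \sum_{k \ge 0} \frac{t^{2k}}{2^k\,k!} = e^{t^2/2} = \mathbb{E}\,e^{tG}, \\
\mathbb{E}|X| &= 1 = \frac{\mathbb{E}|G|}{\mathbb{E}|G|}
  = \mathbb{E}\left|\frac{G}{\mathbb{E}|G|}\right|,
  \qquad \mathbb{E}|G| = \sqrt{2/\pi}, \\
\mathbb{E}|cG| &= c\,\sqrt{2/\pi} < 1 = \mathbb{E}|X|
  \quad \text{for } 0 \le c < \sqrt{\pi/2}.
\end{align*}
```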

16.
arXiv (CS.AI) 2026-06-18

Quality Perceptions and Intended Engagement in Response to AI-Generated and AI-Assisted News

arXiv:2409.03500v4 Announce Type: replace-cross Abstract: The increasing use of artificial intelligence (AI) in news production raises important questions about how audiences perceive and respond to AI-generated journalism. This preregistered survey experiment (N = 599, German-speaking Switzerland) examines (i) perceptions of article quality (measured as credibility, readability, and expertise) across news excerpts that were human-written, AI-assisted, or fully AI-generated, and (ii) self-reported intentions to engage following disclosure of AI involvement. Participants rated two short news excerpts before learning how they had been produced. Articles across all conditions were evaluated similarly in perceived quality. After disclosure, participants in the AI-assisted and AI-generated conditions reported a higher willingness to continue reading their assigned articles compared to the control group, but future willingness to read AI-generated news did not differ across conditions. Overall, the findings suggest that readers assess AI-generated and human-written news comparably in quality, while disclosure of AI use can momentarily increase curiosity or interest without yet changing longer-term reading intentions.

18.
arXiv (quant-ph) 2026-06-16

Entangled states are typically incomparable

arXiv:2406.03335v2 Announce Type: replace Abstract: Consider a bipartite quantum system, where Alice and Bob jointly possess a pure state $|\psi\rangle$. Using local quantum operations on their respective subsystems, and unlimited classical communication, Alice and Bob may be able to transform $|\psi\rangle$ into another state $|\phi\rangle$. Famously, Nielsen's theorem [Phys. Rev. Lett., 1999] provides a necessary and sufficient algebraic criterion for such a transformation to be possible (namely, the local spectrum of $|\phi\rangle$ should majorise the local spectrum of $|\psi\rangle$). In the paper where Nielsen proved this theorem, he conjectured that in the limit of large dimensionality, for almost all pairs of states $|\psi\rangle, |\phi\rangle$ (according to the natural unitary invariant measure) such a transformation is not possible. That is to say, typical pairs of quantum states $|\psi\rangle, |\phi\rangle$ are entangled in fundamentally different ways, that cannot be converted to each other via local operations and classical communication. Via Nielsen's theorem, this conjecture can be equivalently stated as a conjecture about majorisation of spectra of random matrices from the so-called trace-normalised complex Wishart-Laguerre ensemble. Concretely, let $X$ and $Y$ be independent $n \times m$ random matrices whose entries are i.i.d. standard complex Gaussians; then Nielsen's conjecture says that the probability that the spectrum of $X X^\dagger / \operatorname{tr}(X X^\dagger)$ majorises the spectrum of $Y Y^\dagger / \operatorname{tr}(Y Y^\dagger)$ tends to zero as both $n$ and $m$ grow large. We prove this conjecture, and we also confirm some related predictions of Cunden, Facchi, Florio and Gramegna [J. Phys. A., 2020; Phys. Rev. A., 2021].
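The majorisation criterion quoted in this abstract is easy to test numerically. The following is a minimal sketch in plain Python; the function names `majorises` and `locc_convertible` and the example spectra are illustrative assumptions, not taken from the paper.

```python
def majorises(p, q, tol=1e-12):
    """Return True if spectrum p majorises spectrum q.

    Both inputs are lists of non-negative numbers with the same total
    (for example, squared Schmidt coefficients summing to 1).
    """
    size = max(len(p), len(q))
    # Sort in decreasing order and pad the shorter spectrum with zeros.
    p_sorted = sorted(p, reverse=True) + [0.0] * (size - len(p))
    q_sorted = sorted(q, reverse=True) + [0.0] * (size - len(q))
    partial_p = partial_q = 0.0
    for a, b in zip(p_sorted, q_sorted):
        partial_p += a
        partial_q += b
        # Every partial sum of p must dominate the matching partial sum of q.
        if partial_p < partial_q - tol:
            return False
    return True


def locc_convertible(psi_spectrum, phi_spectrum):
    """Nielsen's criterion: psi -> phi is possible iff phi's spectrum majorises psi's."""
    return majorises(phi_spectrum, psi_spectrum)


maximally_entangled = [0.5, 0.5]
weakly_entangled = [0.9, 0.1]
print(locc_convertible(maximally_entangled, weakly_entangled))
print(locc_convertible(weakly_entangled, maximally_entangled))

# A pair of three-level spectra for which neither direction is possible.
first, second = [0.5, 0.25, 0.25], [0.4, 0.4, 0.2]
print(locc_convertible(first, second), locc_convertible(second, first))
```

The last pair is incomparable: the largest entry of `first` exceeds that of `second`, but the sum of the two largest entries is smaller (0.75 against 0.8). The theorem in the abstract says that, in high dimension, this situation is typical for random states.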

19.
arXiv (CS.LG) 2026-06-19

Predicting gestational age at birth in the context of preterm birth from multi-modal fetal MRI

arXiv:2606.20172v1 Announce Type: new Abstract: Preterm birth is associated with significant mortality and a risk for lifelong morbidity. The complex multifactorial aetiology hampers accurate prediction and thus optimal care. A pipeline consisting of bespoke machine learning methods for data imputation, feature selection, and regression models to predict gestational age (GA) at birth was developed and evaluated from comprehensive multi-modal morphological and functional fetal MRI data from 333 control cases and 93 preterm birth cases. The GA at birth predictions were classified into term and preterm categories and their accuracy, sensitivity, and specificity were reported. An ablation study was performed to further validate the design of the pipeline. Performance was evaluated using stratified 10-fold cross-validation. The pipeline achieves an $R^2$ score of 0.13 and a mean absolute error of 2.74 weeks. It also achieves a 0.77 accuracy, 0.59 sensitivity, and 0.82 specificity across folds. The predominant features selected by the pipeline include cervical length and statistics derived from placental T2* values. The confluence of fast, motion-robust and multi-modal fetal MRI techniques and machine learning prediction allowed the prediction of the gestation at birth. This information is essential for any pregnancy. To the best of our knowledge, preterm birth has only been addressed as a classification problem in the literature. Therefore, this work provides a proof of concept. Future work will increase the cohort size to allow for finer stratification within the preterm birth cohort. Our code is available at https://github.com/dfajardorojas/ml-for-preterm-birth-.
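The step from regression output to term/preterm metrics can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes the usual clinical cut-off of 37 completed weeks for preterm birth (the abstract does not state the threshold), treats preterm as the positive class, and uses made-up toy values. The helper `pooled_accuracy` shows how the reported sensitivity and specificity combine with the cohort sizes; since the paper reports figures across folds, this pooled identity is only an approximate consistency check.

```python
PRETERM_THRESHOLD_WEEKS = 37.0  # assumed cut-off, not stated in the abstract


def mean_absolute_error(ga_true, ga_pred):
    """Mean absolute error in weeks between true and predicted GA at birth."""
    return sum(abs(t - p) for t, p in zip(ga_true, ga_pred)) / len(ga_true)


def classification_metrics(ga_true, ga_pred, threshold=PRETERM_THRESHOLD_WEEKS):
    """Threshold both GA series and return (accuracy, sensitivity, specificity)."""
    tp = fp = tn = fn = 0
    for t, p in zip(ga_true, ga_pred):
        truly_preterm, predicted_preterm = t < threshold, p < threshold
        if truly_preterm and predicted_preterm:
            tp += 1
        elif truly_preterm:
            fn += 1
        elif predicted_preterm:
            fp += 1
        else:
            tn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of preterm births detected
    specificity = tn / (tn + fp)  # fraction of term births recognised
    return accuracy, sensitivity, specificity


def pooled_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Accuracy implied by sensitivity and specificity for given class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Toy data (weeks), purely illustrative.
toy_true = [39.0, 40.0, 34.0, 30.0, 38.0, 36.0]
toy_pred = [38.5, 36.5, 35.0, 37.5, 39.0, 33.0]
print(mean_absolute_error(toy_true, toy_pred))
print(classification_metrics(toy_true, toy_pred))

# Consistency check with the abstract's figures: 93 preterm and 333 control cases.
print(round(pooled_accuracy(0.59, 0.82, 93, 333), 2))
```

A low $R^2$ alongside moderate accuracy is not contradictory: the classification only asks on which side of the threshold a prediction falls, so large errors far from the threshold do not change the label.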

20.
arXiv (quant-ph) 2026-06-15

Multi-entropy in random tensor networks

arXiv:2606.04470v2 Announce Type: replace-cross Abstract: We study the evaluation of Rényi multi-entropies $S^{(q)}_n$ in Random Tensor Network (RTN) states in the large bond-dimension limit. For the case of Rényi index $n=2$ and an arbitrary number of parties $q$, we prove that multi-entropies are determined by minimal multiway cuts through the network. When the minimal multiway cut is degenerate, we characterize the full minimizer set via compatible families of minimal cuts and give a criterion for all minimizers to come from ordinary cut partitions. For $n=2$, this gives a natural generalization of the minimal cut description of bipartite entanglement to multipartite systems with arbitrarily many parties. For the case of integer $n>2$, we show that the minimal multiway cut conjecture is in general not true by providing explicit counterexamples for both the single random tensor and the network built from isometric tilings. We discuss the implications of our results for the multipartite entanglement structures in RTN and holography.

21.
arXiv (CS.AI) 2026-06-24

JEDEL: Zero-Shot DNA-Encoded Library Design for Early-Stage Drug Discovery

arXiv:2606.23745v1 Announce Type: cross Abstract: We present JEDEL, a framework for generating synthesis-ready DNA-encoded libraries (DELs) directly from three-dimensional pharmacophore representations of active ligands. JEDEL is the first model to map pharmacophore interaction patterns to actionable, scalable synthesis instructions, enabling the design of targeted libraries comprising potentially millions of molecules. Unlike existing generative approaches that produce virtual compounds requiring downstream synthesis planning, JEDEL operates within the space of purchasable building blocks and validated reactions, ensuring that every output is experimentally realizable by construction. JEDEL learns a predictive alignment between pharmacophore geometry and molecular structure and decodes this into combinatorial synthesis routes at scale. Across 18 protein targets, it generates focused libraries that outperform random and diversity-based baselines in predicted binding affinity, pharmacophore recovery, and sample efficiency, without target-specific retraining. JEDEL enables a shift from virtual molecule generation to experimentally deployable library design.

22.
arXiv (CS.LG) 2026-06-11

Intermittent time series forecasting: local vs global models

arXiv:2601.14031v2 Announce Type: replace-cross Abstract: Forecasting intermittent time series, which contain zeros, is a crucial challenge in supply chains as inventory policies require probabilistic forecasts to establish safety levels. Intermittent time series are commonly forecast using local models, trained individually on each time series. In recent years, global models, trained on a large collection of time series, have become popular for time series forecasting. Global models are often based on neural networks or gradient boosted trees. We carry out the first study comparing state-of-the-art probabilistic local and global models on intermittent time series. For global models we consider three different distribution heads suitable for intermittent time series: negative binomial, hurdle-shifted negative binomial and Tweedie. To the best of our knowledge, this is the first use of the latter two with neural networks. We perform experiments on five datasets comprising overall more than 40,000 real-world time series. Among global models, TiDE, a simple neural network architecture, achieves the best accuracy; it also consistently outperforms local models and has lower computational requirements. Large global models are instead much more computationally demanding and less accurate. Among the distribution heads, the Tweedie provides the best estimates of the highest quantiles.

23.
arXiv (CS.CL) 2026-06-19

Before the Labels: How Dataset Construction Shapes Suicidality Detection in Clinical Text

Clinical NLP increasingly relies on electronic health record (EHR) data to detect suicidal behaviors, treating clinical documentation as more reliable ground truth than social media. We argue that this framing obscures how EHR-based suicidality datasets encode a particular operationalization of suicidality, shaped by who authors the data, how episodes are bounded, and how ambiguity is resolved. We ground this argument in a case study of the ScAN dataset, built over MIMIC-III clinical notes. We show how governance constraints, ICD-based cohort selection, single-annotator labeling, and hospital-stay-level aggregation produce labels that reflect clinician-documented judgments, treat suicidality as a bounded episode, and assume that intent can be reliably inferred from documentation. A linguistic analysis demonstrates that identical labels subsume heterogeneous clinical framings differing in temporality, negation, and uncertainty. We argue that clinical NLP should examine the assumptions embedded in suicidality datasets before interpreting their labels as ground truth.

24.
arXiv (math.PR) 2026-06-24

Autoregressive Processes on Riemannian Manifolds

arXiv:2606.24771v1 Announce Type: cross Abstract: This paper introduces a Riemannian autoregressive (R-AR) model of order one, generalising classical discrete-time stochastic processes to manifold-valued data. The model is based on two parameters, a parameter $\mu$ representing the intrinsic central tendency as the Fréchet mean and an autoregressive parameter $\phi$ controlling the stationarity and ergodic properties. Due to the inherent dependence structure of the R-AR process, the estimation procedure for these parameters necessitates new asymptotic results for dependent processes on manifolds. Thus, we establish a strong law of large numbers for the sample Fréchet mean set of ergodic Markov chains in proper metric spaces. By proving this general consistency result, we move beyond the limitations of classical i.i.d. theory to provide the mathematical foundation required for the strong consistency of our proposed estimators. The framework is validated through numerical simulations in the hyperbolic plane and an application to aerosol size distributions on the Fisher-Rao manifold, demonstrating how the proposed model can characterise mean-reverting dynamics in nonlinear geometries.

25.
arXiv (CS.CL) 2026-06-15

The Linguistics Olympiads: Towards a New Corpus for Linguistics Research?

Linguistics olympiad problems (LOPs) are a category of self-sufficient puzzles consisting of a scaled-down corpus representative of certain linguistic phenomena, from which the solver must deduce a primitive set of rules of the language and then translate a new set of elements. The linguistics olympiads (LOs) have become a worldwide phenomenon with 43 different territories taking part in the International Linguistics Olympiad (IOL) 2025. While the typology and solving strategies of LOPs have been analysed, their scientific facet and connections to academic linguistics have yet to be explored. LOPs are directly connected to many linguistic fields, e.g., linguistic typology, linguistic relativity, and linguistic fieldwork. Recently, LOPs have become a research focus as benchmarks for large language models, thus highlighting their usefulness in computational linguistics. Nevertheless, they have not yet been integrated into mainstream linguistics research. This paper attempts to open new directions for including this particular type of puzzle in academic research by offering a structured evaluation of LOPs as linguistic data sources and proposing criteria for their responsible use in academic research. Starting from a set of over 1800 LOPs, this study critically examines the potential of LOPs as a novel corpus for linguistics research by discussing their strengths and limitations as tools, as well as the areas of linguistics into which these problems could fit. This work forms the foundation for a broader initiative aimed at bridging the gap between LOs and academic linguistics, by establishing a robust theoretical framework for LOPs.