Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-15

Free Heavy-Tailed Lunch for Muon: A Theoretical Justification of Empirical Success

arXiv:2606.14560v1 Announce Type: cross Abstract: Non-Euclidean optimisation methods with matrix-valued updates, such as Muon and Scion, have recently shown strong empirical performance for training Transformer models, yet their theoretical advantages over Euclidean methods remain poorly understood. We address this gap in the heavy-tailed non-convex regime, where stochastic gradients have bounded $p$-th central moments, $p \in (1,2]$. We show that certain non-Euclidean methods achieve optimal sample complexity under stronger stationarity measures, while Euclidean methods incur additional dimension-dependent costs. As a consequence, for $m \times n$ matrices, Muon finds an $\varepsilon$-stationary point in nuclear norm within $\mathcal{O}\left(\min\{m, n\} \frac{\Delta_1 L}{\varepsilon^2} \left(\frac \sigma \varepsilon \right)^{\frac p {p-1}}\right)$ samples, absorbing heavy-tailed noise without extra dimension dependence, unlike Euclidean methods. We further prove this sample complexity, including its dimension dependence, is optimal for all first-order methods under nuclear-norm stationarity. Experiments on large language models support our theory. Surprisingly, our results suggest that other Schatten geometries beyond the spectral geometry of Muon can perform competitively in certain settings.

02.
arXiv (CS.CV) 2026-06-17

Fluently Lying: Adversarial Robustness Can Be Substrate-Dependent

The primary tools used to monitor and defend object detectors under adversarial attack assume that when accuracy degrades, detection count drops in tandem. This coupling was assumed, not measured. We report a counterexample observed on a single model: under standard PGD, EMS-YOLO, a spiking neural network (SNN) object detector, retains more than 70% of its detections while mAP collapses from 0.528 to 0.042. We term this count-preserving accuracy collapse Quality Corruption (QC), to distinguish it from the suppression that dominates untargeted evaluation. Across four SNN architectures and two threat models (l-infinity and l-2), QC appears only in one of the four detectors tested (EMS-YOLO). On this model, all five standard defense components fail to detect or mitigate QC, suggesting the defense ecosystem may rely on a shared assumption calibrated on a single substrate. These results provide, to our knowledge, the first evidence that adversarial failure modes can be substrate-dependent.

03.
medRxiv (Medicine) 2026-06-11

Dissecting the functional landscape of rare diseases through genomic variation in a heterogeneous cohort of 11,000 patients

Rare diseases (RDs) remain a major diagnostic challenge. Genetic and phenotypic heterogeneity, incomplete knowledge of disease mechanisms, and limitations in variant clinical interpretation leave many patients without a molecular diagnosis. Meanwhile, the growing volume of genomic data generated in clinical practice offers an opportunity to develop data-driven methodologies for exploring disease mechanisms and improving the reanalysis of unsolved cases. We aggregated real-world genomic data from 11,084 unrelated patients with suspected RD. Patients were clinically classified into 122 diseases. We built a multi-disease genomic variant frequency database (FJD-DB), which enabled the development of variant and gene-disease association scores by means of case-control subcohort comparisons across 32 disease groups. Functional enrichment analyses were then used to highlight disease-associated protein domains, pathways, biological processes, and phenotypes. Finally, the resulting knowledge was integrated into a data-driven framework for the guided reanalysis of unsolved RD patients applied to Inherited Retinal Dystrophies (IRD) patients as first use case. FJD-DB contained more than 45 million unique variants, including ~185,000 potentially pathogenic variants. Disease-specific analyses identified disease-associated pathogenic variants and highlighted both established and candidate disease genes. We detected 179 significantly enriched protein domains across 23 diseases, 124 Human Phenotype Ontology terms across 13 diseases, 79 Reactome pathways across 10 diseases, and 72 Gene Ontology biological processes across 8 diseases, revealing highly disease-specific functional signatures. Integration of disease-specific variant, gene, and functional association signals enabled the development of a data-driven framework for guided reanalysis of unsolved RD cases. Applied to more than 1,100 unsolved IRD cases, the framework generated clinically relevant findings in 26 patients, including four molecular diagnoses, seven candidate diagnoses, and 15 cases upgraded from non-informative findings to variants of uncertain significance. Aggregated real-world genomic data can be leveraged to identify disease-associated molecular signals generating novel biological hypotheses. A unified analytical framework provides a scalable strategy for knowledge discovery and guided reanalysis, facilitating the identification of overlooked and potentially novel genetic causes of RDs.

04.
arXiv (CS.CV) 2026-06-15

Generation of Maximal Snake Polyominoes Using a Deep Neural Network

Maximal snake polyominoes are difficult to study numerically in large rectangles, as computing them requires the complete enumeration of all snakes for a specific rectangle size, which corresponds to a brute force algorithm. This hinders the study of maximal snakes in larger rectangles. Moreover, most enumerable snakes lie in small rectangles, obscuring large-scale patterns. In this paper, we investigate the contribution of a deep neural network to the generation of maximal snake polyominoes from a data-driven training, where the maximality and adjacency constraints are not encoded explicitly, but learned. To this extent, we experiment with a denoising diffusion model, which we referred as Structured Pixel Space Diffusion (SPS Diffusion). We find that SPS Diffusion generalizes from small rectangles to larger ones, generating valid snakes up to 28x28 squares and producing maximal snake candidates on squares close to the current computational limit. The model is, however, prone to errors such as branching, cycles, or multiple snake components. Overall, the diffusion model is promising and suggests that complex combinatorial objects can be understood by deep neural networks, which is useful in their investigation.

05.
arXiv (CS.CV) 2026-06-17

HRDX: A Large-Scale Vector HD-Map Dataset

Reliable autonomous driving requires vectorized HD maps that are geometrically accurate, semantically rich, and scalable to long-horizon driving. However, existing public HD map datasets are limited in scale, provide sparse semantic attributes, and lack modalities such as aerial imagery that could enable new research directions. We present HRDX, a large-scale dataset for vector HD-map construction, spanning about 40 hours (1,400 km) of minimally overlapping drives, which is several times larger than prior public HD map datasets. Data is captured using six synchronized surround cameras, a 128-beam LiDAR, and centimeter-level RTK GNSS/IMU, and is further complemented by precisely aligned aerial orthoimagery. Annotations cover 10 vector map classes, complemented with over 20 semantic and topological attributes. To evaluate this richer ontology, we introduce the Composite Score (CS) to jointly assess geometric fidelity and attribute correctness. Benchmark experiments show that HRDX's scale improves online vector-map construction, and that aligned aerial imagery provides a useful structural prior: using aerial imagery at training and/or inference improves geometric map quality, while aerial-augmented teachers can transfer part of this benefit to camera-only students without increasing inference-time sensor requirements. HRDX is intended to support reproducible research on large-scale HD-map learning, multimodal BEV fusion, and training-time privileged information. HRDX dataset and benchmarks are available at https://github.com/honda-research-institute/HRDX

06.
arXiv (CS.LG) 2026-06-12

Clustering Node Attributed Networks with Graph Neural Networks and Self Learning

arXiv:2606.13444v1 Announce Type: new Abstract: Graph clustering - partitioning the node set of a graph into disjoint subsets that reflect some latent information - is a fundamental problem as it finds applications in a myriad of different scenarios. While this classic problem has been tackled for decades by different communities, a recent variation of the problem driven by real data considers the scenario where nodes have attributes that are also informative. This has triggered novel methods that simultaneously leverage network information (edges) and node information (attributed) in the design of novel clustering algorithms. This work proposes a novel framework that builds on prior works that have applied graph neural networks (GNN) to graph clustering. The proposed framework operates in rounds of self learning in a fully unsupervised setting. In each round, a GNN generates representations for nodes that are used to cluster the nodes. This clustering influences the graph used to generate the node representation in the next round. Moreover, a context graph built in each round using the original graph is used to generate the node representations. Empirical results show that the proposed methodology extracts information from both network edges and node attributes in synthetic data, outperforming algorithms focused solely on the network or attributes when neither are very informative. Multiple rounds of learning also improve the performance and always outperforms a long single round of training (i.e., classic GNN graph clustering). When considering real datasets, empirical results indicate that the proposed methodology is competitive to state-of-the-art methods when cluster sizes are balanced.

07.
arXiv (CS.LG) 2026-06-19

CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training

arXiv:2510.18784v3 Announce Type: replace Abstract: Despite significant work on low-bit quantization-aware training (QAT), there is still an accuracy gap between such techniques and native training. To address this, we introduce CAGE (Curvature-Aware Gradient Estimation), a new QAT method that augments the straight-through estimator (STE) gradient with a curvature-aware correction designed to counteract the loss increase induced by quantization. CAGE is derived from a multi-objective view of QAT that balances loss minimization with the quantization constraints, yielding a principled correction term that depends on local curvature information. On the theoretical side, we introduce the notion of Pareto-optimal solutions for quantized optimization, and establish that CAGE yields strong convergence guarantees in the smooth non-convex setting. In terms of implementation, our approach is optimizer-agnostic, but we provide a highly-efficient implementation that leverages Adam statistics. CAGE significantly improves upon the prior state-of-the-art methods in terms of accuracy, for similar computational cost: for QAT fine-tuning, it halves the compression accuracy loss relative to the prior best method, while for QAT pre-training of Llama models, its accuracy for 3-bit weights-and-activations (W3A3) matches the accuracy achieved at 4-bits (W4A4) with the prior best method. The official implementation can be found over https://github.com/IST-DASLab/CAGE .

08.
arXiv (CS.AI) 2026-06-16

On-Policy Distillation with Curriculum Turn-level Guidance for Multi-turn Agents

arXiv:2606.15912v1 Announce Type: cross Abstract: Multi-turn agents that plan, invoke tools, and interact with environments offer a promising paradigm for solving complex tasks, yet their capabilities typically rely on very large models whose inference cost is prohibitive in practice.On-Policy Distillation (OPD) is a natural recipe for transferring such capabilities to smaller students, but we find that it suffers a characteristic failure mode in this setting: small student errors compound across turns and push the trajectory out of the teacher's familiar state distribution, so the teacher's supervision becomes least reliable precisely where the student needs it most.We propose Guided On-Policy Distillation (Guided-OPD), a simple yet effective algorithm that mixes teacher- and student-generated turns within each rollout and schedules the teacher's intervention probability along a curriculum that decays to zero.Strong guidance keeps early trajectories close to the teacher distribution and is then gradually withdrawn to recover the purely on-policy regime used at inference.On ALFWorld, ScienceWorld, and WebShop, distilling Qwen3 students from a Qwen3-30B-A3B teacher, Guided-OPD improves Score by 21.1\% and Success Rate by 25.5\% over vanilla OPD on average, with larger gains on smaller students.

09.
arXiv (CS.LG) 2026-06-11

Conformal Bayes under Label Shift: Post-Hoc Calibration vs. In-Training Adaptation

arXiv:2606.11865v1 Announce Type: cross Abstract: Conformal Bayes combines Bayesian posterior predictives with conformal calibration to produce prediction sets that are both statistically valid and geometrically efficient. We study conformal Bayes under label shift from a unified perspective, identifying two complementary approaches that restore nominal target-domain coverage through importance-weighted conformal calibration but operate through independent mechanisms. Post-hoc calibration tilts the posterior predictive toward the target domain and corrects the conformal threshold via an importance-weighted quantile, leaving the parameter posterior unchanged. In-training adaptation tilts the parameter posterior itself to the target domain, producing a corrected predictive whose highest predictive density region serves as the highest predictive density (HPD) based prediction set under the fitted target predictive; efficiency is model-dependent and does not imply finite-sample conditional optimality. Two controlled experiments show that in an unbiased training regime both strategies achieve valid coverage equally, while in a lead-optimization regime in-training adaptation acts as a debiasing operator, reducing interval width at unchanged coverage.

10.
arXiv (math.PR) 2026-06-15

Boltzmann-Like Occupation of Nonequilibrium Steady States on Dense Networks

arXiv:2606.14542v1 Announce Type: cross Abstract: A central problem in statistical physics is to extend the Boltzmann distribution to nonequilibrium steady states (NESS). We prove that NESS on large dense networks have Boltzmann-like occupation despite extensive entropy production. We further show that the active-matter heuristic of "low rattling" is asymptotically exact. Intuitively, these NESS spend a greater fraction of their time in states they leave more slowly. This explanation extends to the broader class of "equiaccessible" steady states, which play a role in our analysis akin to that of equilibrium in linear response.

11.
arXiv (CS.CV) 2026-06-18

SpectralDiT: Timestep-Conditioned Spectral Residual Correction for Flow-Matching DiTs

作者:

We propose SpectralDiT, a lightweight modification to flow-matching Diffusion Transformers that adds timestep-conditioned spectral correction to the MLP residual branch. The module decomposes each residual update into low- and high-frequency components on the patch-token grid, then learns a zero-initialized additive gate so the model initially matches the baseline DiT. On CIFAR-10 pixel-space generation, SpectralDiT improves FID from 20.78 to 19.71 at patch size 1 and reduces the radial Fourier spectrum gap. Furthermore, we scale our method to latent diffusion on ImageNet-100. With 0.6% additional theoretical FLOPs and 1.36% additional parameters, SpectralDiT improves latent flow-matching, achieving an 8.7% relative FID reduction under classifier-free guidance (CFG 2.0). All reported results are averaged over five seeds. Ablations and gate visualizations on CIFAR-10 reveal stable block-specific spectral correction patterns.

12.
arXiv (CS.LG) 2026-06-16

On the Role of Computation in Reinforcement Learning

arXiv:2602.05999v3 Announce Type: replace Abstract: How does the amount of compute available to a reinforcement learning (RL) policy affect its learning? Can policies using a fixed amount of parameters, still benefit from additional compute? The standard RL framework does not provide a language to answer these questions formally. Empirically, deep RL policies are often parameterized as neural networks with static architectures, conflating the amount of compute and the number of parameters. In this paper, we formalize compute bounded policies and prove that policies which use more compute can solve problems and generalize to longer-horizon tasks that are outside the scope of policies with less compute. Building on prior work in algorithmic learning and model-free planning, we propose a minimal architecture that can use a variable amount of compute. Our experiments complement our theory. On a set 31 different tasks spanning online and offline RL, we show that $(1)$ this architecture achieves stronger performance simply by using more compute, and $(2)$ stronger generalization on longer-horizon test tasks compared to standard feedforward networks or deep residual network using up to 5 times more parameters.

13.
arXiv (math.PR) 2026-06-16

Malliavin Calculus for the stochastic Cahn-Hilliard equation driven by fractional noise

arXiv:2601.10490v2 Announce Type: replace Abstract: The stochastic partial differential equation analyzed in this work is the Cahn-Hilliard equation perturbed by an additive fractional white noise (fractional in time and white in space). We work in the case of one spatial dimension and apply Malliavin calculus to investigate the existence of a density for the stochastic solution $u$. In particular, we show that $u$ admits continuous paths almost surely and construct a localizing sequence through which we prove that its Malliavin derivative exists locally, and that its law is absolutely continuous with respect to the Lebesgue measure on $\bf R$, establishing thus that a density exists. A key contribution of this work is the analysis of the stochastic integral appearing in the mild formulation: we derive sharp estimates for the expectation of the $p$-th power ($p \geq 2$) of the $L^{\infty}(D)$-norm of this stochastic integral as well as for the integral involving the $L^{\infty}(D)$-norm of the operator associated with the kernel appearing in the integral representation of the fractional noise, all of which are essential for this study.

14.
arXiv (quant-ph) 2026-06-15

No classical particle limit for massless quanta

arXiv:2606.14632v1 Announce Type: new Abstract: We investigate whether relativistic massless classical particles may emerge as the classical limit of massless quanta. To address this question independently of any specific dynamics, environment, or pointer basis, we develop an axiomatic and purely kinematical framework for the coarse-graining approach. In this formulation, a candidate classical phase space is taken as the outcome space of a POVM subject only to minimal classicality and covariance under the relevant spacetime symmetry group. Applying this framework to the Poincaré group, we prove a no-go theorem for massless particles: the covariance requirement is incompatible with the operational conditions for classicality. The theorem leaves open field-like limits of massless quanta, for example the emergence of electromagnetic or gravitational fields, while ruling out classical massless particles, such as classical photons or gravitons.

15.
arXiv (CS.LG) 2026-06-16

SPICE: Synergy and Partial Information Based Curriculum Evolution

arXiv:2606.16639v1 Announce Type: new Abstract: Multimodal learning exploits complementary information across heterogeneous modalities. The informativeness of each modality can vary widely across samples and training stages. Existing multimodal curriculum learning strategies often assume that the relative complexity of samples remains unchanged throughout training and therefore cannot adapt to model evolution. We propose SPICE (Synergy and Partial Information based Curriculum Evolution), a novel progressive curriculum framework for multimodal interaction learning. Guided by Partial Information Decomposition (PID) theory, our approach decomposes multimodal interactions into redundant, unique, and synergistic information components, enabling an interpretable and dynamic characterization of sample complexity. Building on this decomposition, we design a progressive curriculum that evolves throughout training, allowing the model to transition from learning shared cross-modal cues to modality-specific patterns and, finally, to complex synergistic interactions. Adapting to model evolution, sample ordering is refined in real-time using PID information estimates derived from unimodal and multimodal predictions. Experiments across multiple multimodal benchmarks demonstrate consistent improvements over conventional training and state-of-the-art baselines, highlighting the effectiveness of PID information decomposition and adaptive sample ordering for multimodal curriculum learning.

16.
arXiv (math.PR) 2026-06-17

Limit theorems for descents and inversions of shelf-shuffles

arXiv:2510.00343v2 Announce Type: replace Abstract: We prove central limit theorems for the number of descents and inversions of permutations produced by shelf-shuffles. These are a model for casino card shuffling machines. We show the asymptotic normality of the number of descents in two limiting regimes depending on the ratio of cards to shelves. On the other hand, we study the inversions by employing a modification of the techniques from Islak's analysis of the statistics of riffle shuffles. In particular, we obtain a bound for the rate of convergence for inversions that is independent of the number of shelves.

17.
arXiv (CS.LG) 2026-06-19

Matching Markets meet Cumulative Prospect Theory: Towards Optimal and Adversarially Robust Learning

arXiv:2606.19883v1 Announce Type: new Abstract: We study a multi-agent multi-armed bandit problem in the competitive setup with two-sided matching markets under a human centric decision making model. To capture human preferences, we use cumulative prospect theory (CPT) that weighs the actions of the agent in a nonlinear fashion using a ($\alpha$-Hölder continuous) weight function. CPT has been widely used in behavioral economics and risk sensitive machine learning to emulate human preferences. We analyze the state-of-the-art learning algorithm with CPT weight distorted rewards and obtain a player optimal regret of $\mathcal{O}(K\log T \left(\frac{1}{\Delta}\right)^{2/\alpha})$, where $K$ denotes the number of arms, $T$ is the learning horizon, and $\Delta$ represents (suitably defined) players' minimum preference gap. Noticing the dependence on $\Delta$ to be sub-optimal, we further improve this regret by judiciously selecting the active set of arms during exploration, which removes the dependence on $K$ in the dominant term and achieves an improved (optimal) regret guarantees in the setting where the number of arms $K$ is significantly larger than the number of players $N$. In addition, we consider adversarial markets where the observed rewards of the agents may be corrupted. We propose and analyze algorithms for robust markets with CPT as risk sensitive measure in both settings where the total corruption budget is known and where it is unknown, and establish logarithmic player-optimal regret guarantees in both cases.

18.
arXiv (CS.CL) 2026-06-11

AI Coding Agents Can Reproduce Social Science Findings

Recent anecdotal evidence suggests that AI coding agents can reproduce published findings when provided with original data and code; yet systematic evaluation across social sciences remains limited. Existing evaluation benchmarks are insufficient, either small or conflate agent performance with problems in the reproduction materials themselves, such as code that fails to execute correctly. Here we introduce SocSci-Repro-Bench, a benchmark of 221 tasks spanning four disciplines and 13 substantive domains, constructed from studies whose results are either fully reproducible with available materials or demonstrably non-reproducible due to missing data, allowing us to isolate agents' reproduction capacity. Evaluating two frontier coding agents, Claude Code and Codex, we find that both can reproduce a large share of social science findings, with Claude Code substantially outperforming Codex. These reproduction rates considerably exceed those previously reported for general-purpose LLM-based agents on comparable reproducibility benchmarks. Both agents also perform strongly on a reasoning task requiring identification of underlying research questions, and additional analyses suggest that results are not primarily driven by memorization. Providing the original paper PDF alongside replication materials modestly improves performance but introduces bias on tasks where reproduction is impossible. We also show that agents can be nudged toward confirmatory specification search through subtle prompt framing. Together, these findings suggest that at least some frontier coding agents can serve as reliable executors of computational workflows while underscoring the need for careful benchmarking and prompt design as AI systems assume larger roles in scientific production.

19.
arXiv (quant-ph) 2026-06-16

Non-perturbative CPMG scaling and qutrit-driven breakdown under compiled superconducting-qubit control: a single-qubit study

作者:

arXiv:2603.29525v3 Announce Type: replace Abstract: Decoherence in superconducting qubits arises from both multilevel dynamics and structured environmental noise, yet perturbative models cannot capture all resulting signatures. Here, EmuPlat couples instruction-set-architecture-level waveform generation to the hierarchical equations of motion HEOM under $1/f$ non-Markovian pure dephasing. In the resulting non-perturbative regime – where filter-function predictions become quantitatively uninformative – CPMG scaling of a three-level superconducting transmon yields one calibration result, two physical findings, and one structural null. Y-CPMG exhibits axis-dependent scaling-law breakdown – non-monotonic decoherence, partial coherence revival, and pronounced X–Y population asymmetry ($0.204$ vs ${

20.
arXiv (CS.CV) 2026-06-16

Variable-Rate Deep Image Compression based on Low-Rank Adaptation by Progressive Learning

In the digital age, image compression is crucial for numerous applications, including web media, streaming services, high-resolution medical imaging, and connected vehicle networks, enabling efficient data storage and transmission. With the increasing demand for high-quality image communication, the need for advanced compression techniques becomes increasingly critical. Numerous Deep Image Compression (DIC) techniques have recently been introduced, showing impressive performance compared to traditional standards. However, variable-rate image compression remains an unresolved issue. Specific DIC methods deploy multiple networks to attain different compression rates, whereas others use a single model, which often results in higher computational complexity and reduced performance. This work proposes a progressive learning approach for variable-rate image compression based on the parameter-efficient fine-tuning method, the Low-Rank Adaptation (LoRA). We introduce an additional LoRA Rate-Adaptive Module (LoRAM) in DIC methods. Due to the re-parameterized merging of LoRA, our proposed method does not introduce additional computational complexity during inference. Compared to methods utilizing multiple models, comprehensive experiments demonstrate that our approach achieves competitive performance, saving 99\% in parameter storage, 90% in datasets, and 97% in training steps.

21.
bioRxiv (Bioinfo) 2026-06-11

Tumour evolution as ground truth for cancer whole-genome sequencing

Cancer genomes are shaped by evolutionary processes that couple mutagenesis, clonal selection, chromosomal instability, spatial growth and treatment response into structured genomic patterns, yet current benchmarking strategies largely ignore this evolutionary dependency. Here, we present SCOUT, a large-scale synthetic whole-genome sequencing resource of over 200 samples, designed for systematic benchmarking of tumour genomic analysis and evolutionary inference under controlled evolutionary ground truth. Unlike conventional task-specific simulations, SCOUT models tumour evolution as a latent generative process that simultaneously shapes mutations, copy-number alterations, variant allele frequencies, mutational signatures and clonal architectures. SCOUT recapitulates key features of solid and haematological malignancies, including driver mutations, chromosomal instability, intratumour heterogeneity, spatial sampling and treatment-associated evolutionary dynamics in tumour and matched-normal longitudinal and multi-region sequencing designs. Using SCOUT, we benchmarked widely used methods for somatic variant detection, copy-number analysis, mutational signature inference and tumour evolutionary reconstruction. Across analytical tasks, performance deteriorated in low-purity, highly subclonal and structurally complex tumours, while spatial sampling bias and hypermutation generated spurious evolutionary signals that confounded tumour interpretation across multiple inference layers. Evolutionary simulations further distinguished lineage-restricted genetic bottlenecks from multi-lineage resistance dynamics associated with tumour plasticity. Tumour purity consistently exerted a stronger effect on inference accuracy than sequencing depth. Together, our results establish evolutionary ground truth as a prerequisite for reproducible benchmarking and biologically interpretable analysis of cancer whole-genome sequencing data.

22.
PLOS Medicine 2026-06-16

The data transparency crisis in research: Lessons from systematic reviews and meta-analyses

by Saul Martin-Rodriguez, Rodrigo Fernandez-Gonzalo, David Moher Summary points Systematic reviews and meta-analyses underpin clinical guidelines and health policy, yet their validity may be compromised by limited access to underlying datasets and associated analytical code. Reliance on incomplete or inconsistently reported summary statistics forces researchers to use imputation and unverifiable assumptions, which can distort effect estimates and mislead clinical decision-making. The consequences extend beyond methodology: flawed evidence synthesis can influence treatment recommendations, healthcare spending, and patient safety, as illustrated by historical cases such as hormone replacement therapy. Despite widespread data-sharing policies, compliance remains low, enforcement weak, and monitoring almost non-existent, with many datasets remaining unavailable or inaccessible. This Policy Forum argues for strengthening enforceable data-sharing mechanisms, including clearer enforcement and pragmatic verification approaches within editorial workflows.

23.
arXiv (CS.AI) 2026-06-17

Probing, Fusion, and Trustworthiness: A Systematic Evaluation of Foundation Model Representations for Multimodal Cancer Analysis

arXiv:2606.17115v1 Announce Type: cross Abstract: Foundation models (FMs) have emerged as powerful representation extractors for medical data, yet their generalizability to datasets under distribution shift remains underexplored. This work systematically evaluates FM-based representations on a suite of computational pathology tasks across two real-world commercial cohorts, IH-BC and IH-NSCLC, drawn from the licensed in-house (IH) oncology dataset. The analysis focuses on two modalities, whole-slide images and transcriptomic profiles, drawn from the IH multimodal data. We first benchmark unimodal probing performance across five FMs on eight downstream classification tasks, and find that image and omics representations carry complementary predictive signals. Then we investigate whether multimodal fusion can yield additional gains over unimodal baselines by comparing three image-omics fusion strategies built on paired representations. The trustworthiness of selected unimodal and multimodal pipelines is further assessed through conformal prediction. Our results show that FM representations achieve competitive performance on out-of-distribution data and that multimodal fusion helps mainly when no single modality dominates the signal. Conformal prediction reveals that in the majority of cases where a point prediction fails, the true diagnosis remains recoverable within the prediction set, reinforcing the value of uncertainty-aware inference for clinical support.

24.
medRxiv (Medicine) 2026-06-18

Early-life Urban Environment, Nutrition, and Pubertal Timing in Southern Europe: An Exposome Analysis

Background: Urban environmental and lifestyle factors during early life may influence pubertal timing, but the combined effects of multiple environmental exposures within an exposome analytical framework remain poorly understood. Objective: To examine the association between early-life urban environmental exposures and pubertal timing, and to explore whether these exposures interact with early-life nutritional factors, namely breastfeeding duration and childhood diet quality. Methods: Data from two European population-based birth cohorts were analysed: Generation XXI (G21, Portugal; n=5263; 51.5% girls) and INfancia y Medio Ambiente (INMA, Spain; n=1019; 50.1% girls). Urban environmental exposures including indicators of air pollution, traffic, built environment, and natural spaces were estimated at 4 early-life stages at both cohorts: pregnancy (INMA only), birth, 1 year, and 4-5 years of age. Pubertal development timing was assessed using Tanner staging and/or the Pubertal Development Scale (PDS), and age at menarche was self-reported. Exposome-Wide Association Study (ExWAS) models and unsupervised clustering followed by ordinal logistic regression models were used to examine single- and multi-exposure associations, respectively. Regression models were fitted adjusting for relevant child characteristics, maternal factors, and household socioeconomic conditions, and corrected for multiple testing. Results: Individuals living in more unfavourable urban environments characterised by higher building density, air pollution, and lower access to natural spaces showed earlier pubertal timing according to multiple outcomes, across multiple early-life exposure periods, and in both cohorts. In the G21 cohort, these environmental profiles were associated with earlier age at menarche, particularly for exposures at 1-1.5 and 4-5 years (e.g., 1-1.5y: {beta}=-0.172, FDR-adjusted p-value=0.041), while in the INMA cohort, boys exposed to more unfavourable environmental profiles showed more advanced pubertal development, also particularly for exposures at 1-1.5 and 4-5 years of age (e.g., 1-1.5y; {beta}=0.572, FDR-adjusted p-value=0.008). Among environmental domains, air pollution and traffic were the factors most consistently associated with pubertal timing. Regarding early-life nutritional factors, longer duration of exclusive breastfeeding was associated with a lower Tanner stage among girls in G21. No significant interactions between breastfeeding duration and environmental exposure clusters were observed. Conclusion: Early-life urban environmental exposures, particularly air pollution and traffic, may influence pubertal timing. Exclusive breastfeeding may have a protective role against earlier pubertal development. These findings highlight the importance of improving urban environmental conditions and promoting breastfeeding to support healthy developmental trajectories.

25.
arXiv (CS.CV) 2026-06-15

Efficient Online 3D Multi-Camera Multi-Object Tracking and Pose Estimation

This paper proposes a fast and online method for jointly performing 3D multi-object tracking and pose estimation using multiple monocular cameras. Our algorithm requires only 2D bounding box and pose detections, eliminating the need for costly 3D training data or computationally expensive deep learning models. Our solution is an efficient implementation of a Bayes-optimal multi-object tracking filter, enhancing computational efficiency while maintaining accuracy. We demonstrate that our algorithm is significantly faster than state-of-the-art methods without compromising accuracy, using only publicly available pre-trained 2D detection models. We also illustrate the robust performance of our algorithm in scenarios where multiple cameras are intermittently disconnected or reconnected during operation.