Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-11

Market Design for AI: Beyond the Copyright Binary

arXiv:2606.12260v1 Announce Type: cross Abstract: How can we design a market of human-generated content for use in training AI models that both enables technological progress and preserves individual incentives for high-quality content creation? Existing approaches take polar positions: a "free-for-all" model based on fair use and a "strong intellectual property rights" model. We show that both fail: Free-for-all does not compensate creators, and – by modeling as a static Stackelberg game – strong intellectual property rights also underpower creative incentives. We find this especially true for more innovative creators, a phenomenon we term the "originality penalty." Extending this insight to a dynamic model, we find another market failure undermining AI model performance, even for an initially good model: Such a model induces greater reliance by humans on AI-assisted creation, resulting in homogenized content feeding back into training, which degrades the model performance – a "curse of precision." We further propose a market design with a data intermediary internalizing cross-creator externalities and subsidizing innovative contributions, thereby restoring efficiency.

02.
medRxiv (Medicine) 2026-06-12

Does the method matter? Evaluating the effectiveness, efficiency and ease of hearing-aid gain self-adjustment

In conventional hearing-aid personalisation, clinicians cannot hear what their patients hear, and patients cannot often reliably detect or describe what they hear. Self-adjustment avoids this issue but requires user controls that adjust hearing-aid signal processing parameters to be effective, efficient and easy. In this study, we explored (a) the roles of interface complexity and stimulus type in the self-adjustment of hearing-aid gain, and (b) how well individuals can adjust one sound to match another to assess the same interfaces and stimuli. Adult hearing-aid users with mild to moderate symmetrical sensorineural hearing loss repeatedly adjusted the gain (a) to their preference from individual prescription (n = 41) and (b) to match their previous preferences from a random starting point (n = 32) using three interfaces representing different bass/mid/treble configurations and three stimuli (music, speech and speech-in-noise). The large interindividual variability in self-adjusted gains clustered into three patterns of deviation from initial prescription: increased relative bass, overall gain reduction, and close to initial prescription. There were no substantial effects of interface nor stimulus on self-adjustment reliability (median {sigma} = 2.8 dB), whereas absolute sound-matching error increased with increasing interface complexity and centre frequency. Neither individual matching accuracy nor questionnaire responses predicted either self-adjusted gains or reliability. Overall, these results show that many - but not all - hearing-aid users can adjust gains with reasonable reliability, and while it can be difficult to predict the behaviour from the individual, the individual applies a similar self-adjustment behaviour across different interfaces and stimuli.

03.
arXiv (CS.LG) 2026-06-18

ActiTect: A Generalizable Machine Learning Pipeline for REM Sleep Behavior Disorder Screening through Standardized Actigraphy

arXiv:2511.05221v3 Announce Type: replace Abstract: Isolated rapid eye movement sleep behavior disorder (iRBD) is a major prodromal marker of $\alpha$-synucleinopathies, often preceding the clinical onset of Parkinson's disease, dementia with Lewy bodies, or multiple system atrophy. While wrist-worn actimeters hold significant potential for detecting RBD in large-scale screening efforts by capturing abnormal nocturnal movements, they become inoperable without a reliable and efficient analysis pipeline. This study presents ActiTect, a fully automated, open-source machine learning tool to identify RBD from actigraphy recordings. To ensure generalizability across heterogeneous acquisition settings, our pipeline includes robust preprocessing and automated sleep-wake detection to harmonize multi-device data and extract physiologically interpretable motion features characterizing activity patterns. Model development was conducted on a cohort of 78 individuals, yielding strong discrimination under nested cross-validation (AUROC = 0.95). Generalization was confirmed on a blinded local test set (n = 31, AUROC = 0.86) and on two independent external cohorts (n = 113, AUROC = 0.84; n = 57, AUROC = 0.94). To assess real-world robustness, leave-one-dataset-out cross-validation across the internal and external cohorts demonstrated consistent performance (AUROC range = 0.84-0.89). A complementary stability analysis showed that key predictive features remained reproducible across datasets, supporting the final pooled multi-center model as a robust pre-trained resource for broader deployment. By being open-source and easy to use, our tool promotes widespread adoption and facilitates independent validation and collaborative improvements, thereby advancing the field toward a unified and generalizable RBD detection model using wearable devices.

04.
medRxiv (Medicine) 2026-06-15

Excitation-Inhibition Balance in Schizophrenia Spectrum Disorders: EEG Criticality Reflects Frontal Metabolites and a Potential Compensatory Mechanism

Background The excitation-inhibition (E-I) balance is essential for normal brain functioning, while deviations from this balance have been implicated in several psychiatric disorders. However, the extent to which electroencephalography (EEG) and proton magnetic resonance spectroscopy (1H-MRS) E-I markers are altered in schizophrenia spectrum disorders (SSD), how they converge across modalities, and how they relate to cognitive performance and clinical symptoms remain insufficiently characterized. Methods We recruited 111 healthy controls (HC) and 113 individuals with SSD. All participants underwent resting-state EEG and 1H-MRS. Metabolites were measured either in the anterior cingulate cortex (ACC; NSSD = 63, NHC = 58) or in the left dorsolateral prefrontal cortex (lDLPFC; NSSD = 50, NHC = 53), from which gamma-aminobutyric acid (GABA), glutamate + glutamine (Glx), and the Glx/GABA ratio were extracted. Extracted EEG E-I markers included oscillatory activity, aperiodic activity, functional E-I, microstates, multiscale entropy, and neuronal avalanche criticality. Results MRS results showed no group differences in GABA, Glx, or the Glx/GABA ratio. In contrast, most EEG-derived E-I markers indicated increased cortical inhibition in SSD, including steeper aperiodic exponents, prolonged microstate durations, and greater prevalence of subcritical states. However, functional E-I showed a divergent pattern, suggesting balanced dynamics in SSD and relatively inhibition-weighted dynamics in HC. Across groups, higher ACC and lDLPFC GABA predicted a lower kappa index, whereas a higher lDLPFC Glx/GABA ratio was associated with a higher kappa index. In SSD, reduced avalanche criticality was associated with better cognition and less severe symptoms. Conclusion Several EEG-derived E-I proxies, but not MRS measures, indicate an increased cortical inhibition in SSD. Criticality indices best capture frontal neurochemical metabolites and improvements in clinical symptoms, potentially reflecting inhibitory compensation mechanisms in SSD.

05.
arXiv (quant-ph) 2026-06-16

Detecting basis-dependent hardware errors through spatio-temporal quantum steering

arXiv:2606.16451v1 Announce Type: new Abstract: Spatio-temporal quantum steering provides a framework for benchmarking the nonclassicality of general quantum state transfer processes. A central diagnostic is the no-signaling-in-time (NSIT) condition, whose violation can indicate basis-dependent hardware errors. However, finite measurement statistics may also yield apparent violations, thereby obscuring the detection of basis-dependent hardware errors. To address this, we construct a statistical hypothesis test under the null hypothesis that NSIT violations arise solely from statistical fluctuations. Combining the statistical properties of NSIT violation under the null hypothesis with Chebyshev's inequality, we obtain a distribution-free upper bound on the $p$-value without parametric assumptions. We apply this method to two examples. For a single-qubit state-transfer experiment on a superconducting processor, we observe several instances that the NSIT violation is observed and the null hypothesis is simultaneously rejected by a small $p$-value, providing statistical evidence of basis-dependent hardware errors. For a seven-qubit Hayden-Preskill teleportation protocol on IonQ devices, the null hypothesis is also rejected even when the average fidelity exceeds the classical threshold, while the associated nonclassicality measure vanishes. Our results highlight the necessity of statistical hypothesis testing for detecting basis-dependent errors in near-term quantum devices.

06.
arXiv (CS.CL) 2026-06-16

When Correct Edges Cannot Be Verified: A Provenance Gap in Incomplete KGQA and a Provenance-Favoring Completion Policy

Incomplete Knowledge Graph Question Answering (IKGQA) requires completing missing edges to continue reasoning. A growing line of work verifies completed edges against retrieved text, treating textual support as a proxy for edge quality. We ask a question that, to our knowledge, has not been systematically tested: does textual verifiability actually track correctness? Exploiting the gold deleted triples provided by the standard random-deletion protocol, we measure both. The finding is counterintuitive: among gold-correct completed edges, 76-96% have no supporting passage even under exhaustive retrieval, robustly across deletion rates (20%/40%), datasets (CWQ/WebQSP), and relation types (structural, commonsense, long-tail). Most Freebase-style facts simply do not occur as head-tail co-mentions in text. Textual faithfulness therefore measures provenance, not correctness – separated by a paradigm-level gap no in-corpus retrieval closes. This reframes edge completion. Since most completed edges – correct or not – are causally redundant for the answer (95-97% of correct answers do not depend on any unsupported edge), the central question shifts from "is the edge correct?" to "admit or abstain under provenance uncertainty?" Within this framing we present TGComplete, a provenance-favoring admission policy that retrieves evidence at a reasoning breakpoint, verifies a candidate through a lightweight loop, and abstains when support is absent. Against the generate-to-complete baseline GoG, it attains higher edge precision against gold (15-21% vs 3-14%), with no statistically detectable EM loss and 3.1-7.4 times higher strict faithfulness of admitted edges – at the cost of lower recall. We position TGComplete not as uniformly better, but as a principled point on a precision/provenance-recall trade-off, appropriate when auditability matters.

07.
arXiv (quant-ph) 2026-06-12

Approximate quantum error correction theory of non-isometric codes

arXiv:2606.13559v1 Announce Type: new Abstract: Non-isometric encoding arises in various important contexts in quantum error correction, most notably in the finite-energy, non-ideal codewords inevitable in experimental realizations of continuous-variable codes, and holographic quantum gravity. In this work, we present a general and systematic theory of non-isometric quantum error-correcting codes. In particular, we employ the approximate quantum error correction framework to quantitatively study the fundamental limitations imposed by non-isometric encodings on the accuracy of quantum error correction and implementation of logical operations. We apply our theory to analyze GKP and tiger codes under energy constraints, and discuss the implications to holography.

08.
arXiv (CS.LG) 2026-06-25

LLM Program Optimization via Retrieval Augmented Search

arXiv:2501.18916v2 Announce Type: replace Abstract: Recent work has demonstrated the potential of large language models (LLMs) for program optimization, a key challenge in programming languages. We propose a blackbox adaptation method called Retrieval Augmented Search (RAS) that performs beam search over candidate optimizations; at each step, it retrieves in-context examples from a given training dataset of slow-fast program pairs to guide the LLM. Critically, we find that performing contextual retrieval based on an LLM-generated natural language description significantly outperforms retrieval based on the source code. We also propose AEGIS, a method for improving interpretability by decomposing training examples into ''atomic edits'' that are significantly more incremental in nature. We show that RAS performs up to 2.06$\times$ better than prior state-of-the-art blackbox adaptation strategies on optimizing C++ programs, and that AEGIS performs up to 1.37$\times$ better while making significantly smaller edits. We also show that using RAS improves the mean runtime percentile of Python programs by 10.27 compared to baselines.

09.
arXiv (CS.CL) 2026-06-11

Notes2Skills: From Lab Notebooks to Certainty-Aware Scientific Agent Skills

Scientific discovery workflows usually contain and rely heavily on lab notes, where researchers record observations, interpret uncertain results, and plan follow-up experiments. Such informative lab notes preserve evolving scientific reasoning and author uncertainty, rather than polished final results exhibited in publications, providing a valuable opportunity for AI to engage in scientific exploration at a more comprehensive and deeper level. However, most prior work on scientific text focuses on papers, protocols, or structured databases, leaving informal laboratory notes underexplored as inputs to AI agents for science. This gap matters because lab notes often intermingle validated observations, tentative judgments, and possible experimental next steps within the same passage. If these signals are conflated, an AI agent may mistake uncertain scientific judgments for confirmed conclusions or executable actions. To this end, we present Notes2Skills, a two-stage framework for turning lab notebooks into verifiable skills for scientific AI agents while preserving the author's certainty. Across seven conditions and three wet-lab sessions, Notes2Skills is the only configuration that neither mistakes uncertain notes for firm instructions nor discards firm ones. We show that certainty preservation is the missing piece between lab notebooks and reliable agent skills, opening a path toward safer AI co-scientist systems.

10.
arXiv (CS.AI) 2026-06-17

Brick-DICL: Dynamic In-Context Learning for Automated Brick Schema Classification

arXiv:2606.17637v1 Announce Type: new Abstract: Building Management Systems (BMS) are essential for optimizing energy efficiency and operational performance in modern buildings. However, the lack of standardization across BMS points from different manufacturers creates significant barriers to integration and data utilization. While the Brick schema offers a standardized ontology for building systems, mapping BMS points to appropriate Brick classes presents three critical challenges: (i) the extensive number of Brick classes (936 in the latest version), (ii) limited domain-specific knowledge in large language models (LLMs), and (iii) substantial manual effort required for verification. To address these challenges, we propose Brick-DICL, a two-stage dynamic in-context learning framework for automated Brick schema classification. Brick-DICL consists of two primary components: metadata-RAG, which retrieves relevant examples to enhance LLMs' domain knowledge, and class-RAG, which narrows down potential Brick classes to address the large classification space. Additionally, we implement a multi-LLM filtering mechanism that compares predictions across multiple models, flagging low-confidence classifications for human review. As a result: (i) General: Brick-DICL is applicable to any building management system regardless of manufacturer or metadata format; (ii) Novel and Powerful: as the first dynamic in-context learning approach for Brick schema classification, Brick-DICL achieves significant classification accuracy improvements on building datasets, outperforming existing methods; (iii) Efficient: our multi-LLM filtering strategy reduces manual verification effort, enabling rapid digital building onboarding. Extensive experiments demonstrate Brick-DICL's effectiveness across diverse building datasets, accelerating the path toward standardized, interoperable building management systems.

11.
medRxiv (Medicine) 2026-06-24

Durability and Seasonal Variation in the Effectiveness of Nirsevimab over Three Seasons in Connecticut

Background Nirsevimab has been widely administered in the United States since 2023 to protect infants and young children from severe disease caused by respiratory syncytial virus (RSV). Although early post-licensure studies have shown high effectiveness against medically attended RSV infection, uncertainty remains about the durability of protection, effectiveness beyond the first RSV season, and the extent to which changing RSV seasonality influences real-world effectiveness. Objective To estimate the effectiveness of nirsevimab against medically attended RSV infection across three consecutive RSV seasons and to examine how effectiveness varies by season and time since immunization. Methods We conducted a test-negative case-control study utilizing electronic health records of infants and young children tested for RSV by polymerase chain reaction in outpatient and inpatient settings within the Yale New Haven Health System between October 1, 2023, and March 1, 2026. Effectiveness of nirsevimab was estimated using multivariable logistic regression, adjusting for age, weekly RSV activity, pre-existing risk factors, and other potential confounders. Variation in effectiveness was examined by season, encounter setting, and time since immunization up to 24 months. Results Overall, 17,755 infants and young children were tested for RSV infection, of whom 2,388 (13.4%) were cases and 15,367 (86.6%) were controls. The overall effectiveness of nirsevimab was 67.3% (95% confidence interval [CI]: 59.8, 73.3%) against all medically-attended RSV infections, 60.2% (95% CI: 49.6, 68.5%) against RSV-associated outpatient visits, and 88.9% (95% CI: 82.3, 93.0%) against RSV-associated hospitalization. Effectiveness against medically attended RSV infection declined across seasons, from 76.7% (95% CI: 60.5, 86.3%) in 2023/24 to 54.4% (95% CI: 33.0, 68.9%) in 2025/26. Lower season-specific effectiveness in later seasons corresponded with progressively delayed RSV activity over. Protection against RSV-associated hospitalization declined with increasing time since immunization, from 92.5% (95% credible interval [CrI]: 85.9, 96.4%) at 1 month, to 77.2% (95% CrI: 60.4, 87.6%) at 6 months, and 39.9% (95% CrI: 2.4, 63.3%) at 12 months post-immunization, after which effectiveness plateaued. Conclusions Nirsevimab remained effective against RSV-associated hospitalization through 6 to 12 months after immunization. Delayed RSV activity was associated with lower effectiveness, highlighting the importance of aligning administration with local RSV circulation.

12.
arXiv (quant-ph) 2026-06-12

Quantized time in quantum walks under weak rank-K measurements

arXiv:2606.13552v1 Announce Type: new Abstract: Measurements can be used to monitor the evolution of quantum systems and may lead to a universally quantized time statistics. It is known that the mean return time is quantized for strong and indirect monitoring through the winding number of the return amplitude in a one-dimensional space. Here we discuss that under multi-channel strong or indirect monitoring, where the latter is achieved through ancilla coupling, the mean return time of a quantum walk in the projected subspace is also quantized. This reflects a universal time quantization for a higher dimensional evolution.

13.
arXiv (CS.LG) 2026-06-16

A Penalty Approach for Differentiation Through Black-Box Quadratic Programming Solvers

arXiv:2602.14154v3 Announce Type: replace Abstract: Differentiating through the solution of a quadratic program (QP) is a central problem in differentiable optimization. Most existing approaches differentiate through the Karush–Kuhn–Tucker (KKT) system, but their computational cost and numerical robustness can degrade at scale. To address these limitations, we propose dXPP, a penalty-based differentiation framework that decouples QP solving from differentiation. In the solving step (forward pass), dXPP is solver-agnostic and can leverage any black-box QP solver. In the differentiation step (backward pass), we map the solution to a smooth approximate penalty problem and implicitly differentiate through it, requiring only the solution of a much smaller linear system in the primal variables. This approach bypasses the difficulties inherent in explicit KKT differentiation and significantly improves computational efficiency and robustness. We evaluate dXPP on various tasks, including randomly generated QPs, large-scale sparse projection problems, and a real-world multi-period portfolio optimization task. Empirical results demonstrate that dXPP is competitive with KKT-based differentiation methods and achieves substantial speedups on large-scale problems. Our implementation is open source and available at https://github.com/mmmmmmlinghu/dXPP.

14.
arXiv (CS.CV) 2026-06-25

A Controlled Study of CLIP-Based Body-Scene Fusion for Emotion Recognition in Context

Apparent emotion in natural images is often not visible from the face alone. The face may be small, hidden, or neutral, while posture and scene context carry much of the evidence. This work studies context-aware emotion recognition on EMOTIC with an image-only two-stream model. A ResNet-18 body stream encodes the target-person crop, and a CLIP ViT-B/16 scene stream encodes the full image. The fused feature predicts 26 categorical emotion labels and the continuous valence, arousal, and dominance values. This study examines whether small context-debiasing or rare-class training changes still help after adding a CLIP scene encoder. The clean two-stream model is compared with simplified CCIM-style intervention, CLEF-lite context-bias subtraction, ASL tuning, and class-balanced sampling under the same implementation pipeline. No tested variant improves over the clean two-stream model, which achieves 34.52% mAP on the EMOTIC test split. CLIP gives the model broad scene semantics, but the simplified causal, counterfactual, and rare-class changes do not automatically improve performance. Most remaining errors are in rare and subtle emotion categories, so the next step should focus on label relationships and finer subject-context interaction.

15.
Science (Express) 2026-05-21

Nodeless superconducting gap and electron-boson coupling in (La,Pr,Sm)3Ni2O7 films | Science

作者: 未知作者

The discovery of superconductivity in Ruddlesden-Popper (RP) bilayer nickelate films under ambient pressure provides an opportunity to directly investigate electronic energy scales of the superconducting state and the pairing mechanism. We report angle-resolved photoemission spectroscopy measurements of superconducting (La,Pr,Sm) 3 Ni 2 O 7 thin films by developing an ultra-high vacuum cryogenic sample quenching and transfer technique. A superconducting gap of ~18 meV with coherence peaks is observed along the Brillouin zone diagonal. The finite gap persists across the entire Brillouin zone, revealing the absence of gap nodes. A kink is observed in the energy-momentum dispersion at ~70 meV below Fermi level, indicating an electron-boson coupling. The simultaneous observation of a nodeless superconducting gap and electron-boson coupling provides insight into the pairing symmetry and gluing mechanism in RP bilayer nickelates.

16.
arXiv (quant-ph) 2026-06-24

Spectator-transition crosstalk in a spin-3/2 silicon vacancy qudit in silicon carbide revealed by broadband Ramsey interferometry

arXiv:2601.15559v3 Announce Type: replace Abstract: Color center spins in 4H-SiC offer a rare combination of wafer-scale materials maturity with long spin coherence and chip-level photonics, making them promising building blocks for scalable quantum technologies. In particular, the silicon vacancy hosts an S=3/2 ground state, a native qudit that enables compact encodings and subspace-selective control, but also introduces spectator transitions: short, detuned pulses can coherently drive non-addressed level pairs and create crosstalk. Here we use broadband Ramsey interferometry to reveal and quantify such spectator-transition crosstalk. Experimentally, the Ramsey Fourier spectra display multiple lines beyond the addressed single-quantum transition. Analytically, we map each line to a pairwise energy difference between qudit levels of the rotating-frame Hamiltonian and assign its weight via compact amplitudes set by the prepared state and the microwave pulse parameters, predicting a deterministic six-branch structure. Numerical time-domain propagation with the experimental sampling reproduces the detuning map, and the measured peak positions coincide with the analytic branch lines without frequency fitting. Together these results provide a practical, spectator-aware framework for multilevel control in the silicon vacancy qudit. The approach offers clear guidance to suppress crosstalk or, conversely, to exploit spectator lines, for example as additional constraints for in situ pulse calibration and for phase-sensitive quantum state and process estimation.

17.
arXiv (quant-ph) 2026-06-19

Random Local Stabilizer Codes in Three Dimensions without String or Self-Similar Fractal Logical Operators

作者:

arXiv:2606.19873v1 Announce Type: new Abstract: Quantum error-correcting codes (QECs) are essential components quantum computation and have deep connections to quantum phases of matter. A key obstruction to passive self-correcting QECs is the presence of string logical operators, which can generate logical errors through constant-energy-barrier processes. Haah's Codes (fracton codes) showed that three-dimensional stabilizer codes can forbid such string logical operators, but their translation-invariant structure supports self-similar fractal logical operators with a logarithmic energy barrier. We introduce the qutrit random cubic codes, a family of local qutrit Calderbank-Shor-Steane stabilizer Hamiltonians with similar cube-check structure as Haah's Code 1 but built from spatially varying stabilizers. We prove that these models retain the no-string property and numerically observe that they have properties distinct from translation-invariant fracton codes: the smallest ground-state degeneracy exponent is $k=2$ for odd $L$ and $k=4$ for even $L$; noncontractible plane-logical operators span the entire logical space; and charge-push diagnostics show that the self-similar fractal operators are absent. These results demonstrate that constrained randomness can fundamentally change the nature of stabilizer codes and improve their self-correction properties. They further point to broader families of quantum error-correcting codes and quantum phases beyond canonical topological and fracton orders.

18.
medRxiv (Medicine) 2026-06-17

County Year Informatics Model for Annual and Cumulative Unique Lung Cancer Screening Eligibility in Maryland, 2026 to 2045

Purpose: Population-level lung cancer screening programs require denominators that reflect age, smoking history, geography, and changing eligibility over time. We estimated annual prevalent and 20-year cumulative unique low-dose computed tomography screening eligibility for Maryland residents under alternative screening criteria. Methods: We built a deterministic cohort-cell stock-flow simulation using Maryland county-equivalent jurisdiction projections by age, sex, and race/ethnicity, with ACS socioeconomic/nativity covariates and smoking-history priors for ever-smoked status, pack-years, and quit-years. Scenarios included USPSTF 2013 legacy, USPSTF 2021, ACS 2023/2024, a risk-model-expanded sensitivity, and ever-smoked-only capacity stress tests. Cumulative unique eligibility counted people once at first eligibility rather than summing annual prevalent person-years. Results: Under USPSTF 2021, an estimated 238,346 Maryland residents were eligible in 2026 and 245,326 in 2045. The 20-year cumulative unique denominator was 768,668, whereas naively summing annual prevalent counts produced 4,850,735 person-years, a 6.31-fold overcount. ACS 2023/2024 expanded annual eligibility to 314,616 in 2026 and cumulative unique eligibility to 902,796 by adding remote former smokers. Ever-smoked-only adult eligibility was 1,957,699 in 2026 and 3,383,683 cumulative unique over 20 years. Conclusion: A Maryland statewide screening initiative should plan from cumulative unique eligibility and county-equivalent jurisdiction-specific burden rather than annual prevalence alone. Explicit pack-year and quit-year modeling materially changes statewide and county allocation compared with current-smoking proxy models.

19.
arXiv (CS.LG) 2026-06-16

Machine learning enables roughness-driven inverse design of milling processes

arXiv:2606.16032v1 Announce Type: cross Abstract: Interest in applying data-driven approaches in manufacturing has grown significantly, particularly for mapping complex, high-dimensional relationships. The milling process is one area where predictive models can link influential parameters to surface roughness metrics prior to in situ operations. While this approach offers clear advantages, it faces challenges due to limited datasets and robustness issues in inverse design paradigms. To address these challenges, this paper proposes a machine learning (ML)-based framework for the inverse design of the surface milling process, with a focus on surface roughness as the design objective. The framework employs forward training of two ML models, a deep neural network (DNN) and a random forest (RF) ensemble, both developed using a high-fidelity synthetic dataset generated from a computational simulation framework. These trained models are integrated into a Bayesian optimization (BO) procedure to overcome the multiplicity problem arising from the many-to-one mapping inherent in the dataset. The approach identifies top-performing milling process configurations, considering both process and tool parameters, and presents them from the full solution space. The models achieve average relative errors below 5% when compared to reference results, thereby demonstrating the robustness and reliability of the proposed methodology.

20.
arXiv (quant-ph) 2026-06-16

Experimental Observation of Dynamical Phase Transitions in a Dephased Photonic Quantum Walk

arXiv:2606.15935v1 Announce Type: new Abstract: Dynamical phase transitions in open quantum systems govern how non-equilibrium states relax toward a stationary state. We study these transitions experimentally using a discrete-time photonic quantum walk on a three-node graph. A tunable synthetic gauge flux and calibrated dephasing allow us to control time-reversal symmetry and the detailed balance properties of the effective Markovian dynamics. With detailed balance, we observe a first-order dynamical phase transition marked by a crossing of real Liouvillian eigenvalues. When detailed balance is broken, we observe a second-order dynamical phase transition at an exceptional point where eigenvalues and eigenvectors coalesce. By progressively reducing the dephasing strength, we track the crossover toward the quantum-coherent regime and determine that the transitions persist down to a finite threshold. Our results link Liouvillian spectral topology to relaxation criticality and demonstrate a controllable platform for engineered dissipative dynamics.

21.
arXiv (CS.AI) 2026-06-12

Equivariant Flow Matching for Symmetry-Breaking Bifurcation Problems

arXiv:2509.03340v4 Announce Type: replace-cross Abstract: Bifurcation phenomena in nonlinear dynamical systems often lead to multiple coexisting stable solutions, particularly in the presence of symmetry breaking. Deterministic machine learning models are unable to capture this multiplicity, averaging over solutions and failing to represent lower-symmetry outcomes. In this work, we formalize the use of generative AI, specifically flow matching, as a principled way to model the full probability distribution over bifurcation outcomes. Our approach builds on existing techniques by combining flow matching with equivariant architectures and an optimal-transport-based coupling mechanism. We generalize equivariant flow matching to a symmetric coupling strategy that aligns predicted and target outputs under group actions, allowing accurate learning in equivariant settings. We validate our approach on a range of systems, from simple conceptual systems to physical problems such as buckling beams and the Allen–Cahn equation. The results demonstrate that the approach accurately captures multimodal distributions and symmetry-breaking bifurcations. Moreover, our results demonstrate that flow matching significantly outperforms non-probabilistic and variational methods. This offers a principled and scalable solution for modeling multistability in high-dimensional systems.

22.
arXiv (CS.CV) 2026-06-17

Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

Vision backbone networks play a central role in modern computer vision. Enhancing their efficiency directly benefits a wide range of downstream applications. To measure efficiency, many publications rely on MACs (Multiply Accumulate operations) as a predictor of execution time. In this paper, we experimentally demonstrate the shortcomings of such a metric, especially in the context of edge devices. By contrasting the MAC count and execution time of common architectural design elements, we identify key factors for efficient execution and provide insights to optimize backbone design. Based on these insights, we present LowFormer, a novel vision backbone family. LowFormer features a streamlined macro and micro design that includes Lowtention, a lightweight alternative to Multi-Head Self-Attention. Lowtention not only proves more efficient, but also enables superior results on ImageNet. Additionally, we present an edge GPU version of LowFormer, that can further improve upon its baseline's speed on edge GPU and desktop GPU. We demonstrate LowFormer's wide applicability by evaluating it on smaller image classification datasets, as well as adapting it to several downstream tasks, such as object detection, semantic segmentation, image retrieval, and visual object tracking. LowFormer models consistently achieve remarkable speed-ups across various hardware platforms compared to recent state-of-the-art backbones. Code and models are available at https://github.com/altair199797/LowFormer/blob/main/Beyond_MACs.md.

23.
arXiv (CS.CV) 2026-06-12

Cascade Classification of Dermoscopic Images of Skin Neoplasms with Controllable Sensitivity and External Clinical Validation

Purpose. To compare deep learning architectures and classification schemes for dermoscopic images of skin neoplasms and assess their generalization on transfer from open international datasets to independent clinical datasets of Russian practice. Methods. Four architectures (ViT-B/16, Swin-S, ConvNeXt-S, EfficientNetV2-S) were compared in three schemes: binary (malignant/benign), single-stage four-class (benign, MEL, SCC, BCC), and a two-stage cascade (binary triage, then three-class differentiation MEL/SCC/BCC). All models used ImageNet-pretrained weights and a single augmentation protocol on aggregated open ISIC Archive data, and were evaluated on an internal held-out sample and two clinical datasets (Melanoscope AI mobile system; Sechenov University). Results. Internally the binary stage attains ROC-AUC 0.952-0.966; on Sechenov University it drops to 0.797-0.893, sensitivity to 0.53-0.67, and ECE rises from 0.02 to 0.27-0.39 with underestimation of malignancy, quantifying a generalization gap in ranking and calibration. Paired tests confirm one inter-architecture result on clinical data: the deficit of ViT-B/16 at the binary stage (p

24.
arXiv (CS.CL) 2026-06-19

PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback

Effective Automated Essay Scoring (AES) are expected to support both reliable assessment and actionable instructional feedback. However, existing approaches often treat scoring and feedback as separate components: neural scoring models provide limited interpretability, while Large Language Model (LLM)-based feedback is typically insensitive to learners proficiency levels. To address this fragmentation, this work proposes PsyScore, a psychometrically-aware framework that integrates diagnostic assessment with instructional scaffolding through a shared latent ability representation. PsyScore comprises three key modules: a Trait-Adaptive Neural IRT Scorer that incorporates the Graded Partial Credit Model (GPCM) into a neural architecture, enabling the precise estimation of student ability while maintaining psychometric interpretability, a ZPD-Scaffolded Feedback Generator, which conditions multi-agent feedback strategies on the diagnosed ability parameter to adapt instructional focus across different proficiency levels, and a Multi-Perspective Feedback Evaluation Strategy that assesses feedback quality via pairwise preference judgements and student revision simulations. Experiments on the ASAP++ dataset demonstrate that PsyScore achieves competitive scoring performance while providing more pedagogically aligned feedback.

25.
arXiv (CS.CV) 2026-06-19

FrozenDrive: Zero-Shot Text-Guided Driving Scene Generation and Data Augmentation with Parameter-Free Frozen Diffusion Model

Synthetic data for autonomous driving is surging, powered by diffusion models that promise scalable scene generation. Yet key obstacles remain, as enforcing multi-view and temporal consistency often relies on backbone fine-tuning or added layers, which erodes pre-trained knowledge and weakens text alignment. Models also stay close to the training distribution, struggling under adverse weather and unseen configurations, and fidelity favors frequent over rare classes. We address these gaps with FrozenDrive, a controllable generative framework that preserves a pretrained diffusion models knowledge while achieving strong consistency. FrozenDrive conditions on rich driving-stack signals and text prompts, and introduces knowledge-preserving spatio-temporal attention to impose cross-view alignment and temporal coherence in a single pass within a parameter-free frozen diffusion backbone. An additional object-focused constraint improves per-object fidelity for rare categories. Without any weather- or scene-specific fine-tuning, our model synthesizes globally coherent multi-view driving scenes from text, particularly under adverse and rare conditions, and surpasses prior baselines. On nuScenes, FrozenDrive augmented data significantly improves AD models performance, especially at night and in rain, demonstrating stronger robustness when trained with our scenario-targeted data.