arXiv - Computer Science: Artificial Intelligence - maiweb

arXiv - Computer Science: Artificial Intelligence arxiv.org ai arxiv computer-science preprint research science 2026-07-10 04:00

arXiv:2607.02360v3 Announce Type: replace-cross Abstract: Monocular spacecraft 6D pose estimation remains difficult under weak texture, thin structures, illumination variation, and occlusion. This article presents GAP-GDRNet, a geometry-aware RGB framework built on GDR-Net for a single-target synthetic spacecraft benchmark. The method strengthens the geometry-guided regression pipeline at two points. First, AFR is placed before dense geometric prediction to combine global structural attention with local weak-texture enhancement. Second, PGSA is inserted into Patch-PnP to relate downsampled geometric regions before final pose regression. Dense supervision is obtained from a Blender-based rendering and annotation process that provides masks, model-coordinate maps, camera intrinsics, and 6D pose labels. On the self-built spacecraft dataset, GAP-GDRNet achieves a rotation error of 1.96°, a translation error of 0.0165 m, and 95.16% ADD@0.02 m, outperforming the reproduced GDR-Net baseline by 3.88 percentage points while running at 35.97 FPS. Tests on T-LESS and LM-O further show consistent gains over the reproduced baseline on textureless and occluded non-spacecraft objects.
- GAP-GDRNet: Geometry-aware monocular 6D pose estimation for spacecraft using synthetic geometric supervision arXiv - cs.AI
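
The ADD@0.02 m figure quoted in the GAP-GDRNet abstract can be illustrated with a minimal sketch. It assumes the usual definition of ADD (the mean distance between model points transformed by the ground-truth pose and by the estimated pose) and assumes that a pose counts as correct when that distance is below 0.02 m. The function names and the toy points are hypothetical and are not taken from the paper.

```python
import math

def transform(R, t, p):
    """Apply a 3x3 rotation R (nested lists) and a translation t to point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def add_metric(R_gt, t_gt, R_est, t_est, points):
    """Mean distance between model points under the true and estimated poses."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_est, t_est, p))
    return total / len(points)

def add_accuracy(errors, threshold=0.02):
    """Fraction of test poses whose ADD error is below the threshold (metres)."""
    return sum(e < threshold for e in errors) / len(errors)

# Toy example: identical rotation, 1 cm offset along the camera axis.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
model_points = [[0.1, 0, 0], [0, 0.1, 0], [0, 0, 0.1]]
error = add_metric(identity, [0, 0, 1.0], identity, [0, 0, 1.01], model_points)
# error is about 0.01 m, so this pose would pass a 0.02 m threshold.
```

With a pure translation offset every model point moves by the same amount, which is why the toy error equals the offset; rotation errors spread unevenly across points and are what make ADD sensitive to object size.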

arXiv:2607.05583v2 Announce Type: replace-cross Abstract: Contemporary language models are dominated by the transformer architecture, which leverages self-attention mechanisms to enable more efficient, parallelized training across a wide set of documents and corpora. This has allowed transformers to effectively model data across a wide range of modalities and contexts. However, transformers, along with their conventional counterparts such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), often struggle to maintain efficiency when processing long contexts. We introduce ResonatorLM, a new mechanism that replaces attention with a physics-derived alternative. ResonatorLM treats token sequences as a single, driven one-dimensional latent field and replaces attention dot products with causal functions of damped resonators. We implement ResonatorLM on a traditional network architecture and test it on standard long-context modeling tasks. We find that in a small, 6M matched setting, training and prefill speedups increase with sequence length, decode speed reaches 6.47x that of a standard, optimized transformer at 32K tokens, and accuracy reaches 61.31 percent (compared to 55.32 percent) on WikiText.
- OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation arXiv - cs.CL
- ResonatorLM: Causal Resonant Field Mixing for Efficient Long-Context Language Modeling arXiv - cs.AI

arXiv:2607.02615v2 Announce Type: replace-cross Abstract: Generating structured artifacts with Large Language Models - e.g. database queries, threat framework mappings, entity schemas - is relatively straightforward; however, making them reliable enough for production deployments presents challenges. We present TAG, a lightweight framework based on a core principle: LLMs generate, we validate. This reframing shifts responsibility from generation quality to validation rigor. The framework rests on three key attributes: First, test-driven generation: when tests fail, the LLM receives indicative error messages that expose why the output failed, enabling the LLM to understand its mistakes and refine subsequent attempts. Second, deterministic and LLM-based tests: deterministic tests catch heuristics that can be programmatically verified (schema, syntax, cross-reference), while LLM-based tests evaluate nuanced semantic and delicate features that resist programmatic inspection (intent alignment, logical consistency, domain correctness). Third, expert-distilled judges: LLM-based tests are calibrated to distill and replicate human expert decision distribution, transforming manual human quality gates into scalable, reusable evaluation proxies that reflect professional-grade validation standards. We demonstrate the framework on three artifact types in the security domain - KQL query generation, MITRE ATT&CK mapping, and entity mapping - deployed in production at Microsoft Sentinel. We believe this framework can be applied beyond security to other artifact generation tasks, providing a path to reliable, high-quality outputs without sacrificing the efficiency gains of LLM generation.
- Governing Generative AI Across Financial Institutions: An SR 26-2-Compatible Framework for Generative AI Risk Control arXiv - Computer Science: Machine Learning
- Governing Generative AI Across Financial Institutions: An SR 26-2-Compatible Framework for Generative AI Risk Control arXiv - cs.LG
- EvalSafetyGap: A Hybrid Survey and Conceptual Framework for LLM Evaluation-Safety Failures arXiv - cs.CL
- TAG: A Lightweight Framework for Test-Driven Agentic Artifact Generation arXiv - cs.AI
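
The test-driven generation loop described in the TAG abstract (generate, run tests, feed error messages back) might look roughly like the sketch below. It is an illustration under stated assumptions, not the framework's API: `generate_with_tests`, `stub_generate`, and `ends_with_semicolon` are hypothetical names, the generator is a stand-in for an LLM call, and only a deterministic test is shown.

```python
from typing import Callable, List, Optional, Tuple

# A test returns an error message on failure, or None when the artifact passes.
Test = Callable[[str], Optional[str]]

def generate_with_tests(
    generate: Callable[[List[str]], str],
    tests: List[Test],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Regenerate until every test passes, passing failures back as feedback."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        artifact = generate(feedback)
        feedback = [msg for test in tests if (msg := test(artifact)) is not None]
        if not feedback:
            return artifact, attempt
    return None, max_attempts  # no attempt passed within the budget

def ends_with_semicolon(artifact: str) -> Optional[str]:
    """Deterministic syntax check on a toy query string."""
    return None if artifact.endswith(";") else "query must end with ';'"

def stub_generate(feedback: List[str]) -> str:
    """Stand-in for an LLM: fixes its output once it has seen an error."""
    return "SELECT 1;" if feedback else "SELECT 1"

artifact, attempts = generate_with_tests(stub_generate, [ends_with_semicolon])
# artifact == "SELECT 1;" after 2 attempts
```

An LLM-based test would fit the same `Test` signature, returning a judge's explanation instead of a hard-coded message.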

arXiv:2607.03013v2 Announce Type: replace-cross Abstract: Images captured by consumer electronic devices, such as mobile phones and digital cameras, often suffer from low-light degradation due to sensor limitations and imaging pipelines, which degrades visual quality and affects downstream vision tasks. Existing methods based on Convolutional Neural Networks (CNNs) and Transformers have dominated current low-light image enhancement (LIE) due to their excellent ability to model hierarchical features. However, CNNs operate in local receptive fields that cannot model long-range dependencies, while Transformers overcome this problem but incur substantial computational costs. To address these challenges, we propose MambaLIE, a Scene Light Intensity-Boosted Low-Light Image Enhancement method based on a State Space Model (SSM). We first introduce scene light intensity to improve the structural distribution of illumination, which is then gated with the low-light input to guide enhancement. To better model the illumination while maintaining computational efficiency, we propose the Locally Enhanced State Space Model (LESSM) for efficient light enhancement. Our LESSM contains two branches: an SSM branch and a Local Enhanced branch, where the former is used to model the long-range dependencies with linear time complexity, while the latter is used to enhance local feature representations. Extensive experiments demonstrate that MambaLIE outperforms state-of-the-art CNN-based and Transformer-based LIE methods on four widely used synthetic benchmarks and five publicly available real-world benchmarks in terms of accuracy, speed, and model size, making it suitable for practical deployment on resource-constrained devices.
- MambaLIE: Scene Light Intensity-Boosted Low-Light Image Enhancement with State Space Model arXiv - cs.AI

arXiv:2607.05061v2 Announce Type: replace-cross Abstract: Key-value (KV) cache growth is a major bottleneck in autoregressive decoding, as memory and bandwidth scale linearly with context length. Existing KV eviction methods often rely on static heuristics or proxy scores, which poorly track future token utility and cause brittle eviction as relevance shifts. To address this, we introduce KVpop, which learns a fixed-budget KV eviction policy by directly supervising the keep-or-drop decision. The scorer is trained against a novel future-attention target, computed efficiently without materializing dense attention maps. We further introduce a delayed memory-based scorer that, uniquely among learned eviction methods, defers scoring for a fixed number of steps to exploit near-future context. On AIME and HMMT mathematical reasoning, KVpop retains 98% of full-attention performance on Qwen3-4B at 75% KV cache compression and 97% at 88% compression, consistently outperforming established eviction baselines. Qwen3-8B shows even stronger results, reaching near-full teacher performance. These results show that supervising eviction with future-attention signals cuts memory costs while maintaining quality.
- KVpop -- Key-Value Cache Compression with Predictive Online Pruning arXiv - cs.AI
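
To make the fixed-budget keep-or-drop decision in the KVpop abstract concrete, here is a minimal sketch that keeps the highest-scoring cache entries up to a budget. It assumes per-entry utility scores are already available as plain numbers; in the paper the scorer is learned against a future-attention target, and the delayed scoring variant is not modelled here. The name `evict_to_budget` is hypothetical.

```python
from typing import List

def evict_to_budget(scores: List[float], budget: int) -> List[int]:
    """Return indices of KV entries to keep, preserving their original order."""
    if budget >= len(scores):
        return list(range(len(scores)))
    # Rank positions by score (highest first), keep the top `budget` of them.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])

# Four cached tokens, budget of two (50% compression).
keep = evict_to_budget([0.9, 0.1, 0.5, 0.7], budget=2)
# keep == [0, 3]
```

Returning the kept indices in positional order keeps the surviving keys and values aligned with their original token positions.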

arXiv:2607.04983v2 Announce Type: replace-cross Abstract: This article describes the development of a fuzzy cognitive map (FCM) using a local large language model. Recent advances show that large language models, including local ones, are capable of extracting quantities from textual data. In other words, a local LLM such as Qwen2.5-32B, or probably a larger one, can accept entities as prompt input and return relevant quantitative data as output. This output can then be used to construct a data-driven fuzzy cognitive map. This approach is implemented and thoroughly tested: Qwen2.5-32B is used, and the data is extracted from TripAdvisor hotel reviews. The extracted documents pass through the model unfiltered, and a fuzzy cognitive map is then trained and evaluated. A case study on Greek reviews forms a star-topology FCM that indicates the preferences of the reviewers. Finally, external validation is performed to establish whether the fuzzy cognitive map's predicted satisfaction correlates with the star rating of the review, an outcome outside the model's inference scope.
- LLM for the development of FCM arXiv - Computer Science: Machine Learning
- LLM for the development of FCM arXiv - cs.LG
- LLM for the development of FCM arXiv - cs.AI
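
As a rough illustration of a star-topology fuzzy cognitive map like the one mentioned in the abstract above, the sketch below computes a single central concept from several input concepts. It assumes one common FCM inference rule (a sigmoid of the weighted sum of incoming activations); the abstract does not state which rule, concepts, or weights the authors use, so the aspect names and numbers here are hypothetical.

```python
import math
from typing import List

def star_fcm_output(inputs: List[float], weights: List[float], lam: float = 1.0) -> float:
    """Activation of the central node: sigmoid of the weighted input sum."""
    total = sum(w * a for w, a in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-lam * total))

# Hypothetical aspect scores extracted from one review, each in [0, 1]:
# cleanliness, location, staff.
aspects = [0.8, 0.6, 0.9]
# Hypothetical learned edge weights from each aspect to "satisfaction".
edge_weights = [0.7, 0.2, 0.5]

satisfaction = star_fcm_output(aspects, edge_weights)
# satisfaction lies strictly between 0 and 1 and grows with any
# positively weighted aspect score.
```

In a star topology every edge points at the central node, so the learned weights can be read directly as how strongly each aspect drives the predicted satisfaction.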

arXiv:2607.02885v2 Announce Type: replace-cross Abstract: Cognitive Behavioral Therapy (CBT) provides a structured framework for understanding a user's mental state by examining the interaction between cognitive and behavioral factors. However, out-of-the-box LLMs respond fluently and empathetically, yet collapse into Validation & Reflection, regardless of what the user actually needs. They know theoretical CBT (scoring up to 96% accuracy on licensing exam questions) but fail to apply it effectively. We explore this gap with a knowledge-guided framework that treats CBT dialogue as controlled affective reasoning: user narratives are decomposed into Beck's Cognitive Conceptualization structure, grounded in clinical SNOMED CT concepts validated via Natural Language Inference, and followed by a Multiple Chain-of-Thought (MCoT) strategy selection among Validation & Reflection, Socratic Questioning, and Alternative Perspectives. To measure whether such guidance actually changes behavior, we introduce the Protocol Leverage Force (F), a behavior-level metric that captures how far an intervention shifts a model away from its default response. Across three open-weight LLMs and 14 RealCBT-derived case studies, evaluated with human experts, valence-arousal trajectories, and linguistic entrainment, F shows that simply introducing protocol definitions via single chain-of-thought prompting fails to change LLM behavior, while MCoT on these definitions guides strategy selection better. Still, the effect stays within 1% (approx. 1.2-1.3%), and all models remain biased toward Validation & Reflection. These results show CBT knowledge alone does not ensure effective application, giving the affective-computing community instrumentation to measure where LLMs fall short.
- Where do LLMs Fall Short in CBT-Guided Affective Reasoning? arXiv - cs.AI
End of feed