KOINEU

DEEPAMBIGQA: Ambiguous Multi-hop Questions for Benchmarking LLM Answer Completeness

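1. Research background and motivation

Multi-hop reasoning and ambiguity in names and titles are problems that arise frequently in real-world search and QA settings. Existing benchmarks such as SQuAD, HotpotQA, and AmbigQA each focus on either multi-hop reasoning or ambiguity, but questions that require both at once have been rare. LLM+retrieval pipelines (e.g., ReAct, Self-RAG) improve performance through a "retrieve → reason → answer" loop, but metrics for evaluating the completeness of the answer set have been lacking.

2. Data generation pipeline: DeepAmbigQAGen

| Step | Key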

Deformation and orientation of a capsule with viscosity contrast in linear flows: a theoretical study

Distributional Deep Learning for Super-Resolution of 4D Flow MRI under Domain Shift

EdgeRunner 20B: Military Task Parity with GPT-5 while Running on the Edge

Efficient LLM Safety Evaluation through Multi-Agent Debate

Estimation of Conformal Metrics

EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning

EvtSlowTV -- A Large and Diverse Dataset for Event-Based Depth Estimation

Exploring Federated Learning for Thermal Urban Feature Segmentation -- A Comparison of Centralized and Decentralized Approaches

Fast algorithms enabling optimization and deep learning for photoacoustic tomography in a circular detection geometry

Fast Ewald Summation using Prolate Spheroidal Wave Functions

Fault Detection in Electrical Distribution System using Autoencoders

FedPoP: Federated Learning Meets Proof of Participation

FiCABU: A Fisher-Based, Context-Adaptive Machine Unlearning Processor for Edge AI

Finding Molecules with Specific Properties: Simulated Annealing vs. Evolution

Finite elements for the space approximation of a differential model for salts crystallization

Focused Relative Risk Information Criterion for Variable Selection in Linear Regression

FP8-Flow-MoE: A Casting-Free FP8 Recipe without Double Quantization Error

From Model Training to Model Raising

From Theory to Throughput: CUDA-Optimized APML for Large-Batch 3D Learning

Functional Decomposition and Shapley Interactions for Interpreting Survival Models

GAIA: Geothermal Analytics and Intelligent Agent

Generalized bilinear Koopman realization from input-output data for multi-step prediction with metaheuristic optimization of lifting function and its application to real-world industrial system

Generalized Leverage Score for Scalable Assessment of Privacy Vulnerability

Generating quantum entanglement from sunlight

Grouping Nodes With Known Value Differences: A Lossless UCT-based Abstraction Algorithm

Higher-Order Hit-&-Run Samplers for Linearly Constrained Densities

History-Aware Reasoning for GUI Agents

How Well Do Large-Scale Chemical Language Models Transfer to Downstream Tasks?

Humanlike AI Design Increases Anthropomorphism but Yields Divergent Outcomes on Engagement and Trust Globally

Improved Accuracy of Robot Localization Using 3-D LiDAR in a Hippocampus-Inspired Model

Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies

KFCPO: Kronecker-Factored Approximated Constrained Policy Optimization

Laboratory observation of collective beam-plasma instabilities in a relativistic pair jet

Learning General Policies with Policy Gradient Methods

Learning Low Rank Neural Representations of Hyperbolic Wave Dynamics from Data

Leveraging Generic Time Series Foundation Models for EEG Classification

Lifecycle-Aware code generation: Leveraging Software Engineering Phases in LLMs

Lightning Grasp: High Performance Procedural Grasp Synthesis with Contact Fields

LLM-based Behaviour Driven Development for Hardware Design

LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer

MAGNET: A Multi-Graph Attentional Network for Code Clone Detection

Memristive tabular variational autoencoder for compression of analog data in high energy physics

Microscopic Rydberg electron orbit manipulation with optical tweezers

Mixture of Attention Schemes (MoAS): Learning to Route Between MHA, GQA, and MQA

Multi-Agent Reinforcement Learning for Market Making: Competition without Collusion

Multi-Modal Feature Fusion for Spatial Morphology Analysis of Traditional Villages via Hierarchical Graph Neural Networks

Multiscale Astrocyte Network Calcium Dynamics for Biologically Plausible Intelligence in Anomaly Detection

Neural Green's Functions

Nonlinear Frequency Shifts due to Phase Coherent Interactions in Incompressible Hall MHD Turbulence
