๋‘ ๋‹จ๊ณ„ ์ž๊ธฐ์ง€๋„ ํ•™์Šต์œผ๋กœ ๊ตฌํ˜„ํ•œ ๊ณ ํšจ์œจ ์Œ์„ฑ ํ‘œํ˜„ ๋ฐ ์••์ถ• ํ”„๋ ˆ์ž„์›Œํฌ

๋‘ ๋‹จ๊ณ„ ์ž๊ธฐ์ง€๋„ ํ•™์Šต์œผ๋กœ ๊ตฌํ˜„ํ•œ ๊ณ ํšจ์œจ ์Œ์„ฑ ํ‘œํ˜„ ๋ฐ ์••์ถ• ํ”„๋ ˆ์ž„์›Œํฌ

We introduce a two-stage self-supervised framework that combines the Joint-Embedding Predictive Architecture (JEPA) with a Density Adaptive Attention Mechanism (DAAM) for learning robust speech representations. Stage 1 uses JEPA with DAAM to learn semantic audio features via masked prediction in lat

๋“€์–ผ๊ฒŒ์ด์ง€ LLM ๊ธฐ๋ฐ˜ ์ฝ”๋“œ ์ƒ์„ฑ ๋ณด์•ˆ๊ณผ ์ •ํ™•์„ฑ ๋™์‹œ ํ‰๊ฐ€ ์ž๋™ ๋ฒค์น˜๋งˆํฌ ํ”„๋ ˆ์ž„์›Œํฌ

๋“€์–ผ๊ฒŒ์ด์ง€ LLM ๊ธฐ๋ฐ˜ ์ฝ”๋“œ ์ƒ์„ฑ ๋ณด์•ˆ๊ณผ ์ •ํ™•์„ฑ ๋™์‹œ ํ‰๊ฐ€ ์ž๋™ ๋ฒค์น˜๋งˆํฌ ํ”„๋ ˆ์ž„์›Œํฌ

Large language models (LLMs) and autonomous coding agents are increasingly used to generate software across a wide range of domains. Yet a core requirement remains unmet: ensuring that generated code is secure without compromising its functional correctness. Existing benchmarks and evaluations for s

๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ์˜คํ† ์ธ์ฝ”๋”์˜ ๋ฆฌํ”„์‹œ์ธ  ํŠน์„ฑ ๋ถ„์„๊ณผ ์ฃผ์˜ ๊ธฐ๋ฐ˜ ์œตํ•ฉ ์•ˆ์ •ํ™” ๊ธฐ๋ฒ•

๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ์˜คํ† ์ธ์ฝ”๋”์˜ ๋ฆฌํ”„์‹œ์ธ  ํŠน์„ฑ ๋ถ„์„๊ณผ ์ฃผ์˜ ๊ธฐ๋ฐ˜ ์œตํ•ฉ ์•ˆ์ •ํ™” ๊ธฐ๋ฒ•

In recent years, the development of multimodal autoencoders has gained significant attention due to their potential to handle multimodal complex data types and improve model performance. Understanding the stability and robustness of these models is crucial for optimizing their training, architecture

๋ชจ๋‚˜๋”• ์ปจํ…์ŠคํŠธ ์—”์ง€๋‹ˆ์–ด๋ง: ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ ์—์ด์ „ํŠธ ์„ค๊ณ„์˜ ์ƒˆ๋กœ์šด ํŒจ๋Ÿฌ๋‹ค์ž„

๋ชจ๋‚˜๋”• ์ปจํ…์ŠคํŠธ ์—”์ง€๋‹ˆ์–ด๋ง: ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ ์—์ด์ „ํŠธ ์„ค๊ณ„์˜ ์ƒˆ๋กœ์šด ํŒจ๋Ÿฌ๋‹ค์ž„

The proliferation of Large Language Models (LLMs) has catalyzed a shift towards autonomous agents capable of complex reasoning and tool use. However, current agent architectures are frequently constructed using imperative, ad hoc patterns. This results in brittle systems plagued by difficulties in s

๋ฌด์„ ์ฃผํŒŒ์ˆ˜ ๋ผ๋””์–ธ์Šคํ•„๋“œ ๊ธฐ๋ฐ˜ ์‚ฌ์ „ํ•™์Šต์œผ๋กœ ์‹ค๋‚ด ์œ„์น˜์ถ”์ • ์ผ๋ฐ˜ํ™” ํ˜์‹ 

๋ฌด์„ ์ฃผํŒŒ์ˆ˜ ๋ผ๋””์–ธ์Šคํ•„๋“œ ๊ธฐ๋ฐ˜ ์‚ฌ์ „ํ•™์Šต์œผ๋กœ ์‹ค๋‚ด ์œ„์น˜์ถ”์ • ์ผ๋ฐ˜ํ™” ํ˜์‹ 

Radio frequency (RF)-based indoor localization offers significant promise for applications such as indoor navigation, augmented reality, and pervasive computing. While deep learning has greatly enhanced localization accuracy and robustness, existing localization models still face major challenges in

๋ฌผ๋ฆฌํ•™์—์„œ ๊ฒฐ์ •๋ก ๊ณผ ๋น„๊ฒฐ์ •๋ก ์˜ ํ‘œ์ƒ์  ๋Œ€๋ฆฝ๊ณผ ๋ชจ๋ธ ๋ถˆ๋ณ€์„ฑ ๊ธฐ๋ฐ˜ ๊ตฌ์กฐ ์‹ค์žฌ๋ก 

๋ฌผ๋ฆฌํ•™์—์„œ ๊ฒฐ์ •๋ก ๊ณผ ๋น„๊ฒฐ์ •๋ก ์˜ ํ‘œ์ƒ์  ๋Œ€๋ฆฝ๊ณผ ๋ชจ๋ธ ๋ถˆ๋ณ€์„ฑ ๊ธฐ๋ฐ˜ ๊ตฌ์กฐ ์‹ค์žฌ๋ก 

This paper argues that the traditional opposition between determinism and indeterminism in physics is representational rather than ontological. Deterministic-stochastic dualities are available in principle, and arise in a non-contrived way in many scientifically important models. When dynamical syst

๋ฒ•๋ฅ  ๋ถ„์•ผ LLM ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋ฌธ์„œ ๊ตฌ์กฐ ์žฌ๋ฐฐ์น˜์™€ ์—ญํ•  ๊ธฐ๋ฐ˜ ํ”„๋กฌํ”„ํŠธ ์—ฐ๊ตฌ

๋ฒ•๋ฅ  ๋ถ„์•ผ LLM ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋ฌธ์„œ ๊ตฌ์กฐ ์žฌ๋ฐฐ์น˜์™€ ์—ญํ•  ๊ธฐ๋ฐ˜ ํ”„๋กฌํ”„ํŠธ ์—ฐ๊ตฌ

Large Language Models (LLMs), trained on extensive datasets from the web, exhibit remarkable general reasoning skills. Despite this, they often struggle in specialized areas like law, mainly because they lack domain-specific pretraining. The legal field presents unique challenges, as legal documents

๋ณ€ํ˜• ๋ง๋ฒ  ๊ธฐ๋ฐ˜ ๊ธ€๋กœ๋ฒŒ ์ปจํ…์ŠคํŠธ ํ•™์Šต์„ ํ†ตํ•œ 3D ์† ์ž์„ธ ์ถ”์ •

๋ณ€ํ˜• ๋ง๋ฒ  ๊ธฐ๋ฐ˜ ๊ธ€๋กœ๋ฒŒ ์ปจํ…์ŠคํŠธ ํ•™์Šต์„ ํ†ตํ•œ 3D ์† ์ž์„ธ ์ถ”์ •

Modeling daily hand interactions often struggles with severe occlusions, such as when two hands overlap, which highlights the need for robust feature learning in 3D hand pose estimation (HPE). To handle such occluded hand images, it is vital to effectively learn the relationship between local image

๋ณ€ํ˜• ํŠธ๋žœ์Šคํฌ๋จธ ์ •์ฑ…์„ ์œ„ํ•œ ์ผ๋ฐ˜ํ™” ์ •์ฑ… ๊ทธ๋ž˜๋””์–ธํŠธ ์ •๋ฆฌ

๋ณ€ํ˜• ํŠธ๋žœ์Šคํฌ๋จธ ์ •์ฑ…์„ ์œ„ํ•œ ์ผ๋ฐ˜ํ™” ์ •์ฑ… ๊ทธ๋ž˜๋””์–ธํŠธ ์ •๋ฆฌ

We present the Generalized Policy Gradient (GPG) Theorem, specifically designed for Transformer-based policies. Notably, we demonstrate that both the standard Policy Gradient Theorem and GRPO emerge as special cases within our GPG framework. Furthermore, we explore its practical applications in training

๋ณ‘๋ ฌ ํ† ํฐ ์ƒ์„ฑ ์œ„ํ•œ ๊ฐ•ํ™”ํ•™์Šต ๊ธฐ๋ฐ˜ ๋งˆ์Šคํฌ ํ™•์‚ฐ ์–ธ์–ด ๋ชจ๋ธ ๊ฐ€์†๊ธฐ dUltra

๋ณ‘๋ ฌ ํ† ํฐ ์ƒ์„ฑ ์œ„ํ•œ ๊ฐ•ํ™”ํ•™์Šต ๊ธฐ๋ฐ˜ ๋งˆ์Šคํฌ ํ™•์‚ฐ ์–ธ์–ด ๋ชจ๋ธ ๊ฐ€์†๊ธฐ dUltra

Masked diffusion language models (MDLMs) offer the potential for parallel token generation, but most open-source MDLMs decode fewer than 5 tokens per model forward pass even with sophisticated sampling strategies. As a result, their sampling speeds are often comparable to AR + speculative decoding s

๋น„์ •์ƒ ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์˜ˆ์ธก ๊ธฐ๋ฐ˜ ์˜คํ”„๋ผ์ธ ๊ฐ•ํ™”ํ•™์Šต ํ”„๋ ˆ์ž„์›Œํฌ

๋น„์ •์ƒ ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์˜ˆ์ธก ๊ธฐ๋ฐ˜ ์˜คํ”„๋ผ์ธ ๊ฐ•ํ™”ํ•™์Šต ํ”„๋ ˆ์ž„์›Œํฌ

Offline Reinforcement Learning (RL) provides a promising avenue for training policies from pre-collected datasets when gathering additional interaction data is infeasible. However, existing offline RL methods often assume stationarity or only consider synthetic perturbations at test time, assumption

์ƒ์„ฑํ˜• AI๊ฐ€ ๊ธˆ์œต ์• ๋„๋ฆฌ์ŠคํŠธ ๋ณด๊ณ ์„œ์— ๋ฏธ์น˜๋Š” ์ƒ์‚ฐ์„ฑยท์ •ํ™•๋„ ์–‘๋ฉด ํšจ๊ณผ

์ƒ์„ฑํ˜• AI๊ฐ€ ๊ธˆ์œต ์• ๋„๋ฆฌ์ŠคํŠธ ๋ณด๊ณ ์„œ์— ๋ฏธ์น˜๋Š” ์ƒ์‚ฐ์„ฑยท์ •ํ™•๋„ ์–‘๋ฉด ํšจ๊ณผ

We study how generative artificial intelligence (AI) transforms the work of financial analysts. Using the 2023 launch of FactSet's AI platform as a natural experiment, we find that adoption produces markedly richer and more comprehensive reports, featuring 40% more distinct information sources, 34% b

์ƒ์„ฑํ˜• ๊ฒ€์ƒ‰์—์„œ ๊ณต์ •ํ•œ ๊ธฐ์—ฌ๋„ ํ‰๊ฐ€๋ฅผ ์œ„ํ•œ MAXSHAPLEY ์•Œ๊ณ ๋ฆฌ์ฆ˜

์ƒ์„ฑํ˜• ๊ฒ€์ƒ‰์—์„œ ๊ณต์ •ํ•œ ๊ธฐ์—ฌ๋„ ํ‰๊ฐ€๋ฅผ ์œ„ํ•œ MAXSHAPLEY ์•Œ๊ณ ๋ฆฌ์ฆ˜

Generative search engines based on large language models (LLMs) are replacing traditional search, fundamentally changing how information providers are compensated. To sustain this ecosystem, we need fair mechanisms to attribute and compensate content providers based on their contributions to generat
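The abstract does not specify MAXSHAPLEY's internals, but the attribution idea it builds on is the standard Shapley value: each provider's payout is its average marginal contribution across all coalitions of providers. A minimal, exact-computation sketch (the `value_fn` utility table below is entirely hypothetical):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value_fn):
    """Exact Shapley values for a small set of content providers.

    value_fn maps a frozenset of providers to the utility the generated
    answer derives from that subset (a made-up stand-in here; the paper's
    actual value function is not given in the abstract).
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(len(others) + 1):
            for coalition in combinations(others, r):
                s = frozenset(coalition)
                # Weight = |S|! (n - |S| - 1)! / n! for each coalition S not containing p
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                phi[p] += weight * (value_fn(s | {p}) - value_fn(s))
    return phi

# Toy utility table with diminishing returns across three sources A, B, C
quality = {frozenset(): 0.0, frozenset("A"): 0.6, frozenset("B"): 0.5,
           frozenset("C"): 0.2, frozenset("AB"): 0.8, frozenset("AC"): 0.7,
           frozenset("BC"): 0.6, frozenset("ABC"): 0.9}
phi = shapley_values(["A", "B", "C"], lambda s: quality[frozenset(s)])
```

By the efficiency axiom, the payouts sum to the grand coalition's utility (0.9 here), which is what makes the scheme a complete, fair split of the answer's value.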

์Šค๋งˆํŠธ ํ™ˆ ๊ธฐ๋ฐ˜ ์š”๋กœ๊ฐ์—ผ ์กฐ๊ธฐ ํƒ์ง€๋ฅผ ์œ„ํ•œ ๋ถˆํ™•์‹ค์„ฑ ์ธ์‹ ์ž„์ƒ ์ง€์› ์‹œ์Šคํ…œ

์Šค๋งˆํŠธ ํ™ˆ ๊ธฐ๋ฐ˜ ์š”๋กœ๊ฐ์—ผ ์กฐ๊ธฐ ํƒ์ง€๋ฅผ ์œ„ํ•œ ๋ถˆํ™•์‹ค์„ฑ ์ธ์‹ ์ž„์ƒ ์ง€์› ์‹œ์Šคํ…œ

Urinary tract infection (UTI) flare-ups pose a significant health risk for older adults with chronic conditions. These infections often go unnoticed until they become severe, making early detection through innovative smart home technologies crucial. Traditional machine learning (ML) approaches relyi

์Šค์ผ€์ผ๋ง ์ทจ์•ฝ์ ์„ ์ด์šฉํ•œ ์ ์‘ํ˜• ์‹œ๊ฐโ€‘์–ธ์–ด ๋ชจ๋ธ ๊ณต๊ฒฉ ํ”„๋ ˆ์ž„์›Œํฌ

์Šค์ผ€์ผ๋ง ์ทจ์•ฝ์ ์„ ์ด์šฉํ•œ ์ ์‘ํ˜• ์‹œ๊ฐโ€‘์–ธ์–ด ๋ชจ๋ธ ๊ณต๊ฒฉ ํ”„๋ ˆ์ž„์›Œํฌ

Multimodal Artificial Intelligence (AI) systems, particularly Vision-Language Models (VLMs), have become integral to critical applications ranging from autonomous decision-making to automated document processing. As these systems scale, they rely heavily on preprocessing pipelines to handle diverse i

์‹œ๊ฐโ€‘์–ธ์–ด ๋ชจ๋ธ ํ…์ŠคํŠธ ๊ด€์„ฑ ํ•ด์†Œ๋ฅผ ์œ„ํ•œ ์˜์‹์  ์‹œ์„  ์ œ์–ด

์‹œ๊ฐโ€‘์–ธ์–ด ๋ชจ๋ธ ํ…์ŠคํŠธ ๊ด€์„ฑ ํ•ด์†Œ๋ฅผ ์œ„ํ•œ ์˜์‹์  ์‹œ์„  ์ œ์–ด

Large Vision-Language Models (VLMs) often exhibit text inertia, where attention drifts from visual evidence toward linguistic priors, resulting in object hallucinations. Existing decoding strategies intervene only at the output logits and thus cannot correct internal reasoning drift, while recent in

์‹œ๊ฐ„ ์‹œ๊ณ„์—ด ๊ธฐ๋ฐ˜ ๋ชจ๋ธ ํˆดํ‚ท์œผ๋กœ ํ˜์‹ ์ ์ธ ํŒŒ์ดํ”„๋ผ์ธ ๊ตฌ์ถ•

์‹œ๊ฐ„ ์‹œ๊ณ„์—ด ๊ธฐ๋ฐ˜ ๋ชจ๋ธ ํˆดํ‚ท์œผ๋กœ ํ˜์‹ ์ ์ธ ํŒŒ์ดํ”„๋ผ์ธ ๊ตฌ์ถ•

Foundation models (FMs) have opened new avenues for machine learning applications due to their ability to adapt to new and unseen tasks with minimal or no further training. Time-series foundation models (TSFMs)-FMs trained on time-series data-have shown strong performance on classification, regressi

์‹œ์  ๋ณ€ํ™”์™€ ์›€์ง์ด๋Š” ์Œ์›์— ๋Œ€์‘ํ•˜๋Š” ๊ณ ํ’ˆ์งˆ ๋ฐ”์ด๋…ธ๋Ÿด ์˜ค๋””์˜ค ViSAudio

์‹œ์  ๋ณ€ํ™”์™€ ์›€์ง์ด๋Š” ์Œ์›์— ๋Œ€์‘ํ•˜๋Š” ๊ณ ํ’ˆ์งˆ ๋ฐ”์ด๋…ธ๋Ÿด ์˜ค๋””์˜ค ViSAudio

Comprehensive experiments demonstrate that ViSAudio outperforms existing state-of-the-art methods across both objective metrics and subjective evaluations, generating high-quality binaural audio with spatial immersion that adapts effectively to viewpoint changes, sound-source motion, and diverse aco

์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆฌ๋ฐ์„ ์œ„ํ•œ 4D ๊ฐ€์šฐ์‹œ์•ˆ ์Šคํ”Œ๋ž˜ํŒ… ์ตœ์ ํ™” ํ”„๋ ˆ์ž„์›Œํฌ AirGS

์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆฌ๋ฐ์„ ์œ„ํ•œ 4D ๊ฐ€์šฐ์‹œ์•ˆ ์Šคํ”Œ๋ž˜ํŒ… ์ตœ์ ํ™” ํ”„๋ ˆ์ž„์›Œํฌ AirGS

Free-viewpoint video (FVV) enables immersive viewing experiences by allowing users to view scenes from arbitrary perspectives. As a prominent reconstruction technique for FVV generation, 4D Gaussian Splatting (4DGS) models dynamic scenes with time-varying 3D Gaussian ellipsoids and achieves high-qua

์‹ค์ œ ์„ธ๊ณ„์™€ ๊ฐ™์€ ๋ณตํ•ฉ ํ™˜๊ฒฝ์„ ์œ„ํ•œ LLM VLM ์—์ด์ „ํŠธ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ SimWorld

์‹ค์ œ ์„ธ๊ณ„์™€ ๊ฐ™์€ ๋ณตํ•ฉ ํ™˜๊ฒฝ์„ ์œ„ํ•œ LLM VLM ์—์ด์ „ํŠธ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ SimWorld

While LLM/VLM-powered AI agents have advanced rapidly in math, coding, and computer use, their applications in complex physical and social environments remain challenging. Building agents that can survive and thrive in the real world (e.g., by autonomously earning income or running a business) requi

์‹ฌ๋ณผ๋ฆญ ๋“œ๋ผ์ด๋ธŒ ๋กœ์ปฌ ํผ์ŠคํŠธ ์ž์œจ์ฃผํ–‰ ๋ฐ์ดํ„ฐ ๋งˆ์ด๋‹ ํ”„๋ ˆ์ž„์›Œํฌ

์‹ฌ๋ณผ๋ฆญ ๋“œ๋ผ์ด๋ธŒ ๋กœ์ปฌ ํผ์ŠคํŠธ ์ž์œจ์ฃผํ–‰ ๋ฐ์ดํ„ฐ ๋งˆ์ด๋‹ ํ”„๋ ˆ์ž„์›Œํฌ

The development of robust Autonomous Vehicles (AVs) is bottlenecked by the scarcity of 'Long-Tail' training data. While fleets collect petabytes of video logs, identifying rare safety-critical events (e.g., erratic jaywalking, construction diversions) remains a manual, cost-prohibitive process. Exis

์—”ํŠธ๋กœํ”ผ ์‹ ํ˜ธ ๊ธฐ๋ฐ˜ ํšจ์œจ์  ๊ฐ•ํ™”ํ•™์Šต์œผ๋กœ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ ์ถ”๋ก  ํ–ฅ์ƒ

์—”ํŠธ๋กœํ”ผ ์‹ ํ˜ธ ๊ธฐ๋ฐ˜ ํšจ์œจ์  ๊ฐ•ํ™”ํ•™์Šต์œผ๋กœ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ ์ถ”๋ก  ํ–ฅ์ƒ

Reinforcement learning with verifiable rewards (RLVR) has demonstrated superior performance in enhancing the reasoning capability of large language models (LLMs). However, this accuracy-oriented learning paradigm often suffers from entropy collapse, which reduces policy exploration and limits reason

์˜ˆ์‚ฐ ์ œ์•ฝ ํ•˜ ๋น„์šฉ ํšจ์œจ์ ์ธ ๋‹ค์ค‘ ์—์ด์ „ํŠธ ์‹œ์Šคํ…œ ์„ค๊ณ„์™€ AgentBalance ํ”„๋ ˆ์ž„์›Œํฌ

์˜ˆ์‚ฐ ์ œ์•ฝ ํ•˜ ๋น„์šฉ ํšจ์œจ์ ์ธ ๋‹ค์ค‘ ์—์ด์ „ํŠธ ์‹œ์Šคํ…œ ์„ค๊ณ„์™€ AgentBalance ํ”„๋ ˆ์ž„์›Œํฌ

Large Language Model (LLM)-based multi-agent systems (MAS) have become indispensable building blocks for web-scale applications (e.g., web search, social network analytics, online customer support), with cost-effectiveness becoming the primary constraint on large-scale deployment. While recent advan

์›น์‰˜ ํŒจ๋ฐ€๋ฆฌ ์ž๋™ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•œ ๋™์  ํ˜ธ์ถœ ์ถ”์ ๊ณผ ๊ทธ๋ž˜ํ”„ ๊ธฐ๋ฐ˜ ํ‘œํ˜„ ์—ฐ๊ตฌ

์›น์‰˜ ํŒจ๋ฐ€๋ฆฌ ์ž๋™ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•œ ๋™์  ํ˜ธ์ถœ ์ถ”์ ๊ณผ ๊ทธ๋ž˜ํ”„ ๊ธฐ๋ฐ˜ ํ‘œํ˜„ ์—ฐ๊ตฌ

Malicious WebShells pose a significant and evolving threat by compromising critical digital infrastructures and endangering public services in sectors such as healthcare and finance. While the research community has made significant progress in WebShell detection (i.e., distinguishing malicious samp

์œ„ํ‚ค๋ฐฑ๊ณผ ๋Œ“๊ธ€ ๋ฌด๋ก€์„ฑ ํƒ์ง€๋ฅผ ์œ„ํ•œ ๊ทธ๋ž˜ํ”„ ์‹ ๊ฒฝ๋ง ๊ธฐ๋ฐ˜ ๊ตฌ์กฐ์  ๋ถ„์„

์œ„ํ‚ค๋ฐฑ๊ณผ ๋Œ“๊ธ€ ๋ฌด๋ก€์„ฑ ํƒ์ง€๋ฅผ ์œ„ํ•œ ๊ทธ๋ž˜ํ”„ ์‹ ๊ฒฝ๋ง ๊ธฐ๋ฐ˜ ๊ตฌ์กฐ์  ๋ถ„์„

Online incivility has emerged as a widespread and persistent problem in digital communities, imposing substantial social and psychological burdens on users. Although many platforms attempt to curb incivility through moderation and automated detection, the performance of existing approaches often rem

์˜๋ฏธ์ธ์‹ ๊ธฐ๋ฐ˜ ์˜๋ฃŒ ์˜์ƒ ๋ณต์›๊ณผ ๋ธ”๋ก์ฒด์ธ ์ถ”์  ํ†ตํ•ฉ ์‹œ์Šคํ…œ

์˜๋ฏธ์ธ์‹ ๊ธฐ๋ฐ˜ ์˜๋ฃŒ ์˜์ƒ ๋ณต์›๊ณผ ๋ธ”๋ก์ฒด์ธ ์ถ”์  ํ†ตํ•ฉ ์‹œ์Šคํ…œ

Medical imaging is essential for clinical diagnosis, yet real-world data frequently suffers from corruption, noise, and potential tampering, challenging the reliability of AI-assisted interpretation. Conventional reconstruction techniques prioritize pixel-level recovery and may produce visually plaus

์ด๋”๋ฆฌ์›€ ๊ฑฐ๋ž˜ ๊ฒฝ์ œ์  ์˜๋„ ํŒŒ์•…์„ ์œ„ํ•œ TxSum ๋ฐ์ดํ„ฐ์…‹๊ณผ MATEX ๋ฉ€ํ‹ฐ์—์ด์ „ํŠธ ์‹œ์Šคํ…œ

์ด๋”๋ฆฌ์›€ ๊ฑฐ๋ž˜ ๊ฒฝ์ œ์  ์˜๋„ ํŒŒ์•…์„ ์œ„ํ•œ TxSum ๋ฐ์ดํ„ฐ์…‹๊ณผ MATEX ๋ฉ€ํ‹ฐ์—์ด์ „ํŠธ ์‹œ์Šคํ…œ

Understanding the economic intent of Ethereum transactions is critical for user safety, yet current tools expose only raw on-chain data, leading to widespread 'blind signing' (approving transactions without understanding them). Through interviews with 16 Web3 users, we find that effective explanatio

์ด์ค‘ ์ถ”๋ก  ํ•™์Šต: ๊ธ์ •โ€‘๋ถ€์ • ๋…ผ๋ฆฌ๋ฅผ ๊ฒฐํ•ฉํ•œ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ์˜ ๊ณผํ•™์  ์ถ”๋ก  ๊ฐ•ํ™”

์ด์ค‘ ์ถ”๋ก  ํ•™์Šต: ๊ธ์ •โ€‘๋ถ€์ • ๋…ผ๋ฆฌ๋ฅผ ๊ฒฐํ•ฉํ•œ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ์˜ ๊ณผํ•™์  ์ถ”๋ก  ๊ฐ•ํ™”

Large Language Models (LLMs) have transformed natural language processing and hold growing promise for advancing science, healthcare, and decision-making. Yet their training paradigms remain dominated by affirmation-based inference, akin to modus ponens, where accepted premises yield predicted conse

์ž๋™ํ™”๋œ MDP ๋ชจ๋ธ๋ง๊ณผ ์ •์ฑ… ์ƒ์„ฑ์„ ์œ„ํ•œ ์—์ด์ „ํŠธํ˜• LLM ํ”„๋ ˆ์ž„์›Œํฌ Aโ€‘LAMP

์ž๋™ํ™”๋œ MDP ๋ชจ๋ธ๋ง๊ณผ ์ •์ฑ… ์ƒ์„ฑ์„ ์œ„ํ•œ ์—์ด์ „ํŠธํ˜• LLM ํ”„๋ ˆ์ž„์›Œํฌ Aโ€‘LAMP

Applying reinforcement learning (RL) to real-world tasks requires converting informal descriptions into a formal Markov decision process (MDP), implementing an executable environment, and training a policy agent. Automating this process is challenging due to modeling errors, fragile code, and misali

์ €์กฐ๋„ ๊ตํ†ต ์˜์ƒ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋ฌด์ง€๋„ ํ•™์Šต ๋‹ค๋‹จ๊ณ„ ํ”„๋ ˆ์ž„์›Œํฌ

์ €์กฐ๋„ ๊ตํ†ต ์˜์ƒ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋ฌด์ง€๋„ ํ•™์Šต ๋‹ค๋‹จ๊ณ„ ํ”„๋ ˆ์ž„์›Œํฌ

Enhancing low-light traffic imagery is a critical requirement for achieving reliable perception in autonomous driving, intelligent transportation, and urban surveillance systems. Traffic scenes captured under nighttime or dimly lit conditions often suffer from complex visual degradations arising fro

์ฃผ๊ฐ€ ์˜ˆ์ธก์—์„œ KAN๊ณผ LSTM ์„ฑ๋Šฅ ๋น„๊ต ์ •ํ™•๋„์™€ ํ•ด์„ ๊ฐ€๋Šฅ์„ฑ์˜ ๊ท ํ˜•

์ฃผ๊ฐ€ ์˜ˆ์ธก์—์„œ KAN๊ณผ LSTM ์„ฑ๋Šฅ ๋น„๊ต ์ •ํ™•๋„์™€ ํ•ด์„ ๊ฐ€๋Šฅ์„ฑ์˜ ๊ท ํ˜•

This paper compares Kolmogorov-Arnold Networks (KAN) and Long Short-Term Memory networks (LSTM) for forecasting non-deterministic stock price data, evaluating predictive accuracy versus interpretability trade-offs using Root Mean Square Error (RMSE). LSTM demonstrates substantial superiority across
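The comparison metric named in the abstract, Root Mean Square Error (RMSE), is standard and easy to make concrete. A minimal sketch with entirely made-up forecast numbers (the paper's models and data are not reproduced here):

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error: sqrt of the mean squared forecast error."""
    assert len(y_true) == len(y_pred)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy illustration: one-step-ahead price forecasts from two hypothetical models
actual  = [101.0, 102.5, 101.8, 103.2]
model_a = [100.5, 102.0, 102.0, 103.0]   # small errors -> low RMSE
model_b = [ 99.0, 104.0, 100.0, 105.0]   # large errors -> high RMSE
assert rmse(actual, model_a) < rmse(actual, model_b)
```

A lower RMSE on held-out data is the sense in which one forecaster "outperforms" another in this kind of study; interpretability is the separate axis the paper trades off against it.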

ํ๋ธŒ๋ฒค์น˜ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ์˜ ๊ณต๊ฐ„ยท์ˆœ์ฐจ ์ถ”๋ก  ํ‰๊ฐ€

ํ๋ธŒ๋ฒค์น˜ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ์˜ ๊ณต๊ฐ„ยท์ˆœ์ฐจ ์ถ”๋ก  ํ‰๊ฐ€

We introduce Cube Bench, a Rubik's-cube benchmark for evaluating spatial and sequential reasoning in multimodal large language models (MLLMs). The benchmark decomposes performance into five skills: (i) reconstructing cube faces from images and text, (ii) choosing the optimal next move, (iii) predict

ํด๋ฆฐ๋…ธํŠธ์—์ด์ „ํŠธ ๋Œ€ํ˜•์–ธ์–ด๋ชจ๋ธ ๊ธฐ๋ฐ˜ ๋‹ค์ค‘โ€‘์—์ด์ „ํŠธ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ํ™œ์šฉํ•œ ์‹ฌ๋ถ€์ „ 30์ผ ์žฌ์ž…์› ์œ„ํ—˜ ์˜ˆ์ธก

ํด๋ฆฐ๋…ธํŠธ์—์ด์ „ํŠธ ๋Œ€ํ˜•์–ธ์–ด๋ชจ๋ธ ๊ธฐ๋ฐ˜ ๋‹ค์ค‘โ€‘์—์ด์ „ํŠธ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ํ™œ์šฉํ•œ ์‹ฌ๋ถ€์ „ 30์ผ ์žฌ์ž…์› ์œ„ํ—˜ ์˜ˆ์ธก

Heart failure (HF) is one of the leading causes of rehospitalization among older adults in the United States. Although clinical notes contain rich, detailed patient information and make up a large portion of electronic health records (EHRs), they remain underutilized for HF readmission risk analysis

ํ…์ŠคํŠธ ๊ธฐ๋ฐ˜ ์ด๋ฏธ์ง€ ํŽธ์ง‘ ํ‰๊ฐ€๋ฅผ ์œ„ํ•œ ์ข…ํ•ฉ ๋ฒค์น˜๋งˆํฌ์™€ ์ธ๊ฐ„ ์ง€๊ฐ์— ๋งž์ถ˜ ๋ฉ”ํŠธ๋ฆญ

ํ…์ŠคํŠธ ๊ธฐ๋ฐ˜ ์ด๋ฏธ์ง€ ํŽธ์ง‘ ํ‰๊ฐ€๋ฅผ ์œ„ํ•œ ์ข…ํ•ฉ ๋ฒค์น˜๋งˆํฌ์™€ ์ธ๊ฐ„ ์ง€๊ฐ์— ๋งž์ถ˜ ๋ฉ”ํŠธ๋ฆญ

Recent advances in text-driven image editing have been significant, yet the task of accurately evaluating these edited images continues to pose a considerable challenge. Different from the assessment of text-driven image generation, text-driven image editing is characterized by simultaneously condit

ํ”„๋ฆฌ์ฆ˜ ์›”๋“œ ๋ชจ๋ธ: ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ๋กœ๋ด‡ ๋™์—ญํ•™์„ ์œ„ํ•œ ๋ชจ๋“œ ๋ถ„๋ฆฌ ์ „๋ฌธ๊ฐ€ ํ˜ผํ•ฉ

ํ”„๋ฆฌ์ฆ˜ ์›”๋“œ ๋ชจ๋ธ: ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ๋กœ๋ด‡ ๋™์—ญํ•™์„ ์œ„ํ•œ ๋ชจ๋“œ ๋ถ„๋ฆฌ ์ „๋ฌธ๊ฐ€ ํ˜ผํ•ฉ

Model-based planning in robotic domains is fundamentally challenged by the hybrid nature of physical dynamics, where continuous motion is punctuated by discrete events such as contacts and impacts. Conventional latent world models typically employ monolithic neural networks that enforce global conti

ํ”Œ๋ผ์Šคํ‹ฑ์„ฑ ํšŒ๋ณต์„ ์œ„ํ•œ ํŠธ์œˆ ๋„คํŠธ์›Œํฌ ๊ธฐ๋ฐ˜ ๋ฆฌ์…‹ ๊ธฐ๋ฒ• AltNet

ํ”Œ๋ผ์Šคํ‹ฑ์„ฑ ํšŒ๋ณต์„ ์œ„ํ•œ ํŠธ์œˆ ๋„คํŠธ์›Œํฌ ๊ธฐ๋ฐ˜ ๋ฆฌ์…‹ ๊ธฐ๋ฒ• AltNet

Neural networks have shown remarkable success in supervised learning when trained on a single task using a fixed dataset. However, when neural networks are trained on a reinforcement learning task, their ability to continue learning from new experiences declines over time. This decline in learning a

ํ”ฝ์…€ ๋™๋“ฑ ์ž ์žฌ ํ•ฉ์„ฑ์œผ๋กœ ๊ตฌํ˜„ํ•˜๋Š” ๊ณ ํ’ˆ์งˆ ์ด๋ฏธ์ง€ ์ธํŽ˜์ธํŒ…

ํ”ฝ์…€ ๋™๋“ฑ ์ž ์žฌ ํ•ฉ์„ฑ์œผ๋กœ ๊ตฌํ˜„ํ•˜๋Š” ๊ณ ํ’ˆ์งˆ ์ด๋ฏธ์ง€ ์ธํŽ˜์ธํŒ…

Latent inpainting in diffusion models still relies almost universally on linearly interpolating VAE latents under a downsampled mask. We propose a key principle for compositing image latents: Pixel-Equivalent Latent Compositing (PELC). An equivalent latent compositor should be the same as compositin

< Category Statistics (Total: 5003) >

General Relativity: 59
General Research: 698
HEP-EX: 14
HEP-LAT: 8
HEP-PH: 63
HEP-TH: 68
MATH-PH: 82
NUCL-EX: 5
NUCL-TH: 15
Quantum Physics: 57
