Machine Learning

A Generalized UCB Bandit Algorithm for ML-Based Estimators

We present ML-UCB, a generalized upper confidence bound algorithm that integrates arbitrary machine learning models into multi-armed bandit frameworks. A fundamental challenge in deploying sophisticated ML models for sequential decision-making is the lack of tractable concentration inequalities required for principled exploration. We overcome this limitation by directly modeling the learning curve behavior of the underlying estimator. Specifically, assuming the mean squared error (MSE) decreases as a power law in the number of training samples, we derive a generalized concentration inequality and prove that ML-UCB achieves sublinear regret. This framework enables the principled integration of any ML model whose learning curve can be empirically characterized, eliminating the need for model-specific theoretical analysis. We validate our approach through experiments on a collaborative filtering recommendation system using online matrix factorization, with synthetic data designed to simulate a simplified two-tower model, and demonstrate substantial improvements over LinUCB.
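
The abstract does not state the form of the confidence bound, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes the estimator's MSE after n samples behaves like c * n^(-alpha) and uses a hypothetical exploration bonus sqrt(c * n^(-alpha) * log t). The names `power_law_bonus` and `run_bandit`, the constants `c` and `alpha`, and the use of a per-arm sample mean as a stand-in for an ML model are all illustrative choices. With alpha = 1 the bonus has the same shape as the classic UCB1 bonus.

```python
import math
import random


def power_law_bonus(n, t, c=2.0, alpha=1.0):
    """Hypothetical exploration bonus (an assumption, not from the source).

    Assumes the estimator's MSE after n samples is about c * n**(-alpha),
    and widens it by log(t) as in classic UCB-style indices.
    """
    return math.sqrt(c * n ** (-alpha) * math.log(max(t, 2)))


def run_bandit(means, horizon, c=2.0, alpha=1.0, seed=0):
    """Run a UCB-style loop on Bernoulli arms with the bonus above.

    The per-arm sample mean stands in for an arbitrary ML estimator.
    Returns (pull counts per arm, cumulative expected regret).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so each estimate is defined
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + power_law_bonus(counts[a], t, c, alpha),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return counts, regret


counts, regret = run_bandit([0.2, 0.5, 0.8], horizon=2000)
```

In this sketch a smaller `alpha` (a slower learning curve) shrinks the bonus more slowly and therefore forces more exploration of each arm. In practice `c` and `alpha` would have to be fitted to the empirical learning curve of the actual model.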

A Generalized UCB Bandit Algorithm for ML-Based Estimators

A Graph-based Framework for Online Time Series Anomaly Detection Using Model Ensemble

Accelerating Storage-Based Training for Graph Neural Networks

Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives

Attention Needs to Focus A Unified Perspective on Attention Allocation

AutoFed Manual-Free Federated Traffic Prediction via Personalized Prompt

Avatar Forcing Real-Time Interactive Head Avatar Generation for Natural Conversation

BandiK Efficient Multi-Task Decomposition Using a Multi-Bandit Framework

Benchmarking the Computational and Representational Efficiency of State Space Models against Transformers on Long-Context Dyadic Sessions

Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models

Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice

Causify DataFlow A Framework For High-performance Machine Learning Stream Computing

Complexity-based code embeddings

Conformal Prediction Under Distribution Shift A COVID-19 Natural Experiment

Coordinate Matrix Machine A Human-level Concept Learning to Classify Very Similar Documents

Data Complexity-aware Deep Model Performance Forecasting

Data-Driven Assessment of Concrete Mixture Compositions on Chloride Transport via Standalone Machine Learning Algorithms

DatBench Discriminative, Faithful, and Efficient VLM Evaluations

Deep Delta Learning

Deep Networks Learn Deep Hierarchical Models

DéjàQ Open-Ended Evolution of Diverse, Learnable and Verifiable Problems

Digital Twin-Driven Communication-Efficient Federated Anomaly Detection for Industrial IoT

Dynamic Large Concept Models Latent Reasoning in an Adaptive Semantic Space

E-GRPO High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Empower Low-Altitude Economy Reliability-Aware Dynamic Weighting for Multi-modal UAV Beam Prediction

Entropy-Adaptive Fine-Tuning Resolving Confident Conflicts to Mitigate Forgetting

Evaluating Feature Dependent Noise in Preference-based Reinforcement Learning

FedSCAM Scam-resistant SAM for Robust Federated Optimization in Heterogeneous Environments

Flow Equivariant World Models Memory for Partially Observed Dynamic Environments

Generative Classifiers Avoid Shortcut Solutions

Geometric and Dynamic Scaling in Deep Transformers

Geometric Regularization in Mixture-of-Experts The Disconnect Between Weights and Activations

Geometry of Reason Spectral Signatures of Valid Mathematical Reasoning

HFedMoE Resource-aware Heterogeneous Federated Learning with Mixture-of-Experts

HOLOGRAPH Active Causal Discovery via Sheaf-Theoretic Alignment of Large Language Model Priors

HyperCLOVA X 8B Omni

Interpretability-Guided Bi-objective Optimization Aligning Accuracy and Explainability

IRPO Scaling the Bradley-Terry Model via Reinforcement Learning

Joint Link Adaptation and Device Scheduling Approach for URLLC Industrial IoT Network A DRL-based Method with Bayesian Optimization

LearnAD Learning Interpretable Rules for Brain Networks in Alzheimer's Disease Classification

Learning from Historical Activations in Graph Neural Networks

Length-Aware Adversarial Training for Variable-Length Trajectories Digital Twins for Mall Shopper Paths

LION-DG Layer-Informed Initialization with Deep Gradient Protocols for Accelerated Neural Network Training

LOFA Online Influence Maximization under Full-Bandit Feedback using Lazy Forward Selection

Mental Game Predicting Personality-Job Fit for Software Developers Using Multi-Genre Games and Machine Learning

MODE Efficient Time Series Prediction with Mamba Enhanced by Low-Rank Neural ODEs

More Than Bits Multi-Envelope Double Binary Factorization for Extreme Quantization

MSACL Multi-Step Actor-Critic Learning with Lyapunov Certificates for Exponentially Stabilizing Control

Multimodal Functional Maximum Correlation for Emotion Recognition

Neural Chains and Discrete Dynamical Systems

Optimizing LSTM Neural Networks for Resource-Constrained Retail Sales Forecasting A Model Compression Study

Output Embedding Centering for Stable LLM Pretraining

Path Integral Solution for Dissipative Generative Dynamics

Practical Geometric and Quantum Kernel Methods for Predicting Skeletal Muscle Outcomes in chronic obstructive pulmonary disease

REE-TTT Highly Adaptive Radar Echo Extrapolation Based on Test-Time Training

Refinement Provenance Inference Detecting LLM-Refined Training Prompts from Model Behavior

Robust and Efficient Zeroth-Order LLM Fine-Tuning via Adaptive Bayesian Subspace Optimizer

Safety at One Shot Patching Fine-Tuned LLMs with A Single Instance

Scale-Adaptive Multi-task Power Flow Analysis with Local Topology Slicing and Multi-Task Graph Learning

Semi-overlapping Multi-bandit Best Arm Identification for Sequential Support Network Learning

SmartFlow Reinforcement Learning and Agentic AI for Bike-Sharing Optimisation

Sparse Threats, Focused Defense Criticality-Aware Robust Reinforcement Learning for Safe Autonomous Driving

SPoRC-VIST A Benchmark for Evaluating Generative Natural Narrative in Vision-Language Models

Stronger Approximation Guarantees for Non-Monotone γ-Weakly DR-Submodular Maximization

The Two-Stage Decision-Sampling Hypothesis Understanding the Emergence of Self-Reflection in RL-Trained LLMs

Theoretical Convergence of SMOTE-Generated Samples

Tubular Riemannian Laplace Approximations for Bayesian Neural Networks

Value-guided action planning with JEPA world models

Warp-Cortex An Asynchronous, Memory-Efficient Architecture for Million-Agent Cognitive Scaling on Consumer Hardware

Wittgenstein's Family Resemblance Clustering Algorithm

< Category Statistics (Total: 301) >
