Best AI papers explained

Podcast by Enoch H. Kang

550 Episodes

Learning dynamics of LLM finetuning
Published: 2025-10-09
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Published: 2025-10-09
OpenAI Agent Builder and n8n: Orchestrating Reasoning Versus Automating Process
Published: 2025-10-08
Training Agents Inside of Scalable World Models
Published: 2025-10-08
Small Language Models are the Future of Agentic AI
Published: 2025-10-07
Activation Steering in Generative Settings via Contrastive Causal Mediation Analysis
Published: 2025-10-06
Eliciting Secret Knowledge from Language Models
Published: 2025-10-06
Temporal difference flow
Published: 2025-10-06
Personalized reasoning: just-in-time personalization and why LLMs fail at it
Published: 2025-10-05
Prompt Curriculum Learning for Efficient LLM Post-Training
Published: 2025-10-05
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
Published: 2025-10-04
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
Published: 2025-10-04
Learning to summarize user information for personalized reinforcement learning from human feedback
Published: 2025-10-04
Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF
Published: 2025-10-03
LIMI: Less is More for Agency
Published: 2025-10-01
LoRA Without Regret
Published: 2025-10-01
Actor-Critic without Actor: Critic-Guided Denoising for RL
Published: 2025-09-29
DELTA-Code: How Does RL Unlock and Transfer New Programming Algorithms in LLMs?
Published: 2025-09-29
Linear Transformers Implicitly Discover Unified Numerical Algorithms
Published: 2025-09-29
Regularizing Extrapolation in Causal Inference
Published: 2025-09-27

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
