Best AI papers explained

Podcast by Enoch H. Kang

550 Episodes

Past-Token Prediction for Long-Context Robot Policies
Published: May 20, 2025
Recovering Coherent Event Probabilities from LLM Embeddings
Published: May 20, 2025
Systematic Meta-Abilities Alignment in Large Reasoning Models
Published: May 20, 2025
Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers
Published: May 20, 2025
Efficient Exploration for LLMs
Published: May 19, 2025
Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation
Published: May 18, 2025
Bayesian Concept Bottlenecks with LLM Priors
Published: May 17, 2025
Transformers for In-Context Reinforcement Learning
Published: May 17, 2025
Evaluating Large Language Models Across the Lifecycle
Published: May 17, 2025
Active Ranking from Human Feedback with DopeWolfe
Published: May 16, 2025
Optimal Designs for Preference Elicitation
Published: May 16, 2025
Dual Active Learning for Reinforcement Learning from Human Feedback
Published: May 16, 2025
Active Learning for Direct Preference Optimization
Published: May 16, 2025
Active Preference Optimization for RLHF
Published: May 16, 2025
Test-Time Alignment of Diffusion Models without reward over-optimization
Published: May 16, 2025
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
Published: May 16, 2025
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Published: May 16, 2025
Advantage-Weighted Regression: Simple and Scalable Off-Policy RL
Published: May 16, 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Published: May 16, 2025
Transformers can be used for in-context linear regression in the presence of endogeneity
Published: May 15, 2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
