Yannic Kilcher - maiweb

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2026-03-06 22:07

All information about GTC and the DGX Spark Raffle is here: https://www.ykilcher.com/gtc Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord:...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-12-29 03:44

Letsgooo

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-12-28 12:41

https://ykilcher.com/discord Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute:...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-12-27 14:33

Paper: https://arxiv.org/abs/2511.08923 Abstract: Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-12-14 16:22

Paper: https://arxiv.org/abs/2501.00663 Abstract: Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state),...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-11-01 17:39

https://arxiv.org/abs/2510.17558 Abstract: We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-10-19 10:58

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo Cloudflare article: https://blog.cloudflare.com/code-mode/ Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-10-11 16:07

Paper: https://arxiv.org/abs/2508.21038 Abstract: Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-08-09 10:39

Jack Morris's investigation into GPT-OSS training data: https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gouTSACoqA&s=09

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-07-23 11:10

Paper: https://research.trychroma.com/context-rot Abstract: Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assumption does not hold....

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-07-19 15:19

Paper: https://arxiv.org/abs/2507.02092 Code: https://github.com/alexiglad/EBT Website: https://energy-based-transformers.github.io/ Abstract: Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-05-03 16:16

An in-depth look at Anthropic's Transformer Circuit Blog Post Part 1 here: https://youtu.be/mU3g2YPKlsA Discord here: https://ykilcher.com/discord https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract: We investigate the internal mechanisms used by...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-04-05 16:17

An in-depth look at Anthropic's Transformer Circuit Blog Post https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract: We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts,...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2025-01-26 14:03

#deepseek #llm #grpo GRPO is one of the core advancements used in DeepSeek-R1, but it was already introduced last year in this paper, which uses a combination of new RL techniques and iterative data collection to achieve remarkable performance on mathematics benchmarks with just a...

Yannic Kilcher youtube.com ai-research artificial-intelligence-and-machine-learning channel video youtube 2024-12-27 00:48

https://ykilcher.com/discord Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute:...
