
Jay Alammar

active · last success 2026-06-19 22:49


  • Jay Alammar youtube.com channel machine-learning video youtube 2025-02-10 18:46
    Enroll for free now: https://bit.ly/4aRnn7Z
    Github Repo: https://github.com/HandsOnLLM/Hands-On-Large-Language-Models

    We're ecstatic to bring you "How Transformer LLMs Work" -- a free course with ~90 minutes of video, code, and crisp visuals and animations that explain the modern Transformer architecture, tokenizers, embeddings, and mixture-of-experts models.

    @MaartenGrootendorst and I have developed a lot of the visual language over the last several years (tens of thousands of iterations for hundreds of figures) for the book. This was informed by many incredible colleagues at Cohere, C4AI, and the open source and open science ML community. Given the opportunity to collaborate with the legendary Andrew Ng and the team at @Deeplearningai, we took the visuals to the next level with animations and a concise narrative meant to enable technical learners to pick up an ML paper and understand the architecture description.

    In this course, you'll learn how the transformer network architecture that powers LLMs works. You'll build an intuition for how LLMs process text and work with code examples that illustrate the key components of the transformer architecture.

    Key topics covered in this course include:
      • The evolution of how language has been represented numerically, from the Bag-of-Words model through Word2Vec embeddings to the transformer architecture that captures word meanings in full context.
      • How LLM inputs are broken down into tokens, which represent words or pieces of words, before they are sent to the language model.
      • The details of a transformer and its three main stages: tokenization and embedding, the stack of transformer blocks, and the language model head.
      • The details of the transformer block, including attention, which calculates relevance scores, followed by the feedforward layer, which incorporates stored information learned in training.
      • How cached calculations make transformers faster, how the transformer block has evolved over the years since the original paper was released, and how transformers continue to be widely used.
      • An implementation of recent models in the Hugging Face Transformers library.

    By the end of this course, you'll have a deep understanding of how LLMs process language, and you'll be able to read through papers describing models and understand the details that are used to describe these architectures. This intuition will help improve your approach to building LLM applications.
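To make the "relevance scores" idea in the attention topic concrete, here is a minimal sketch of causal scaled dot-product attention in plain Python. It is not taken from the course: the function names are illustrative, and the query, key, and value vectors are passed in directly, whereas a real transformer block would compute them with learned projection matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i only sees positions 0..i.

    Assumption: queries, keys, and values are lists of equal-length vectors,
    supplied directly instead of being produced by learned projections.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Relevance score of each visible position j for the current position i.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is a weighted average of the visible value vectors.
        output = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = causal_attention(queries, keys, values)
```

The first position can only attend to itself, so its output is its own value vector. The second position puts more weight on the key that matches its query.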
  • Jay Alammar youtube.com channel machine-learning video youtube 2025-01-15 14:06
    Alphaxiv is an awesome way to discuss ML papers -- often with the authors themselves. Here's an intro and demo by Raj Palleti that we shot at NeurIPS 2024.
  • Jay Alammar youtube.com channel machine-learning video youtube 2025-01-13 14:23
    SWE-bench measures AI agents on software engineering tasks at the level of a GitHub issue. It was one of the most important benchmarks for tracking the progress of agents on software engineering tasks in 2024. We caught up with two of its creators, Ofir Press and Carlos E. Jimenez, to share their ideas on the state of LLM-backed agents.
    ---
    Check out our book: https://www.llm-book.com/
    Mailing List: https://newsletter.languagemodels.co/
    Bluesky: https://bsky.app/profile/jayalammar.bsky.social
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
  • Jay Alammar youtube.com channel machine-learning video youtube 2024-04-17 17:58
    Tool use is a method that allows developers to connect Cohere's Command models to external tools like search engines, APIs, databases, and other software tools. Just as Retrieval-Augmented Generation (RAG) allows a model to use an external data source to improve factual generation, tool use is a capability that allows retrieving data from multiple sources. But it goes beyond simply retrieving information: the model can use software tools to execute code, or even create entries in a CRM system. In this video, we'll see how we can use two tools to create a simple data analyst agent that is able to search the web and run code in a Python interpreter. This agent uses Cohere's Command R+ model and LangChain.

    Find the code here:
    Colab: https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/agents/Data_Analyst_Agent_Cohere_and_Langchain.ipynb
    Github: https://github.com/cohere-ai/notebooks/blob/main/notebooks/agents/Data_Analyst_Agent_Cohere_and_Langchain.ipynb
    LangChain Tools: https://python.langchain.com/docs/integrations/tools/
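The general shape of a tool-use loop can be sketched without any model or framework. The example below is not the code from the linked notebook and does not use the Cohere or LangChain APIs. `scripted_model`, `run_agent`, and the two toy tools are hypothetical names invented for illustration, and the "model" is a hard-coded stand-in for an LLM deciding which tool to call next.

```python
def add(a, b):
    """Toy tool: add two numbers."""
    return a + b


def word_count(text):
    """Toy tool: count whitespace-separated words."""
    return len(text.split())


# Registry mapping tool names (as the model would emit them) to callables.
TOOLS = {"add": add, "word_count": word_count}


def scripted_model(history):
    """Hypothetical stand-in for an LLM: chooses the next step from past results."""
    if not history:
        return {"tool": "word_count",
                "args": {"text": "tool use connects models to software"}}
    if len(history) == 1:
        return {"tool": "add", "args": {"a": history[0], "b": 10}}
    return {"answer": history[-1]}


def run_agent(model, tools, max_steps=5):
    """Ask the model for a step, run the requested tool, feed the result back."""
    history = []
    for _ in range(max_steps):
        step = model(history)
        if "answer" in step:
            return step["answer"]
        result = tools[step["tool"]](**step["args"])
        history.append(result)
    raise RuntimeError("agent did not finish within max_steps")


final_answer = run_agent(scripted_model, TOOLS)
print(final_answer)  # 6 words counted, plus 10, gives 16
```

In a real agent the scripted function would be replaced by a model call that returns a structured tool request, and the tools would be things like a web search client or a sandboxed interpreter.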
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-09-21 17:07
    Tokenizers are one of the key components of Large Language Models (LLMs). One of the best ways to understand what they do is to compare the behavior of different tokenizers. In this video, Jay takes a carefully crafted piece of text (that contains English, code, indentation, numbers, emoji, and other languages) and passes it through different trained tokenizers to reveal what they succeed and fail at encoding, the design choices behind each tokenizer, and what those choices say about their respective models.
    ---
    Contents:
    0:00 Introduction
    1:25 The carefully polished text to test tokenizers
    2:19 BERT Uncased
    3:59 BERT Cased
    4:29 GPT-2
    6:00 FLAN-T5
    7:00 GPT-4
    9:24 Starcoder
    21:31 Galactica
    ---
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
    Mailing List: https://jayalammar.substack.com/

    Access the Early Release version of the book with a 30-day free trial of the O'Reilly learning platform: https://learning.oreilly.com/get-learning/?code=HOLLM23
    [The formatting for the tokenization chapter is still a work-in-progress, but the video gives you a better look at the approach]
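The kind of comparison described above can be imitated with two toy tokenizers. These are deliberately simplified stand-ins, not the BERT, GPT-2, or other trained tokenizers from the video: one lowercases and splits on whitespace (so it discards capitalization and indentation), the other works on raw UTF-8 bytes (so it loses nothing but produces many more tokens). The sample text is an assumption chosen to include code, indentation, and an emoji.

```python
def uncased_whitespace_tokens(text):
    """Toy 'uncased' tokenizer: lowercase, then split on any whitespace."""
    return text.lower().split()


def byte_tokens(text):
    """Toy byte-level tokenizer: one token per UTF-8 byte."""
    return list(text.encode("utf-8"))


sample = "def Add(a, b):\n    return a + b  # 🎵"

word_toks = uncased_whitespace_tokens(sample)
byte_toks = byte_tokens(sample)

# Try to reconstruct the original text from each token sequence.
word_roundtrip = " ".join(word_toks)
byte_roundtrip = bytes(byte_toks).decode("utf-8")

print(len(word_toks), "word tokens;", len(byte_toks), "byte tokens")
print("word tokenizer lossless:", word_roundtrip == sample)
print("byte tokenizer lossless:", byte_roundtrip == sample)
```

The trade-off is the point: the whitespace tokenizer yields a short sequence but cannot recover the capital letter, the newline, or the indentation, while the byte tokenizer round-trips exactly at the cost of sequence length.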
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-07-26 12:52
    Despite processing internet-scale text data, large language models never see words as we do. Yes, they consume text, but another piece of software called a tokenizer is what actually takes in the text and translates it into a different format that the language model actually operates on. In this video, Jay examines a language model tokenizer to give you a sense of how they work.

    Follow our upcoming book, Hands-On Large Language Models, for more details about tokenizers and LLMs in general.
    Updates on the book coming on https://jayalammar.substack.com/
    My co-author: https://twitter.com/MaartenGr / https://maartengrootendorst.substack.com/
    Early access on https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/
    ---
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
    Mailing List: https://jayalammar.substack.com/
    ---
    0:00 Introduction
    0:41 We're writing: Hands-On Large Language Models
    1:13 Generating text with ChatGPT Cohere Command
    2:42 Looking at the generation code
    5:03 What is the actual input to a language model?
    7:14 What is the actual output of a language model generate?
    7:50 The tokenizer's lookup table and embeddings inside a model
    9:07 Looking at the model, tokenizer
    12:27 Summary
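As a rough illustration of the lookup table and embeddings mentioned in the chapter list, the sketch below maps text to integer token IDs and then to vectors. The four-entry vocabulary, the two-dimensional embedding values, and the word-level splitting are all made up for the example. Real tokenizers use learned subword vocabularies with tens of thousands of entries.

```python
# Hypothetical toy vocabulary: token string -> integer ID.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}
id_to_token = {token_id: token for token, token_id in vocab.items()}

# Hypothetical embedding table: row i is the vector for token ID i.
embeddings = [
    [0.0, 0.0],  # <unk>
    [0.1, 0.2],  # the
    [0.3, 0.4],  # cat
    [0.5, 0.6],  # sat
]


def encode(text):
    """Text -> token IDs. Unknown words fall back to the <unk> ID."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


def decode(token_ids):
    """Token IDs -> text, using the reverse lookup table."""
    return " ".join(id_to_token[token_id] for token_id in token_ids)


ids = encode("The cat sat")               # what the model receives as input
vectors = [embeddings[i] for i in ids]    # what the first layer looks up
print(ids)          # [1, 2, 3]
print(decode(ids))  # the cat sat
```

The model itself only ever sees the integer IDs and the vectors looked up from them. Turning IDs back into readable text is the tokenizer's job on the output side.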
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-06-14 07:32
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-06-08 08:01
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-06-05 13:32
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-05-31 15:01
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-05-25 14:20
    Tools like Langchain (https://github.com/hwchase17/langchain/tree/master/langchain) help you build applications on top of large language models. #shorts
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-05-08 13:20
    Over a decade ago, the phrase “software is eating the world” described how software was rapidly becoming the center of many industries beyond the technology sector. The leading book retailers, video services providers, music companies, entertainment companies, and even movie production companies were essentially software companies. That trend is still going strong. In this video, Jay shares observations on the value in the AI technology stack and focuses on where some of the technical moats might be.

    The previous video in this series (https://www.youtube.com/watch?v=AeW9r3lopp0) discussed 4 major points about useful perspectives on generative AI. Here we continue the series with points 5-8.

    Blog post: https://txt.cohere.com/ai-is-eating-the-world/
    ---
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
    Mailing List: https://jayalammar.substack.com/
    --
    0:00 Introduction
    3:44 5) Maps and Landscapes of AI Technology and Value Stacks
    7:16 6) Enterprises: Plan Not for One, but Thousands of AI Touchpoints in Your Systems
    8:46 7) Account for the Many Descendants and Iterations of a Foundation Model
    16:01 8) Model Usage Datasets Allow Collective Exploration of a Model’s Generative Space
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-04-16 13:40
    What's the big deal with Generative AI? Is it the future or the present? In this video, Jay goes over four key reflections on how best to think about the current state of AI products and features, and how to avoid the pitfalls people tend to fall into with new tech.

    Blog post: https://txt.cohere.ai/generative-ai-future-or-present/
    What is Neural Search? Nils Reimers - Sentence Transformers and Embedding Evaluation: https://www.youtube.com/watch?v=Z_4rohX4Ki8&ab_channel=Cohere
    ---
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
    Mailing List: https://jayalammar.substack.com/
    --
    0:00 Introduction
    1:04 1- Recent AI developments are awe-inspiring and promise to change the world. But when?
    5:47 2- Make a distinction between impressive 🍒 cherry-picked demos, and reliable use cases that are ready for the marketplace
    6:35 3- Think of models as components of intelligent systems, not minds
    7:56 4- Generative AI alone is only the tip of the iceberg
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-01-16 13:29
    Learn how AI image generation works. This video goes over the AI components of AI image generation models like Stable Diffusion and explains how they work and how they're trained.

    Blog post: https://jalammar.github.io/illustrated-stable-diffusion/
    ---
    Twitter: https://twitter.com/JayAlammar
    Blog: https://jalammar.github.io/
    Mailing List: https://jayalammar.substack.com/
    --
    Introduction (0:00)
    Text-to-image and image-to-image (1:32)
    The components of Stable Diffusion - high-level overview (3:06)
    The three models inside the AI Image Generator (5:48)
    Generating images with reverse diffusion (8:36)
    Images emerging from noise (11:09)
    How the model is trained. 1 - Diffusion (12:46)
    How the model is trained. 2 - Compression (17:44)
    The importance of language models for image generation (20:43)
    How CLIP is trained (training on both text and images) (22:55)
    Guiding image generation with text prompts (25:57)
    Conclusion (28:07)
  • Jay Alammar youtube.com channel machine-learning video youtube 2023-01-01 15:59
    This is a version of the intro cinematic to an old video game (Nemesis 2 on the MSX system). The graphics are remade using AI Generated images. Read more about the process in: https://jalammar.github.io/ai-image-generation-tools/ Original: https://www.youtube.com/watch?v=nWdTIHpUORE&ab_channel=Zebpro