Daniel Bourke

  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-03-13 08:09
    In this talk, I go over the rise of small language models (SLMs) and how they can benefit your business or day-to-day life. Talk date: March 12, 2026 at the Queensland AI Meetup.

    We'll look at case studies such as Sunny, an iOS application which uses a fine-tuned version of MedGemma to privately track skin health on-device. We'll break down why on-device inference matters (privacy, offline access, zero ongoing cost) and compare the economics of local models versus cloud API pricing at scale. Then we'll discuss the hardware and software optimizations required to run a model on a compute-constrained device.

    Finally, we'll get hands-on and fine-tune a small language model live. We'll walk through how to build a custom dataset, set up supervised fine-tuning using Hugging Face's SFT Trainer, and fine-tune a small model, Gemma 3 270M, in about two minutes on an RTX 6000 Blackwell GPU on Google Colab. We'll compare the base model's outputs to the fine-tuned version's side by side, showing how even a small model can be customized to know specific people, handle edge cases, and refuse to answer questions it shouldn't.

    Links:
    Colab Notebook we used - https://dbourke.link/qldai-colab-notebook-llm-fine-tune-march-2026
    End-to-end Small Language Model Fine-tuning Tutorial - https://www.learnhuggingface.com/notebooks/hugging_face_llm_full_fine_tune_tutorial

    Courses I teach:
    Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
    Learn Hugging Face - https://dbourke.link/ZTMHuggingFace
    Learn TensorFlow - https://dbourke.link/ZTMTFcourse
    Learn PyTorch - https://dbourke.link/ZTMPyTorch

    Connect elsewhere:
    Download Nutrify (my startup) - https://apple.co/4ahM7Wc
    My website - https://www.mrdbourke.com
    X/Twitter - https://www.twitter.com/mrdbourke
    LinkedIn - https://www.linkedin.com/in/mrdbourke/
    Get email updates on my work - https://dbourke.link/newsletter
    Read my novel Charlie Walks - https://www.charliewalks.com

    Timestamps:
    0:00 - Intro
    2:19 - About me
    4:09 - Case study: Sunny
    7:55 - Benefits of small language models running on-device
    8:44 - Cost savings of on-device models
    9:29 - Case study: Sunny (hardware overview)
    10:55 - Current best practice for running VLMs on iPhone
    12:35 - Case study: Sunny (memory usage in Xcode)
    13:56 - Case study: Sunny (workflow overview)
    15:25 - Jeff Dean on precision
    16:46 - Precision breakdown
    17:28 - Effects of quantization on model size footprint
    17:48 - Saving memory by reducing token usage
    19:28 - Before and after different on-device experiments
    20:29 - Case studies for other small but useful language models
    24:00 - Case study for private VLM-based surveillance
    25:04 - Small language models features and benefits
    25:47 - How to pick a model for your use case
    26:06 - Question: What hardware is required for getting started?
    27:21 - Prompting vs fine-tuning vs RAG
    28:08 - Live LLM fine-tuning problem overview
    28:50 - How I made a dataset for fine-tuning Gemma 3
    33:08 - Live fine-tuning code begins in Google Colab
    36:56 - Data = a guide for what you want your model to do
    40:37 - Question: How do you know if your fine-tuned model is performing well?
    44:24 - Comparing the base model to the fine-tuned model
    53:23 - Demoing our fine-tuned model on Hugging Face Spaces
    56:00 - Haiku
    57:33 - Contact me
    59:05 - Q&A
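    The talk's sections on precision and on the effect of quantization on model size come down to simple arithmetic: weight storage is roughly the parameter count times the bits per parameter. The sketch below is not taken from the talk. It assumes a nominal 270 million parameters (read off the name "Gemma 3 270M"), counts weights only, and ignores activations, the KV cache, and any per-block quantization metadata, so real footprints will differ. The function name and the precision table are illustrative.

```python
# Approximate weight-only memory footprint at different numeric precisions.
# Assumption: every parameter is stored at the same precision, with no overhead.
BITS_PER_PARAM = {"float32": 32, "float16": 16, "int8": 8, "int4": 4}


def weight_footprint_mb(num_params: int, precision: str) -> float:
    """Return weight storage in megabytes (1 MB = 1e6 bytes)."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 1e6


if __name__ == "__main__":
    # Nominal count implied by the model name; an assumption, not a measured value.
    num_params = 270_000_000
    for precision in BITS_PER_PARAM:
        size_mb = weight_footprint_mb(num_params, precision)
        print(f"{precision:>8}: {size_mb:,.0f} MB")
```

    Halving the bits per parameter halves the weight footprint, which is why 4-bit quantization is attractive on a memory-constrained phone.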
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-02-24 02:14
    Sunny is an iOS application designed for skin health tracking. It uses a fine-tuned version of MedGemma to extract structured visual data from skin images. To ensure privacy, the model runs completely on-device. Sunny's goal is to help more people perform structured self-skin examinations and, in turn, catch potential skin cancers earlier and get them treated earlier.

    This video is part of a submission by Daniel and Josh Bourke to the MedGemma Impact Challenge on Kaggle.

    Writeup on Kaggle - https://kaggle.com/competitions/med-gemma-impact-challenge/writeups/sunny-private-skin-health-tracker
    Try Sunny via Apple's TestFlight - https://testflight.apple.com/join/HeCwNNGA
    Code on GitHub - https://github.com/mrdbourke/sunny/
    Fine-tuned MedGemma Model (Sunny-MedGemma) - https://huggingface.co/mrdbourke/sunny-medgemma-1.5-4b-finetune-mlx-4bit
    Dataset used for fine-tuning - https://huggingface.co/datasets/mrdbourke/sunny-skin-and-sunscreen-extract-1k

    Disclaimer: Sunny is not a diagnostic tool. It is for tracking purposes only. AI generations by Sunny-MedGemma may contain mistakes; check for accuracy.

    Timestamps:
    0:00 - Intro
    0:10 - Problem Statement
    0:40 - Sunny Overview
    1:10 - Sunny's Goals
    1:26 - Skin check story time
    1:44 - Sunny technical details and usage demo
    2:30 - Outro
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-02-06 02:41
    This video compares the NVIDIA DGX Spark and NVIDIA RTX 4090 across several benchmarks and attributes such as price and footprint.

    Benchmark code - https://github.com/mrdbourke/gpu-benchmarking
    Exolabs blog post - https://blog.exolabs.net/nvidia-dgx-spark/
    DGX Spark - https://nvda.ws/4iQXZU4
    Deep learning PC specs - https://pcpartpicker.com/user/mrdbourke/saved/#view=sJCnGX

    Courses I teach:
    Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
    Learn Hugging Face - https://dbourke.link/ZTMHuggingFace
    Learn TensorFlow - https://dbourke.link/ZTMTFcourse
    Learn PyTorch - https://dbourke.link/ZTMPyTorch

    Connect elsewhere:
    Download Nutrify (my startup) - https://apple.co/4ahM7Wc
    My website - https://www.mrdbourke.com
    X/Twitter - https://www.twitter.com/mrdbourke
    LinkedIn - https://www.linkedin.com/in/mrdbourke
    Get email updates on my work - https://dbourke.link/newsletter
    Read my novel Charlie Walks - https://www.charliewalks.com

    Timestamps:
    0:50 - RTX 4090 vs DGX Spark Specs
    1:57 - Quick Summary (note: spoilers)
    2:39 - Footprint and Size
    4:02 - Benchmarks Overview
    4:43 - Results
    4:48 - vLLM Input and Output Speed
    5:06 - llama.cpp Input and Output Speed
    6:32 - Image Generation Results
    7:08 - LLM Fine-tuning
    8:42 - Object Detection Model Fine-tuning
    9:11 - What should you get?
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-02-04 17:03
    The NVIDIA DGX Spark and the NVIDIA RTX 4090 GPU are outstanding pieces of hardware in their own right. Let's compare them and see how they perform on a series of tests. Tutorials I make: Learn PyTorch - https://www.learnpytorch.io Learn Hugging Face - https://www.learnhuggingface.com Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-02-03 17:08
    The NVIDIA DGX Spark and the NVIDIA RTX 4090 GPU are outstanding pieces of hardware in their own right. Let's compare them and see how they perform on a series of tests. Tutorials I make: Learn PyTorch - https://www.learnpytorch.io Learn Hugging Face - https://www.learnhuggingface.com Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-29 18:45
    The NVIDIA DGX Spark and the NVIDIA RTX 4090 GPU are outstanding pieces of hardware in their own right. Let's compare them and see how they perform on a series of tests. Tutorials I make: Learn PyTorch - https://www.learnpytorch.io Learn Hugging Face - https://www.learnhuggingface.com Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-28 18:42
    The NVIDIA DGX Spark and the NVIDIA RTX 4090 GPU are outstanding pieces of hardware in their own right. Let's compare them and see how they perform on a series of tests. Tutorials I make: Learn PyTorch - https://www.learnpytorch.io Learn Hugging Face - https://www.learnhuggingface.com Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-27 08:01
    Let's build a multimodal RAG (Retrieval Augmented Generation) pipeline using NVIDIA's Nemotron embedding and rerank vision-language models.

    Multimodal means we'll be able to embed images and text in the same feature space. This allows us to search over images and text simultaneously. We'll learn how to create multimodal embeddings, retrieve them with a query, rerank them if necessary and generate an output based on the retrieved samples.

    This is a scalable workflow you could take to many different use cases. If you've got a dataset of documents you need to search over, multimodal RAG could be part of the solution. All of this was performed locally on an NVIDIA DGX Spark (see here for more: https://nvda.ws/4iQXZU4).

    Businesses: If you're a business that needs help creating its own multimodal RAG pipeline, contact me at: https://www.mrdbourke.com/contact/

    Links:
    Source code (book version) - https://www.learnhuggingface.com/notebooks/hugging_face_multimodal_rag_tutorial
    Source code (GitHub) - https://github.com/mrdbourke/learn-huggingface/blob/main/notebooks/hugging_face_multimodal_rag_tutorial.ipynb
    Source code (Colab) - https://colab.research.google.com/drive/1viw_XNE9OKrRevKa_4PIwY0sw6ZHgqSF?usp=sharing
    YouTube playlist of livestreams - https://www.youtube.com/playlist?list=PL6vjgQ2-qJFe3cv0PkIQKgbpWR-aQlm4t

    Resources:
    Nemotron RAG models - https://huggingface.co/collections/nvidia/nemotron-rag
    A Realistic RAG System by Martin Fowler - https://martinfowler.com/articles/gen-ai-patterns/#PuttingTogetherARealisticRag

    Timestamps:
    0:00 - Intro and overview
    1:42 - What is RAG?
    2:29 - RAG vs Fine-tuning
    3:25 - A realistic RAG setup
    4:15 - What we're going to build
    8:35 - Ingredients and tools
    9:15 - What are embeddings? (Part 1)
    12:07 - What are embeddings? (Part 2 - a helpful resource)
    12:39 - Step: Creating the embeddings
    15:08 - Step: Retrieving results given a query
    21:17 - Step: Reranking retrieved results
    23:46 - Code Starts
    25:05 - Viewing samples in our dataset
    26:29 - Loading models from a specific checkpoint on Hugging Face
    28:12 - Creating/loading embeddings
    30:22 - Looking at example embeddings
    31:00 - Always embed your query with the same model as your documents
    34:07 - Viewing results of matching a query to document embeddings
    36:38 - Using an image as a query
    39:19 - Step: Reranking outputs
    41:01 - Discussing reranking options
    45:20 - Visualizing reranked samples versus the original retrieved results
    47:54 - Step: Loading a generation model
    49:52 - Generating a summary of input recipes
    50:28 - Creating a demo (locally)
    1:00:34 - Uploading our demo to Hugging Face
    1:01:58 - Discussing tidbits, notes and extensions
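    The retrieval step described above (match a query embedding against document embeddings in a shared feature space) can be sketched with plain cosine similarity. This is a minimal illustration, not the video's code: the three-dimensional vectors and the document names are made-up stand-ins for what an embedding model would produce, and `retrieve` is a hypothetical helper. Cosine similarity is a common choice for this step, but the video may score matches differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, doc_embeddings, top_k=2):
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embedding))
        for doc_id, embedding in doc_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical). In a real pipeline the text and image items
# would be embedded by the same multimodal model as the query.
doc_embeddings = {
    "pasta_recipe_text": [0.9, 0.1, 0.0],
    "pasta_photo": [0.8, 0.2, 0.1],
    "cake_recipe_text": [0.1, 0.9, 0.2],
}
query_embedding = [1.0, 0.0, 0.0]

results = retrieve(query_embedding, doc_embeddings, top_k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

    A reranking step would then rescore only these top results with a separate, typically slower, model before passing them to the generation model.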
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-22 19:08
    Let's build a Multimodal RAG setup on the NVIDIA DGX Spark using NVIDIA Nemotron Models. Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-21 19:17
    Let's build a Multimodal RAG setup on the NVIDIA DGX Spark using NVIDIA Nemotron Models. Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-21 13:25
    Let's build a Multimodal RAG setup on the NVIDIA DGX Spark using NVIDIA Nemotron Models. Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-20 04:26
    Let's build a Multimodal RAG setup on the NVIDIA DGX Spark using NVIDIA Nemotron Models. Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-19 05:25
    Let's build a Multimodal RAG setup on the NVIDIA DGX Spark using NVIDIA Nemotron Models. Courses I teach: Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse Learn Hugging Face - https://dbourke.link/ZTMHuggingFace Learn TensorFlow - https://dbourke.link/ZTMTFcourse Learn PyTorch - https://dbourke.link/ZTMPyTorch Connect elsewhere: Download Nutrify (my startup) - https://apple.co/4ahM7Wc My website - https://www.mrdbourke.com X/Twitter - https://www.twitter.com/mrdbourke LinkedIn - www.linkedin.com/in/mrdbourke Get email updates on my work - https://dbourke.link/newsletter Read my novel Charlie Walks - https://www.charliewalks.com
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-16 19:38
    In this video we fine-tune Hugging Face's SmolVLM2-500M Vision Language Model to do structured data extraction from images.

    Because the SmolVLM2-500M model is quite small in the world of LLMs/VLMs, we're able to do all of the training locally on an NVIDIA DGX Spark (see here for more: https://nvda.ws/4iQXZU4). The code should also run in Google Colab. If you have any issues, please let me know in a comment.

    Links:
    Google Colab Notebook - https://colab.research.google.com/drive/1yOwjCGZSq2jB4YLF0O0na2rEuqB6Nh7m?usp=sharing
    GitHub - https://github.com/mrdbourke/learn-huggingface/blob/main/notebooks/hugging_face_vlm_fine_tune_tutorial.ipynb
    Learn Hugging Face Book Version - https://www.learnhuggingface.com/notebooks/hugging_face_vlm_fine_tune_tutorial
    Dataset - https://huggingface.co/datasets/mrdbourke/FoodExtract-1k-Vision
    Base model (SmolVLM2-500M) - https://huggingface.co/HuggingFaceTB/SmolVLM2-500M-Video-Instruct
    Demo - https://huggingface.co/spaces/mrdbourke/FoodExtract-Vision-v1

    Livestreams (where I build this project from scratch):
    Part 1: Creating a VLM dataset - https://www.youtube.com/live/cZVU559BLLM?si=jyI9pWXxkmXMO9qq
    Part 2: Fine-tuning a VLM with LoRA and QLoRA and getting many errors (mostly my fault) - https://www.youtube.com/live/Lgcp9hBqWEM?si=gA_7exPeqIwcxRYj
    Part 3: Switching from using LoRA and QLoRA (we'll do these in a future video) to fine-tuning a smaller model (SmolVLM2) successfully, uploading it to the Hugging Face Hub and then creating and publishing a demo - https://youtube.com/live/cZVU559BLLM?feature=share

    Courses I teach:
    Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
    Learn Hugging Face - https://dbourke.link/ZTMHuggingFace
    Learn TensorFlow - https://dbourke.link/ZTMTFcourse
    Learn PyTorch - https://dbourke.link/ZTMPyTorch

    Connect elsewhere:
    Download Nutrify (my startup) - https://apple.co/4ahM7Wc
    My website - https://www.mrdbourke.com
    X/Twitter - https://www.twitter.com/mrdbourke
    LinkedIn - www.linkedin.com/in/mrdbourke
    Get email updates on my work - https://dbourke.link/newsletter
    Read my novel Charlie Walks - https://www.charliewalks.com

    Timestamps:
    00:00:00 - Introduction
    00:02:19 - What is a VLM?
    00:03:45 - Why fine-tune your own model?
    00:06:05 - LLM fine-tuning mindset
    00:06:51 - Case study: Nutrify
    00:09:16 - Case study: Invoice Extractor
    00:11:06 - Ingredients and tools we're going to use
    00:12:16 - Exploring the FoodExtract-1k-Vision dataset
    00:15:52 - My setup
    00:16:13 - Dataset formatting for VLMs
    00:16:54 - Dataset Creation for VLMs
    00:17:20 - Getting a model to fine-tune
    00:18:13 - Our task overview (structured data extraction from images)
    00:20:11 - What we're going to end up with
    00:22:38 - Code Starts
    00:23:31 - Viewing a single data sample
    00:29:08 - Splitting our data into train and test sets
    00:34:25 - Inspecting our model's architecture
    00:40:03 - Reading the recipe of the SmolDocling paper
    00:45:29 - Freezing the vision encoder in our model
    00:47:34 - Discussing batch sizes
    00:49:06 - Setting up SFTConfig
    00:52:03 - Training our model with SFTTrainer
    00:54:11 - Model training starts
    00:54:19 - Model training finishes
    00:56:13 - Inspecting our model's loss curves
    00:57:10 - Uploading our trained model to Hugging Face
    00:58:19 - Model uploading to Hugging Face begins
    00:58:26 - Model uploading finishes
    00:59:38 - Comparing the base model to the fine-tuned model
    01:01:06 - Viewing our fine-tuned model's first predictions
    01:03:35 - Creating a demo with Gradio
    01:06:46 - Uploading our demo to the Hugging Face Hub
    01:07:35 - Trying out our demo
    01:08:27 - What's next and extensions
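    One step in the walkthrough is splitting the data into train and test sets before fine-tuning. The sketch below shows the idea with the standard library only. It is not the notebook's code: the 80/20 ratio, the seed, the sample identifiers, and the 1,000-sample count (guessed from the "1k" in the dataset name) are all assumptions, and the helper `train_test_split` is written here for illustration rather than imported from a library.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of samples with a fixed seed and split off a test set."""
    shuffled = list(samples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    num_test = int(len(shuffled) * test_fraction)
    return shuffled[num_test:], shuffled[:num_test]


# Hypothetical sample identifiers standing in for image/label pairs.
samples = [f"sample_{i}" for i in range(1000)]
train, test = train_test_split(samples)

print(f"train: {len(train)} samples, test: {len(test)} samples")
```

    Fixing the seed keeps the split reproducible, so the base model and the fine-tuned model can be compared on exactly the same held-out samples.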
  • Daniel Bourke youtube.com channel machine-learning video youtube 2026-01-15 21:17
    Let's work towards fine-tuning a VLM on the NVIDIA DGX Spark and turn it into a reusable demo.