There is a new monthly release of Positron available, and it includes a total revamp of the integration between GitHub Copilot and Positron Assistant, Positron’s AI coding assistant built specifically for data science workflows. You can find a few more details, including how to install Positron, on my blog: https://juliasilge.com/blog/copilot-african-languages
There is a new monthly release of Positron available, and it delivers some fresh new features for the Data Explorer, Positron’s interactive data viewer with support for files like CSV and Parquet as well as objects in your R or Python session. You can find a few more details, including how to install Positron, on my blog: https://juliasilge.com/blog/literary-prizes
Recently I’ve been spending my time working on Positron, a new, next-generation data science IDE from Posit. Positron is still in a public beta period and may not yet be a good fit for everybody, but I’ve been using it for R package development lately and wanted to walk through what that looks like. Watch along as I release a new version of an R package to CRAN. You can find a few more details, including how to install Positron, on my blog: https://juliasilge.com/blog/r-pkg-release
It’s been a while since I published a screencast, because I have been focused on Positron, a new, next-generation data science IDE from Posit. Check out a first look at Positron in this video, using this week’s #TidyTuesday dataset on orca encounters. You can find a few more details, including how to install Positron, on my blog: https://juliasilge.com/blog/orcas-positron
TidyX Episode 186: Gapminder Camcorder - Be Kind Rewind Code Explanation
In this episode, we start a series explaining the examples Ellis showed from his "Be Kind, Rewind" talk at posit::conf(2024). First up, we jump into creating a captivating animated visualization of the Gapminder dataset using R and the {camcorder} package. We break down the code step-by-step, from setting up the animation recording to customizing the plot aesthetics. Learn how to generate smooth and informative animations that tell a compelling story about global trends in GDP per capita and life expectancy.
Join us for an episode that will have your simulations running at the speed of light!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/186-Gapminder_Camcorder
TidyX Episode 185: Independence Days with {purrr}
Using Wikipedia's list of independence days, we'll show you how to use some advanced {purrr} to work with the data, construct new functions, and transform data extracted from webpages into usable formats. With this dataset, we aim to check the amusing quip that every 4 days a country celebrates its independence from the UK!
Join us for an episode that will have your simulations running at the speed of light!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/185-global_independence_days
⬇️ Get the file and follow along: https://bit.ly/3y00CzM
This tutorial video demonstrates how to create the best data visualizations for data science using the mighty plotnine library and Python in Excel.
Now, more than ever, Python in Excel will empower millions of professionals to get real-world data science done - all without the hassle of a local Python installation.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:33 The Dataset
03:30 Loading and Wrangling the Dataset
06:28 A Multivariate Histogram
09:27 Adding a Facet
11:54 Make that Two Facets!
14:06 No! Make that Three Facets!
16:37 What’s Next
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 184: Hello Kitty: Intro to {purrr}
This intro highlights purrr's core functionalities and different ways to write the functions, from named to anonymous functions, keeping types consistent, or even applying functions to filter and pull out contents from lists. Learn the basics to understand how we can apply these techniques to more complicated structures!
Join us for an episode that will have your simulations running at the speed of light!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/184-Hello_Kitty
TidyX Episode 183: Within-group regression using {purrr}
Unleash the power of {purrr} to perform within-group regressions! In this episode we'll explore fitting separate linear models for different groups in your data, using the Palmer Penguins dataset as an example. Using map(), we'll quickly build models, extract key statistics, and visualize how groups differ. Join us to start the journey to mastering this great package and becoming a {purrr}fect data scientist!
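For readers who think in Python rather than R, the within-group idea can be sketched without any libraries: split the rows by group, then fit one least-squares line per group. This is only an illustration of the pattern, not the episode's {purrr} code; the column names and the numbers below are made up and are not Palmer Penguins measurements.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"intercept": mean_y - slope * mean_x, "slope": slope}

def fit_by_group(rows, group_key, x_key, y_key):
    """Split rows by group_key, then fit one model per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        name: fit_line([r[x_key] for r in members], [r[y_key] for r in members])
        for name, members in groups.items()
    }

# Made-up rows with hypothetical column names (not real penguin data).
rows = [
    {"species": "A", "flipper": 1.0, "mass": 3.0},
    {"species": "A", "flipper": 2.0, "mass": 5.0},
    {"species": "A", "flipper": 3.0, "mass": 7.0},
    {"species": "B", "flipper": 1.0, "mass": 10.0},
    {"species": "B", "flipper": 2.0, "mass": 9.0},
    {"species": "B", "flipper": 3.0, "mass": 8.0},
]
models = fit_by_group(rows, "species", "flipper", "mass")
```

Each entry of `models` holds the slope and intercept for one group, which is the per-group statistic you would then compare or plot.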
Join us for an episode that will have your simulations running at the speed of light!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/183-Within_group_purrr
TidyX Episode 182: Turbocharge Your Simulations with Parallel Processing! ⚡️
Ever feel like your simulations take forever to run? This TidyX episode injects a dose of speed with parallel processing using the snowfall package! We'll revisit nested for loops for simulation, then supercharge them to run across multiple cores.
Learn how to run simulations in parallel for faster results using the snowfall package, and combine and analyze simulation outputs for deeper insights.
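The episode uses R's snowfall package; as a rough Python analogue (an assumption for illustration, not the episode's code), the same pattern looks like this with the standard library's `concurrent.futures`. The function names and the toy simulation are hypothetical. Each replicate gets its own seed so the sequential and parallel runs produce the same numbers.

```python
import random
from concurrent.futures import ProcessPoolExecutor

def simulate_mean(seed, n_draws=10_000):
    """One simulation replicate: the mean of n_draws standard normal draws."""
    rng = random.Random(seed)  # a per-replicate seed keeps results reproducible
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_draws)) / n_draws

def run_sequential(seeds):
    """Baseline: run the replicates one after another in a plain loop."""
    return [simulate_mean(seed) for seed in seeds]

def run_parallel(seeds, workers=4):
    """Replicates are independent, so they can be spread across processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_mean, seeds))

# Sequential baseline; the parallel version should return the same values.
baseline = run_sequential(range(4))
```

When running this as a script, call `run_parallel(range(4))` inside an `if __name__ == "__main__":` guard, which process pools require on some platforms. The speed-up only shows once each replicate is expensive enough to outweigh the cost of starting worker processes.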
Join us for an episode that will have your simulations running at the speed of light!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/182-Parallel_Sim
TidyX Episode 181: I Likert Coffee
Calling all coffee lovers! ☕️ This episode of TidyX gets to the grounds of coffee expertise with a TidyTuesday survey. We'll brew up some data analysis to see if age affects how people rate their coffee knowledge. Get ready for Likert scales, wrangling data, and statistical throwdowns to see which age group claims the coffee crown!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/181-I_Likert_Coffee
TidyX Episode 180: How much stuff have we sent to Space?
Ever wondered how much stuff has rocketed into space? This episode we do a 180 and look at how we started TidyX by looking at a TidyTuesday dataset to explore objects launched into space! We'll learn how to wrangle the data, calculate launch counts by year, and create visualizations with ggplot2. Plus, we'll discover a cool trick for faceting plots with independent y-axes, and finally show a fun way to interact with facets using the trelliscopejs package. Join us for a stellar exploration of space exploration data!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/180-Going_to_Space
⬇️ Get the files and follow along: https://bit.ly/3QEoa3f
Machine learning with the random forest algorithm is the perfect place to start your data science journey!
This tutorial video will teach you how the random forest algorithm works and how to train random forest models using Python in Excel.
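Two of the ideas listed in the chapters, bagging and feature randomization, can be sketched in plain Python before touching any machine learning library. This is a minimal illustration under stated assumptions, not the video's workbook code: the feature names are hypothetical, and no trees are actually trained here.

```python
import random

def bootstrap_indices(n_rows, rng):
    """Bagging: sample n_rows row indices with replacement (one bag per tree)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def out_of_bag(n_rows, bag):
    """Rows never drawn into the bag; they can score the tree trained on it."""
    drawn = set(bag)
    return [i for i in range(n_rows) if i not in drawn]

def random_feature_subset(feature_names, k, rng):
    """Feature randomization: the candidate features considered at one split."""
    return rng.sample(feature_names, k)

rng = random.Random(42)
n_rows = 1000
features = ["age", "fare", "class", "sex"]  # hypothetical feature names

bags = [bootstrap_indices(n_rows, rng) for _ in range(25)]
oob_sets = [out_of_bag(n_rows, bag) for bag in bags]

# On average roughly 37% of rows are left out of each bag.
oob_fraction = sum(len(oob) for oob in oob_sets) / (len(bags) * n_rows)
split_candidates = random_feature_subset(features, 2, rng)
```

Because every tree sees a different bag and different candidate features, the trees' errors are less correlated, and the out-of-bag rows give each tree a built-in validation set.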
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:48 Decision Tree Variance
07:23 Real-World Decision Trees
08:47 Random Forests Are Ensembles
10:04 Manufacturing Independence
11:10 Bagging
17:21 Feature Randomization
22:01 The Dataset
24:20 Preparing the Data
28:08 Training a Random Forest
33:30 Analyzing OOB Predictions
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 179: How many SpaghettiOs does it take to write LOTR?
We embark on a hilarious journey to answer the age-old question: how many SpaghettiOs would it take to write a whole book? Inspired by abstract_tyler's Instagram reel (https://www.instagram.com/p/C6hUeRVp24H/), we use the power of R to find out! Prepare for some serious spaghetti-fueled fun as we delve into the world of R for data wrangling. We'll tackle skills like joining data sets, calculating frequencies, and writing functions to automate the analysis.
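The frequency-counting step can be sketched in a few lines of Python. This is an illustration only, not the episode's R code: it assumes a pasta that comes in letter shapes, and the per-can letter supply below is entirely made up.

```python
import math
import string
from collections import Counter

def cans_needed(text, letters_per_can):
    """Number of cans needed so that every letter in text is covered."""
    needed = Counter(ch for ch in text.upper() if ch in letters_per_can)
    # The letter that is scarcest relative to its demand sets the total.
    return max(
        math.ceil(count / letters_per_can[letter])
        for letter, count in needed.items()
    )

# Hypothetical supply: one of each letter per can, except three A's.
letters_per_can = {letter: 1 for letter in string.ascii_uppercase}
letters_per_can["A"] = 3

cans = cans_needed("AN APPLE A DAY", letters_per_can)
```

Swapping the short phrase for the full text of a book, and the made-up supply for measured counts, gives the kind of estimate the episode is after.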
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/179-How_many_spaghettios
⬇️ Get the files and follow along: https://bit.ly/3QlcW3q
Topic modeling with Latent Dirichlet Allocation (LDA) allows you to extract information from your text documents!
It doesn't matter if you have emails, SMS text messages, Customer Service chats, or free-form fields in an IT system.
LDA topic modeling is useful to ANY professional.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
LDA ALGORITHM TUTORIALS
--------------------------------------------------------------------------------------------
Latent Dirichlet Allocation (Part 1 of 2):
https://www.youtube.com/watch?v=T05t-SqKArY
Training Latent Dirichlet Allocation: Gibbs Sampling (Part 2 of 2):
https://www.youtube.com/watch?v=BaM1uiCpj_E
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:10 The BBC Dataset
02:08 Processing the Text Data
06:34 Training the LDA Topic Model
09:00 Which Docs Have Which Topics?
10:15 Which Words Belong to Which Topics?
11:52 How Many Topics?
14:45 What’s Next?
#pythoninexcel #pythonexcel #pythonforexcel
⬇️ Get the files and follow along: https://bit.ly/4aLsMg7
Your boss hands you a pile of documents and asks you to do some "data magic." What do you do? Use the k-means clustering algorithm!
In this video, I will teach you a powerful technique for working with text documents using Python in Excel:
1️⃣ Preprocess your text documents using the mighty TF-IDF.
2️⃣ Cluster the documents using k-means.
3️⃣ Use a machine learning model to help interpret the clusters.
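The first two steps above can be sketched in plain Python to show what is happening under the hood. This is a toy illustration, not the video's workbook code: the documents are invented, the starting centroids are picked by hand for reproducibility, and the third step (a model to interpret the clusters) is not covered.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn each document into a TF-IDF vector over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([
            (counts[term] / len(toks)) * math.log(n_docs / doc_freq[term])
            for term in vocab
        ])
    return vocab, vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, initial_centroids, n_iter=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented example documents: two about sport, two about finance.
docs = [
    "the match ended with a late goal",
    "the striker scored a goal in the match",
    "the bank raised interest rates",
    "interest rates worry the bank",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, initial_centroids=[vectors[0], vectors[2]])
```

Note how "the", which appears in every document, gets a TF-IDF weight of zero and so has no influence on the clustering.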
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:42 Tokenization
05:31 Document Vectors
06:40 The Naïve Bayes Algorithm
10:50 The Math of Naïve Bayes
18:10 Training the Naïve Bayes Model in Excel
24:46 Testing the Naïve Bayes Model in Excel
28:06 What’s Next?
#pythoninexcel #pythonexcel #pythonforexcel
Data science is THE BEST use case for Python in Excel! Want to have more impact at work? Do you have text data? Use Python in Excel to analyze it!
Text data is in Word docs, customer service chats, notes fields in IT systems. Text data is everywhere in every organization.
With Python in Excel, you can now harvest the value hidden in your text data.
I will use the Naïve Bayes machine learning technique with text documents in this video.
Naïve Bayes is commonly used in text analytics to classify documents.
A real-world scenario where Naïve Bayes has been successfully used for years is classifying emails and text messages as illegitimate (i.e., spam) or legitimate (i.e., ham).
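To make the spam/ham idea concrete, here is a minimal multinomial Naïve Bayes classifier in plain Python. It is a sketch under stated assumptions, not the video's workbook code: the four training messages are invented, tokenization is a simple whitespace split, and add-one (Laplace) smoothing is one common choice among several.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_naive_bayes(labeled_docs):
    """labeled_docs is a list of (label, text) pairs; returns the fitted counts."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for label, text in labeled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        log_prob = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            log_prob += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = log_prob
    return max(scores, key=scores.get)

# Invented training messages for illustration.
training = [
    ("spam", "win a free prize now"),
    ("spam", "free cash claim your prize"),
    ("ham", "are we still meeting for lunch"),
    ("ham", "see you at the meeting tomorrow"),
]
model = train_naive_bayes(training)
```

Calling `predict(model, "claim your free prize")` classifies a new message. Working in log probabilities keeps the products of many small numbers from underflowing.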
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:42 Tokenization
05:31 Document Vectors
06:40 The Naïve Bayes Algorithm
10:50 The Math of Naïve Bayes
18:10 Training the Naïve Bayes Model in Excel
24:46 Testing the Naïve Bayes Model in Excel
28:06 What’s Next?
--------------------------------------------------------------------------------------------
GET THE MICROSOFT EXCEL WORKBOOK
--------------------------------------------------------------------------------------------
Here's the link to the GitHub for my Python in Excel video workbooks:
https://github.com/DaveOnData/PythonInExcelYouTube
NOTE - You have to have access to Python in Excel to run the code!
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 178: Player Time Chart - FIBA API Part 2
In this follow-up to Episode 177, we dive deeper into the intricacies of FIBA basketball game data. Building upon our previous exploration, we refine our methods to generate insightful player time charts. Join us as we unravel the complexities of lineup analysis and visualize player dynamics over the course of a game. Get ready for another insightful episode of TidyX!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/178-Whos_Next_FIBA_API_Plotting
Data science is the top use case for Python in Excel! However, a fundamental question remains - can you do real data science with Python in Excel?
I have tested Python in Excel's scalability with various real-world data science scenarios and report my findings in this video.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:36 Datasets
04:46 Visual Data Analysis
07:48 Cluster Analysis
11:11 Decision Tree ML Models
14:40 Random Forest ML Models
17:21 The Verdict
#pythoninexcel #pythonexcel #pythonforexcel
In 2024, using Python in Excel for data wrangling isn't a great idea. Surprised? All will be clear when you consider how Python in Excel works behind the scenes.
Data wrangling is one of the most important tasks in analytics.
Estimates are that 60-80% of the time required for analytics projects (especially advanced analytics projects) is spent on acquiring, filtering, cleaning, combining, enriching, and transforming data.
As a hands-on analytics professional, I can tell you this is no exaggeration!
Trust me. You want to use the best technology for your data wrangling.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:04 Python in Excel Architecture
02:49 Data Wrangling Options
05:52 Python in Excel Wrangling Exceptions
07:36 Python in Excel’s Future
#pythoninexcel #pythonexcel #pythonforexcel
As powerful as Microsoft Excel charts are, they can't do everything. Enter the power of Python in Excel for data visualization!
Building your visual data analysis skills is a great place to start if you want to have more impact at work using data.
The combination of Excel charts and Python in Excel is like chocolate and peanut butter - better together.
Check out what Python in Excel makes possible - you won't regret it!
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:05 Faceting Data Visualizations
01:47 Faceted Histograms
04:31 Faceted Bar Charts
05:56 Faceted Scatter Plots
07:27 Faceted Strip Plots
08:19 Violin Plots
09:13 Faceted Violin Plots
--------------------------------------------------------------------------------------------
MY EDA TUTORIAL SERIES ON YOUTUBE
--------------------------------------------------------------------------------------------
My 7-part tutorial series will teach you exploratory data analysis (EDA) with Excel. Here's the playlist link:
https://www.youtube.com/playlist?list=PLTJTBoU5HOCRFQhfU1gg2ciNpS_evWKR7
--------------------------------------------------------------------------------------------
VIOLIN PLOT TUTORIAL
--------------------------------------------------------------------------------------------
Here's a link to a good tutorial on how to read violin plots:
https://mode.com/blog/violin-plot-examples
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 177: Who's Next? FIBA API Viewer Question
We tackle a real-world challenge brought to us by our viewer, Cohen MacDonald. Cohen found an undocumented API that has a bunch of game data from FIBA and has some great ideas on what to do with it. However, there's one problem: the dataset does not contain which players are on the court at what time, just who subs in or out. With an intriguing problem statement and example code from Cohen in hand, we delve into the intricacies of FIBA basketball game data. See how we harness the power of for loops to iteratively update values, addressing Cohen's query on player substitutions and lineup analysis.
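The reason a for loop fits this problem is that each lineup depends on the one before it: you start from the starters and replay the substitutions in order. A minimal Python sketch of that idea follows; it is not the episode's R code, and the event fields (`second`, `player`, `action`) are assumed names for illustration, not the API's actual schema.

```python
def lineups_over_time(starters, substitutions):
    """Replay substitution events in order and record who is on court after each."""
    on_court = set(starters)
    timeline = [(0, frozenset(on_court))]
    for event in substitutions:
        # Each event updates the state left behind by the previous one.
        if event["action"] == "in":
            on_court.add(event["player"])
        elif event["action"] == "out":
            on_court.discard(event["player"])
        timeline.append((event["second"], frozenset(on_court)))
    return timeline

# Hypothetical players and events for illustration.
starters = ["A", "B", "C", "D", "E"]
substitutions = [
    {"second": 310, "player": "A", "action": "out"},
    {"second": 310, "player": "F", "action": "in"},
    {"second": 545, "player": "B", "action": "out"},
    {"second": 545, "player": "A", "action": "in"},
]
timeline = lineups_over_time(starters, substitutions)
```

Each entry of `timeline` pairs a game clock value with the set of players on court from that moment on, which is the piece of information the raw substitution feed leaves out.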
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/177-Whos_Next_FIBA_API
Using machine learning to analyze your data is THE single best use case for Python in Excel! In this video, I cover the 5 must-have machine learning techniques for Python in Excel.
While I’ve been teaching professionals machine learning skills for years, I was never more excited than when I was introduced to the Python in Excel feature in August 2023.
Python in Excel is a game-changer. It makes it ridiculously easy for professionals like you to build machine learning skills that are actually useful in the real world.
Are you ready to have more impact at work using data?
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
00:58 Types of Machine Learning
03:24 Decision Trees
05:08 Random Forests
07:53 K-Means Clustering
11:11 Logistic Regression
13:15 Linear Regression
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 176: Are you sure?
In this episode, we use Random Forests and Bayesian statistics to compare pitchers' likelihood of making it into the Hall of Fame! We show how to make simple simulations of individual player performance and differences, and finally make a function to let you easily compare players.
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX Github page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/176-Are_you_sure
Using SQL or Power Query is the best way to get high-quality data for your Python in Excel analyses. However, choosing which is best isn't straightforward. In this video, I discuss choosing between SQL or Power Query based on your situation.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:03 Power Query Introduction
06:09 SQL Introduction
11:00 Power Query Pros & Cons
14:22 SQL Pros & Cons
18:15 Which You Should Use
--------------------------------------------------------------------------------------------
MY FAVORITE POWER QUERY BOOK
--------------------------------------------------------------------------------------------
https://www.amazon.com/Collect-Combine-Transform-Business-Skills/dp/1509307958/
--------------------------------------------------------------------------------------------
MY SQL FOR EXCEL USERS TUTORIAL SERIES
--------------------------------------------------------------------------------------------
Part 1 - Overview: https://youtu.be/g1h-9VSjawY
Part 2 - Basic Tables: https://youtu.be/otzlxjKCJ38
Part 3 - Basic Filters: https://youtu.be/CLAESLlHjgk
#pythoninexcel #pythonexcel #pythonforexcel
Don't waste time! Learning Python in Excel or VBA should match your career goals. Python in Excel and VBA are powerful but address very different scenarios for Microsoft Excel users.
In this video, I discuss choosing the right option to achieve your goals.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
00:37 VBA Overview
04:16 Python in Excel Overview
07:33 The Question is Answered
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 175: Random Strike Zone Forest: Tidyverse Takes on Hall of Fame Hurlers
We explored the world of data modeling using Tidyverse and Purrr to predict the next MLB Hall of Fame pitchers. Stay tuned for some fascinating insights into our modeling process! We use the same datasets as in the last several weeks and apply logic and code to create, evaluate, and tune our models.
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX GitHub page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/175-Pitchers_HOF_in_20_Random_Forest
Power Query and Python in Excel are like chocolate and peanut butter - better together! Skills with Power Query are a must in 2024 if you want to supercharge your analytics.
The combination of Power Query and Python makes Excel a complete technology stack for doing data science.
For example, you can source 300,000 rows from a database and then craft new and powerful insights using Python in Excel.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:22 Sizing the Data
05:59 Loading All the Data
08:54 Visualize the Data with a Count Plot
10:20 Visualize the Data with Histograms
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 174: AI Speed Ball: Predicting the 2024 Pitcher HOF Class in 20 Minutes
We're bringing the heat with AI! Join us as we step up to the plate and predict the next MLB Hall of Fame pitchers using the power of TensorFlow and Keras. With a killer convolutional neural network in our arsenal, we're ready to knock it out of the park! We go over normalization techniques, how to set up your model, and use it to predict who should be in and who will be out! Don't miss this action-packed inning!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX GitHub page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/174-Pitchers_HOF_in_20_with_AI
While logistic regression analysis is super useful to ANY professional, it's traditionally been a pain in Microsoft Excel. Not anymore! Python in Excel makes performing powerful logistic regression analyses easy.
Just in case you're unfamiliar with logistic regression, it's a technique for building a predictive model whose outcome is Yes/No:
Does this patient have heart disease?
Should we approve this loan application?
Is this credit card authorization fraudulent?
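To make the Yes/No idea concrete, here is a minimal sketch of logistic regression in plain Python. It is not the code from the video: the data is a made-up toy example, the names `fit_logistic` and `predict_proba` are hypothetical, and the model is fitted with simple gradient descent rather than a statistics library.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y = 1) = sigmoid(w * x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up toy data (an assumption for illustration only):
# one numeric predictor and a Yes/No outcome coded as 1/0.
predictor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
outcome = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(predictor, outcome)

def predict_proba(x):
    """Predicted probability of a 'Yes' for a new value of the predictor."""
    return sigmoid(w * x + b)

print(f"P(Yes | x = 0.5) = {predict_proba(0.5):.2f}")
print(f"P(Yes | x = 4.0) = {predict_proba(4.0):.2f}")
```

The model returns a probability rather than a hard Yes/No, so you choose the cutoff (0.5 is a common default) that fits your use case.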
Logistic regression analysis is universally useful. It doesn't matter whether you work in healthcare, government, marketing, customer service, or finance.
Logistic regression with Python in Excel is a great tool for having more impact at work using data.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:03 The Data
02:23 Logistic Regression Using Solver
04:38 Loading the Data into Python
06:31 Wrangling the Data
08:21 The Logistic Regression Model
09:56 The Model Summary
11:49 Interpreting the Model
#pythoninexcel #pythonexcel #pythonforexcel
Ready to supercharge your analytics with Python in Excel? You can learn my battle-tested technique for finding new insights in your data and delighting your leaders!
This technique is universal. It doesn't matter whether you work in healthcare, government, marketing, customer service, or finance.
Python in Excel is your gateway to having more impact at work using data.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
01:06 The Data
02:58 Loading the Data into Python
05:43 Scaling the Data
07:46 Clustering the Data with K-Means
10:29 Training the Machine Learning Model
12:10 Evaluating the ML Model
13:46 Interpreting the Clusters
--------------------------------------------------------------------------------------------
GET THE MICROSOFT EXCEL WORKBOOK
--------------------------------------------------------------------------------------------
Here's the link to the GitHub for my Python in Excel video workbooks:
https://github.com/DaveOnData/PythonInExcelYouTube
NOTE - You have to have access to Python in Excel to run the code!
#pythoninexcel #pythonexcel #pythonforexcel
Cluster analysis using Python in Excel is valuable to ANY professional wanting to have more impact at work using data!
Python in Excel is a game-changer for professionals who want to delight stakeholders with insights that are just not possible with out-of-the-box Microsoft Excel features.
Cluster analysis with Python in Excel is a perfect example.
☕ If you found this content useful and would like to support the channel, you can buy me a coffee: https://www.buymeacoffee.com/DaveOnData
--------------------------------------------------------------------------------------------
VIDEO CHAPTERS
--------------------------------------------------------------------------------------------
00:00 Intro
00:47 The Data
02:44 Loading Data into Python
05:13 Scaling the Data
07:48 Clustering with K-means
10:28 Output the DataFrame
11:17 Analyzing the Clusters
--------------------------------------------------------------------------------------------
GET THE MICROSOFT EXCEL WORKBOOK
--------------------------------------------------------------------------------------------
Here's the link to the GitHub for my Python in Excel video workbooks:
https://github.com/DaveOnData/PythonInExcelYouTube
NOTE - You have to have access to Python in Excel to run the code!
#pythoninexcel #pythonexcel #pythonforexcel
TidyX Episode 173: Pitch into the Bayes - "20" minute MLB Hall of Fame Pitchers predictions
Step up to the mound in TidyX Episode 173 as we predict MLB Hall of Fame pitchers using the power of Bayesian models! Join us as we switch up our game plan, leaving no curveball unturned with rstanarm. We inspect the models and results with prediction intervals and probabilities, bringing a new dimension to player forecasts! We show how to apply this to new players, from randomly selected ones to a Seattle favorite, King Felix.
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX GitHub page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/173-Pitcher_HOF_in_20_with_bayes
TidyX Episode 172: 20 minutes to Predict MLB HOF Pitchers - Class of 2024
Join us as we look into the numbers behind predicting MLB Hall of Fame pitchers! This episode includes crafting a dataset from the {Lahman} package, creating logistic regression models, and finally assessing them via model summary tools and visualizing techniques. Stay tuned for insights and adjustments as we navigate the challenges of forecasting HOF greatness!
Like, Subscribe, and find us on social media! (@ellis_hughes, @OSPpatrick, @tidy_explained).
If you like what we are doing, please sign up to be a patron on Patreon!
https://www.patreon.com/Tidy_Explained
Email us with any comments, questions, or suggestions at tidy.explained@gmail.com.
Links:
Open an issue on the TidyX GitHub page!
https://github.com/thebioengineer/TidyX/issues
Patreon:
https://www.patreon.com/Tidy_Explained
TidyX Code:
https://github.com/thebioengineer/TidyX/tree/master/TidyTuesday_Explained/172-Pitcher_HOF_in_20
We observed Martin Luther King Day in the US this week and this week’s #TidyTuesday dataset focuses on polling places in honor of King’s work on voting rights. In this screencast, let’s use summarization and visualization to understand how the numbers of polling places in the US have changed. Check out the code on my blog: https://juliasilge.com/blog/polling-places
This week’s #TidyTuesday dataset is all about Doctor Who, celebrating the upcoming new episodes, and this screencast walks through how to use empirical Bayes to estimate ratings for different episode writers. The great thing about empirical Bayes is we can take into account the number of episodes each writer wrote.
Check out the code on my blog: https://juliasilge.com/blog/doctor-who-bayes
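As a rough illustration of how empirical Bayes accounts for the number of episodes per writer, here is a minimal Python sketch. It is not the code from the blog post (which is in R): the ratings are made up, the writer names and the function `empirical_bayes_means` are hypothetical, and it assumes a simple normal model with the prior estimated from the data by the method of moments.

```python
from statistics import mean, pvariance

def empirical_bayes_means(groups):
    """Shrink each group's mean toward the overall mean.

    groups: dict mapping a name to a list of ratings.
    At least one group must contain two or more ratings.
    Returns (overall_mean, dict of shrunken means).
    """
    overall_mean = mean(x for xs in groups.values() for x in xs)
    group_means = {name: mean(xs) for name, xs in groups.items()}

    # Pooled within-group variance: how noisy a single rating is
    ss_within = sum((x - group_means[name]) ** 2
                    for name, xs in groups.items() for x in xs)
    df_within = sum(len(xs) - 1 for xs in groups.values())
    sigma2 = ss_within / df_within

    # Between-group variance: spread of the group means minus average sampling noise
    noise = mean(sigma2 / len(xs) for xs in groups.values())
    tau2 = max(pvariance(list(group_means.values())) - noise, 1e-9)

    shrunk = {}
    for name, xs in groups.items():
        # More ratings means less sampling noise, so the weight on the group's own mean is higher
        weight = tau2 / (tau2 + sigma2 / len(xs))
        shrunk[name] = weight * group_means[name] + (1 - weight) * overall_mean
    return overall_mean, shrunk

# Made-up ratings (an assumption for illustration only)
ratings = {
    "writer_a": [90, 88, 92, 89, 91, 90],
    "writer_b": [95],
    "writer_c": [80, 82, 78, 81],
}

overall, shrunk = empirical_bayes_means(ratings)
for name, value in shrunk.items():
    print(f"{name}: raw mean {mean(ratings[name]):.2f}, shrunken {value:.2f}")
```

A writer with a single episode is pulled further toward the overall mean than a writer with many episodes, which is the behavior the description refers to.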
Today is Election Day and this week’s #TidyTuesday dataset is about elections for the US House of Representatives. This screencast demonstrates how to use logistic regression to understand vote share in these elections, highlighting how to use visualization for model interpretability and a matrix syntax for your model’s outcome (a good fit when you have proportion data). Check out the code on my blog: https://juliasilge.com/blog/house-elections
Last week’s #TidyTuesday dataset was about the songs of Taylor Swift, and this screencast demonstrates how to use topic modeling to learn how the text content of Taylor Swift’s work has changed through all her musical eras. Check out the code on my blog: https://juliasilge.com/blog/taylor-swift
It’s getting to be spooky season and this week’s #TidyTuesday dataset is about haunted locations in the United States. In this screencast, let’s use log odds ratios weighted via empirical Bayes to understand which US states are more likely to have haunted cemeteries and which are more likely to have haunted schools. Check out the code on my blog: https://juliasilge.com/blog/haunted-places
He's here, he's there, he's every f*cking where, and in this screencast, we use Poisson regression and bootstrap resampling to find confidence intervals for when Roy Kent uses colorful language more or less on the TV show Ted Lasso. This #TidyTuesday dataset was created by Deepsha Menghani for her recent talk at posit::conf. Check out the code on my blog: https://juliasilge.com/blog/roy-kent
In this screencast, we use tidymodels workflowsets to try out multiple modeling approaches for a #TidyTuesday dataset on spam email. We finish off with how to create a deployable model object and set up an API for our model. Check out the code on my blog: https://juliasilge.com/blog/spam-email
This week’s #TidyTuesday is about detecting output from GPT language models, and specifically how these detectors perform differently for native and non-native English writers. In this screencast, learn how to evaluate classification models with tidymodels, using either predicted classes or predicted probabilities. Check out the code on my blog: https://juliasilge.com/blog/gpt-detectors
A recent #TidyTuesday makes available geographical place names in the US, and we can explore these names as text data. In this screencast, learn how to use byte pair encoding tokenization together with Poisson regression to find out which kinds of names are used more often and which are used less often. Check out the code on my blog: https://juliasilge.com/blog/place-names
This week’s #TidyTuesday is about tornadoes in the US, and it provides a great opportunity to think about how we formulate a modeling approach in challenging circumstances. In this screencast, learn how to use xgboost with racing and effect encodings to predict the magnitude of tornadoes. Check out the code on my blog: https://juliasilge.com/blog/tornadoes
Mother's Day is coming up this weekend and this week’s #TidyTuesday is about childcare costs in the US. In this screencast, learn how to use xgboost with early stopping to predict the cost of childcare from other characteristics of a county like demographics and women’s earnings. Check out the code on my blog: https://juliasilge.com/blog/childcare-costs
I'll analyze a dataset about NYC elevators, without looking at the dataset in advance.
The dataset comes from the Tidy Tuesday project: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-12-06
Code: https://github.com/dgrtwo/data-screencasts
Timestamped annotations of specific tricks, tips and tools used: https://github.com/dgrtwo/data-screencasts/tree/master/screencast-annotations
I'll analyze a dataset about web page metrics, without looking at the dataset in advance.
The dataset comes from the Tidy Tuesday project: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-11-15
Code: https://github.com/dgrtwo/data-screencasts
Timestamped annotations of specific tricks, tips and tools used: https://github.com/dgrtwo/data-screencasts/tree/master/screencast-annotations
I'll analyze a dataset about horror movies, without looking at the dataset in advance.
The dataset comes from the Tidy Tuesday project: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-11-01
Code: https://github.com/dgrtwo/data-screencasts
Timestamped annotations of specific tricks, tips and tools used: https://github.com/dgrtwo/data-screencasts/tree/master/screencast-annotations
I'll analyze a dataset about Bigfoot, without looking at the dataset in advance.
The dataset comes from the Tidy Tuesday project: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-09-13
Code: https://github.com/dgrtwo/data-screencasts
Timestamped annotations of specific tricks, tips and tools used: https://github.com/dgrtwo/data-screencasts/tree/master/screencast-annotations
I'll analyze a dataset about LEGOs, without looking at the dataset in advance.
The dataset comes from the Tidy Tuesday project: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-09-06
Code: https://github.com/dgrtwo/data-screencasts
Timestamped annotations of specific tricks, tips and tools used: https://github.com/dgrtwo/data-screencasts/tree/master/screencast-annotations