An introduction to Snakemake tutorial for beginners (CC248)

Published 2022-09-15
Snakemake is a powerful tool for tracking data dependencies and automating data analysis pipelines. In this episode of Code Club, Pat shares how to install Snakemake, convert a driver script to a simple Snakemake file, troubleshoot problems, create rules, use parameters, and test Snakemake files. The overall goal of this project is to highlight reproducible research practices using a number of tools. The specific output from this project will be a map-based visual that shows the level of drought across the globe.
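
To make "keeping track of data dependencies" concrete, here is a minimal Python sketch of the make-style idea that workflow tools build on: a step runs only when its output is missing or older than its inputs. This is an illustration under stated assumptions, not Snakemake's implementation (Snakemake's own scheduling considers more than file timestamps), and every name in it (`needs_update`, `run_rule`, `concatenate`, `raw.txt`, `result.txt`) is hypothetical rather than taken from the episode.

```python
import os
import tempfile


def needs_update(output, inputs):
    """Return True if `output` is missing or older than any of `inputs`."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(path) > out_mtime for path in inputs)


def run_rule(output, inputs, action):
    """Run `action` only when `output` is out of date; report whether it ran."""
    if needs_update(output, inputs):
        action(inputs, output)
        return True
    return False


def concatenate(inputs, output):
    """Hypothetical stand-in for a real analysis step."""
    with open(output, "w") as out:
        for path in inputs:
            with open(path) as src:
                out.write(src.read())


def demo():
    """Run the same rule three times and record whether it executed each time."""
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "raw.txt")        # hypothetical input file
        result = os.path.join(workdir, "result.txt")  # hypothetical output file
        with open(raw, "w") as handle:
            handle.write("drought data\n")
        os.utime(raw, (1000, 1000))  # pin the input to an old timestamp

        first = run_rule(result, [raw], concatenate)   # output missing
        second = run_rule(result, [raw], concatenate)  # output up to date

        # Simulate editing the input after the output was written.
        newer = os.path.getmtime(result) + 10
        os.utime(raw, (newer, newer))
        third = run_rule(result, [raw], concatenate)   # input now newer

        return [first, second, third]


if __name__ == "__main__":
    print(demo())  # [True, False, True]
```

The second call is skipped because nothing changed, which is the behavior that makes a dry run useful: it reports which steps would run before anything is executed.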

You can find my blog post for this episode at www.riffomonas.org/code_club/2022-09-15-snakemake.

#snakemake #conda #bash #R #Rstats

Support Riffomonas by becoming a Patreon member!
www.patreon.com/riffomonas

Want more practice on the concepts covered in Code Club? You can sign up for my weekly newsletter at shop.riffomonas.org/youtube to get practice problems, tips, and insights.

If you're interested in taking an upcoming 3-day R workshop, be sure to check out our schedule at riffomonas.org/workshops/

You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: www.riffomonas.org/minimalR/
General data: www.riffomonas.org/generalR/

0:00 Introduction
3:30 Our first Snakemake rule
6:41 Installing snakemake with conda/mamba
9:33 Testing snakefile with --dry-run or -np
18:43 Creating and using a targets rule
21:13 Running snakefile
25:08 Visualizing the DAG
27:49 Cleaning up

All Comments (13)
  • fantastic and straight to the point introduction to Snakemake. Great Job! 🙌👏
  • @danhatechav
    Fantastic! Thank you very much. Looking forward to watching this one.
  • @sven9r
    Hey Pat, I don't know how you do it, but we are currently working on a wiki for our lab, where we are creating tutorials for projects controlled via conda, snakemake and gith. So this series has so much value for me and the other PhD students! Cheers!
  • @geparada88
    This is a good intro video! I will show this to a rotation student that seems to be interested in learning snakemake - Thanks a lot. The only thing I found a little bit confusing is that for running your script you required {params.file} instead of directly using {output}. I guess this is because your script is just taking the name of the output file and automatically saving this in a folder called "data". Perhaps you could have explained this, as most of the time you don't need to specify output files as {params}.
  • @musicspinner
    Fantastic. 👌🏽 This is incredibly useful. Related to this, have you worked with R `library(drake)` or its successor `library(targets)`? If so, any thoughts related to those R-specific workflow management systems? I could see them being quite handy for improving Shiny app efficiency or just interactive R exploration of some computationally heavy analysis.
  • @ricrocha821
    Hi!! The videos are incredible. Learning so much!! I would like to mention we need to install snakemake extension on VS code to run the scripts. Thanks a lot!!
  • @etterathe
    Do you use Docker for reproducible research in R? When should one choose Snakemake and when Docker to initialize a functioning environment? Could the two work together?
  • @patricior7300
    Please, could you explain in a video how to show something like RStudio's environment panel in VS Code? Thanks