Software Carpentry

Software Carpentry workshops are free, hands-on, two-day events that cover the core computational research skills needed to be productive as an individual scientist or in small research teams. No previous experience is needed, and each workshop is taught by experienced researchers who use these tools in their own work. Participants will learn how to automate tasks using the Unix shell, will be introduced to a structured programming language like Python or R, and will gain experience using Git and GitHub for version control.

The Unix Shell (online) - Register
Tuesday, March 1, 2022. 1:00 - 4:00 p.m. 
Learn the basics of file systems and the Unix shell. The shell is a tool that allows you to do complex things with just a few keystrokes. More importantly, it helps you to combine existing programs in new ways and automate repetitive tasks. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including “high-performance computing” supercomputers). These lessons will start you on a path towards using these resources effectively. (View curriculum)

Databases and SQL (online) - Register
Tuesday and Wednesday, March 22 and 23, 2022. 9:00 - 11:30 a.m. 
This session focuses on using databases and SQL for search and analysis of large or complex data sets. Learn to write queries in SQL, which stands for “Structured Query Language”. SQL provides hundreds of different ways to analyze and recombine data. We will only look at a handful of queries, but that handful accounts for most of what scientists do. (View curriculum)

Data Cleaning with OpenRefine (online) - Register
Tuesday, March 29, 2022. 1:00 - 4:00 p.m.
OpenRefine is a powerful open-source tool to help you work with messy data. We’ll show you how to explore and transform datasets using facets, filters, and clustering tools in OpenRefine. Researchers can take advantage of OpenRefine’s flexible spreadsheet-like interface in a reproducible and extensible framework. No previous experience is required. (View curriculum)

Software Carpentry: Programming with Python, The Unix Shell, and Version Control with Git (in-person) - Register
Wednesday and Thursday, April 13 - 14, 2022. 9:00 a.m. - 4:30 p.m.
Andersen Library, Room 120 (UMN West Bank)

This hands-on two-day workshop will cover core computational research skills needed to be productive as an individual scientist or in small research teams. Participants will learn how to automate tasks using the Unix shell, will be introduced to the Python programming language, and will gain experience using Git and GitHub for version control. No previous programming experience is required! (View curriculum: Python, Unix, Git)

R for Reproducible Scientific Analysis (online) - Register
Monday, Tuesday, and Wednesday, April 18 - 20, 2022. 9:00 a.m. - 12:00 p.m.
This workshop series will introduce modular code and best practices for using R for data analysis. R is widely used across scientific disciplines for statistical analysis, and it offers a large array of third-party packages. These workshops provide a strong foundation in the fundamentals of R, and introduce reading and manipulating data in data frames, creating data plots in ggplot2, and more. (View curriculum)

Past Workshops

Programming with R

The best way to learn how to program is to do something useful, so this three-day introduction to R is built around a common scientific task: data analysis. Our real goal isn’t to teach you R, but to teach you the basic concepts that all programming depends on. Learn how to read data into R, assign variables, use data frames, plot data, make functions, and more.


Programming with Python

This introduction to programming in Python is designed for people with little or no previous programming experience and is built around a common scientific task: data analysis. Learn about Python data types, variables, visualizing tabular data, for loops, Python lists, and functions, while using the Python packages NumPy and Matplotlib.


Databases and SQL

This session focuses on using databases and SQL for search and analysis of large or complex data sets. Learn to write queries in SQL, which stands for “Structured Query Language”. SQL provides hundreds of different ways to analyze and recombine data. We will only look at a handful of queries, but that handful accounts for most of what scientists do.


Version Control with Git

This lesson introduces version control with Git, which provides researchers with a tool they can use to keep track of what they’ve done and collaborate with other people. Every large software development project relies on version control, and most programmers use it for small jobs as well. And it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.


Data Cleaning with OpenRefine

OpenRefine is a powerful open-source tool to help you work with messy data. We’ll show you how to explore and transform datasets using facets, filters, and clustering tools in OpenRefine. Researchers can take advantage of OpenRefine’s flexible spreadsheet-like interface in a reproducible and extensible framework.

• Automating tasks using the Unix shell

• Structured programming in Python, R, or MATLAB

• Version control using Git or Mercurial

• Enables computational research that couldn’t be done otherwise

• Improves reproducibility and rigor by automating repetitive tasks and allowing sharing and auditing of data collection, processing, and analysis scripts

• Helps improve the quality of shared research data and code

• Graduate Students

• Faculty

• Staff who do research