Lesson Material

Here is a list of current and upcoming events from the Galter DataLab. If you would like to request a workshop or discussion topic, please let us know!

Do you have expertise on a topic and would you like to share it with your peers? We welcome collaborators! Whether you’re a beginner, intermediate, or expert, we want to work with you!

Contact us at datalab@northwestern.edu.


Best Practices in Research Data Management and Data Sharing

An introduction to basic concepts in research data management including University retention requirements, data management plan requirements, data documentation, file naming conventions, metadata, and sharing research data.

Upon completion of this one-hour workshop, participants will:

  • Understand and be able to apply best practices for file naming and documentation
  • Be familiar with basic tidy data best practices
  • Be familiar with metadata best practices
  • Understand and be able to locate online Federal funder requirements for data sharing
  • Be familiar with publication and data sharing tools available both at Northwestern and through the Web

Data Cleaning with OpenRefine

An introductory class in OpenRefine, a free, open-source tool for cleaning data in spreadsheets. No coding knowledge is needed. Familiarity with concepts such as data records and values is helpful.

Upon completion of this 90-minute workshop, participants will:

  • Understand how to facet and transform data values
  • Understand how to write simple data transformations
  • Understand how to retrieve data from APIs
  • Understand how to reconcile data against controlled data sources

Introduction to RStudio

This introduction to RStudio covers the basic questions ‘What is RStudio?’ and ‘Why should I use it?’ as well as the main components of the interface. The intended audience includes those new to statistical or data analysis software as well as users of SAS, Stata, SPSS, or other packages.

Data Cleaning and Organization with R

Data cleaning, processing, and munging can be very time-consuming processes. You can save time by developing a reproducible workflow for these tasks. Taking deliberate steps on the front end of your project to properly process your data will…

  • help you become familiar with your data and any quality issues that may exist, and
  • save you from headaches down the road.

In this workshop we’ll present an outline that you can follow to help with your day-to-day data organization tasks. Starting with a new, raw, tabular data set, we will follow these steps to learn more about it and clean it up where needed so that we can analyze it properly:

  1. Initial exploration
  2. Fixing errors
  3. Standardizing values
  4. Dimensionality reduction: Can you get rid of any columns?
  5. Visualization
  6. Creating a data dictionary
  7. Descriptive Statistics and Exploratory Data Analysis

Introduction to the Command Line / Bash Part I

The shell is a useful tool for most researchers who do any type of programming. The Unix shell is powerful and is often the fastest and most direct way to work with files and folders, run programs, and so on. Many programmers also work in this environment because of its simplicity and the control it gives them over the system. In Part I, users will be introduced to the command line and learn basic commands.

  • This class is 1.5 hours long.
  • Users must bring their own laptops to this class. Library computers do not have the Bash shell installed.
  • Attendees will learn how to:
    • launch the Bash shell
    • execute commands for changing directories and for adding and deleting files
    • launch a text editor from the command line
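
As a rough illustration of the kinds of commands listed above, here is a minimal sketch, not the workshop's actual material. The folder and file names (`notes`, `draft.txt`, `final.txt`) are hypothetical, and `nano` is only assumed as an example editor; your system may offer a different one.

```shell
# Work inside a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir"

pwd                      # print the directory you are currently in
mkdir notes              # create a new folder (hypothetical name)
cd notes                 # move into that folder
touch draft.txt          # add an empty file
ls                       # list the folder's contents
cp draft.txt final.txt   # copy the file under a new name
rm draft.txt             # delete the original (there is no undo)
cd ..                    # move back up one level

# Opening a text editor is interactive, so it is left commented out here.
# nano is an assumed example; substitute whichever editor you have installed.
# nano notes/final.txt
```

Running these commands in a temporary directory keeps the practice session away from your real files, which matters because `rm` deletes without sending anything to a trash folder.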

Introduction to the Command Line / Bash Part II

In Part II of Intro to the Command Line, we will explore how to search within files and write simple Bash scripts.

  • This class is 1.5 hours long.
  • Users must bring their own laptops to this class. Library computers do not have the Bash shell installed.
  • Users will learn:
    • how to string together commands with pipes
    • how to search within files
    • how to use loops and simple scripts
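
The three skills above can be sketched together in a few lines. This is an illustrative example under stated assumptions rather than the workshop's own exercise: the CSV files and their contents are made up for the demonstration.

```shell
# Create two small, made-up CSV files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'gene,count\nBRCA1,12\nTP53,7\n' > sample_a.csv
printf 'gene,count\nBRCA1,3\nEGFR,9\n' > sample_b.csv

# Search within files: list every file that mentions BRCA1.
grep -l 'BRCA1' *.csv

# Pipes: drop the header rows, keep the first column, and count the
# distinct gene names. tr removes padding that some versions of wc add.
unique_genes=$(cat sample_a.csv sample_b.csv \
  | grep -v '^gene' | cut -d, -f1 | sort | uniq | wc -l | tr -d ' ')
echo "unique genes: $unique_genes"   # unique genes: 3

# Loop: report the number of data rows (everything after the header) per file.
for f in *.csv; do
  rows=$(tail -n +2 "$f" | wc -l | tr -d ' ')
  echo "$f has $rows data rows"
done
```

Saving these lines in a file and running it with `bash` turns them into a simple script, so the same steps can be repeated on new files.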

Additional Courses

For additional courses and workshops, visit the Galter Health Sciences Library & Learning Center website.

Workshops From Our Partners

The instructional materials below have been provided by our collaborators at OHSU and elsewhere.

The Magic of Markdown

This workshop introduces you to Markdown, an easy-to-learn format for saving written documents. Once you have a document in Markdown, you can transform it into all sorts of outputs: webpages, slides, PDF documents, and even reproducible analyses.

Setting up a GitHub Page

Once you know Markdown, you can set up your own personal GitHub Page, with your academic CV and other accomplishments! We cover how to personalize it and add posts in Markdown.

Git for Collaboration

This is a workshop introducing you to Git and GitHub. Learn the basics of Git by sorting panels from Edward Gorey’s Gashlycrumb Tinies.

Discussion Topics

This is a list of discussion topics and talks hosted by BioData Club that we think have been successful and may be of interest to other BioData Club groups:

This is a paper by Greg Wilson that outlines how to improve your scientific software.

This paper describes some of the lessons about data storage learned from instructors for the Software and Data Carpentry initiatives.

This paper describes some basic best practices for reproducible research.

We had a discussion about Jeff Leek’s book How to be a Modern Scientist and what it means for students and postdocs today.

Alison Presmanes Hill gave a talk for our visualization hacky hour about how she slowly revised and improved a figure. Very funny and very informative.

We at BioData Club are cross-disciplinary by nature. What does it take to be a good cross-disciplinary collaborator?

Events/Workshops by Date