The Unix command line, although invented decades ago, is an amazing environment for efficiently performing tedious but essential data science tasks. By combining small, powerful command-line tools (like csvkit), you can quickly scrub and explore your data and hack together prototypes.
This hands-on workshop is based on the O’Reilly book Data Science at the Command Line, written by our CEO, Jeroen Janssens. You’ll learn how to build fast data pipelines, how to leverage R and Python at the command line, and how to quickly visualise data. No prior knowledge of the Unix command line is required.
By the end of this workshop, you will have a solid understanding of how to integrate the command line into your data science workflow. Even if you’re already comfortable processing data with, for example, R or Python, being able to leverage the power of the command line as well can make you a more effective and efficient data scientist.
- Case study
- Setting up the Data Science Toolbox
- Essential concepts of the Unix command line
- Classical filters such as
- Accessing APIs using
- Processing CSV with
- Processing JSON with
- Running R from the command line
- Basic data visualisation using R and Bash
- Executing SQL queries directly on CSV data
- Parallelising data-intensive pipelines using GNU Parallel
- Where to go from here
We have previously delivered this workshop (or a custom training based on it) at the following organisations:
Photos and Testimonials
“Great workshop! Very well done and very useful information delivered in an excellent and interactive manner. Jeroen anticipated very well on the different knowledge levels within the group. I would highly recommend the Data Science at the Command Line workshop to anyone that is interested in either kickstarting their command-line experiences or improving their data science with Unix power tools.”
“As a seasoned UNIX command line adept, I didn’t expect to learn much from a Data Science at the Command Line workshop. I was wrong! Over the years, many new tools have become available that I didn’t know about, and that can be combined with traditional tools in new ways.
Since attending the workshop, I have been able to simplify and improve the efficiency of many of the scripts I use on a daily basis. Recommended for anyone working from the command line, newbies and ninjas alike!”