So someone in your lab has asked you to write a Bash script, or maybe you’ve heard about Bash and wonder what it’s about. This is the first of a set of tutorials designed to teach you how to automate some of your data analysis processes while learning the fundamentals of Bash. The intended audience is beginners, whether you’re not sure what Bash is or you’re hoping to reduce the labor of data processing. However, if you already have your Bash environment set up, you might want to skip ahead to Part 1 on scripting. After completing this first tutorial you should have an idea of what Bash is, how to navigate a file system, and how to create and run Bash scripts.
If you’re a Windows user, you may have heard that Microsoft is bringing Bash to Windows, but you might have limited experience interacting with Bash or may not even be sure what it is. Bash is still in a developer preview accessible only through the Microsoft Insider program. If you already have developer build 14316 or later installed, you can enable Bash.
However, the Insider build and Bash for Windows are still in development and I wouldn’t recommend them just yet. It’d probably be best to install Linux alongside Windows or spin up a Virtual Machine to work along with these tutorials.
Through this three-part series you will learn to parse web pages, clean data, and automate processes by writing Bash scripts that make use of several UNIX tools. I’ll do my best to provide useful resources that you can use to explore the content further and apply in your future projects. If you have any questions, suggestions, or requests for content, leave a message in the comment section.
What is Bash?
Bash stands for Bourne Again SHell. It is a free shell written for the GNU operating system as part of the GNU Project, which the Free Software Foundation sponsors. A shell is a program that lets a user interact directly with the operating system by typing commands, which makes it an essential tool for any programmer. You may already be familiar with the look of a terminal running Bash, popularized in movies and other media. Bash is also a scripting language, and we will use it to communicate our commands to the operating system.
Let’s begin by starting up a Bash session of our own. On OS X and Linux, the shell can be accessed through a program called Terminal. On many Linux desktops, such as Ubuntu, a useful hot-key for this is Ctrl+Alt+T. If you’re running OS X, you will have to set a hot-key manually.
We’re ready to run our first command. We’ll start off simple by writing text to the screen. In all the following examples, type the text that follows the $ into the terminal and press Enter at the end of the line:
$ echo "Hello World"
By default, echo and many other programs write to standard output, which is typically your terminal window. We can just as easily send the results to a file using “>”, the standard output redirection operator:
$ echo "Hello World" > hello
We can then print the contents of the file to standard out by using a program called cat, short for concatenate:
$ cat hello
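The three commands above can be tried together as a short sketch. It assumes you are happy to work in a throwaway directory created with mktemp, and it uses a variable named workdir that is not part of the tutorial itself:

```shell
# Work in a throwaway directory so nothing in your own files is overwritten.
# (workdir is a name chosen for this sketch only.)
workdir=$(mktemp -d)
cd "$workdir"

# Print the greeting to standard output (the terminal).
echo "Hello World"

# Send the same greeting to a file named hello instead of the terminal.
echo "Hello World" > hello

# Print the contents of the file back to standard output.
cat hello
```

Note that “>” replaces whatever the target file already contained, so it is worth double-checking the file name before pressing Enter.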
Navigating the File system
Let’s go ahead and clear the screen using:
$ clear
An important piece of information to remember is that in Unix-like operating systems (*nix) everything is a file, from photos to folders (also known as directories, which are just files containing lists of other files). Even Bash is a file residing within what is collectively referred to as the file system. To find Bash’s location in the file system, we can type:
$ which bash
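As a rough illustration of the idea that Bash is itself just a file, the sketch below looks up its location and then lists it like any other file. It uses command -v, a shell built-in that plays the same role as which here, and a variable named bash_path that is only an assumption of this example:

```shell
# Ask the shell where the bash program lives and remember the answer.
# (bash_path is a name chosen for this sketch only.)
bash_path=$(command -v bash)
echo "$bash_path"

# List that path like any other file to see its permissions and size.
ls -l "$bash_path"
```

The exact path differs between systems, so treat whatever your machine prints as the value to remember for later.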
Knowing this piece of information will be important later, but at this point you are probably wondering where you are. Whenever you need the answer to that question, there is a simple command to let you know, pwd, short for print working directory:
$ pwd
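Let’s go ahead and create a folder to store all our scripts. The mkdir command, short for make directory, does this when followed by the name you want to give the folder.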
We can view the files in our directory by using the ls command, short for list:
$ ls
We can change into this directory by using the cd command, short for change directory, followed by the directory’s name.
To get back to your home directory just type cd without any arguments.
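The navigation steps in this section can be sketched as one short session. The folder name scripts and the variable start are hypothetical choices made for this example, and the session begins in a throwaway directory so that your real files are left alone:

```shell
# Start in a throwaway directory (start is a name chosen for this sketch only).
start=$(mktemp -d)
cd "$start"

# Show where we are in the file system.
pwd

# Create a folder; "scripts" is only an example name.
mkdir scripts

# List the current directory; the new folder should appear.
ls

# Move into the new folder and confirm that the location changed.
cd scripts
pwd

# Return to the home directory by calling cd with no arguments.
cd
pwd
```

Running pwd after each cd is a cheap way to confirm you ended up where you expected.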
Text Editor Setup
Once we start writing useful scripts, we’re going to want to save them for later use. To do this we’ll store our commands in a file. In order to edit any file on your computer you will need a text editor. My personal preference is vim, although it’s rather idiosyncratic, and any text editor will work. To get started quickly, I’d recommend using pico. If you choose to use pico, you can skip ahead to the editing files section and just replace “vim” with “pico” in the commands (and ignore the vim hotkeys). If you are feeling adventurous, you can check whether vim is installed by typing:
$ which vim
If vim is installed, the command prints the path to it, which typically looks something like /usr/bin/vim. If you don’t see anything, you’ll need to install vim. To install vim on Ubuntu you would type:
$ sudo apt-get install vim
If you are new to vim, I highly recommend taking a couple of minutes to learn the basics by using the built-in vim tutorial:
$ vimtutor
The tutorial will walk you through the commands you need to know. Once you’ve finished the tutorial we’re ready to start writing our first files.
Let’s go ahead and create our first Bash script. First we’ll need to open a new file:
$ vim hello.sh
By convention, shell scripts are named in the format filename.sh. The .sh extension is a hint to readers and tools about how to interpret the file; the operating system itself relies on the first line of the script. We’ll start this script the way we will start all Bash scripts, by letting the operating system know the location of Bash. To enter insert mode, press i, then enter the following:
#!/bin/bash
The “#!” characters are known collectively as a shebang in Unix. They let the program loader know we want to pass the script to the interpreter located at the path specified, in our case “/bin/bash”. Let’s add our greeting program from earlier:
echo "Hello World"
There’s a joke that you’ll eventually learn vim because you can’t figure out how to close it, but I’ll save you the agony. First, let’s return to normal mode by pressing Esc. Now we can enter commands. There are two commands in vim that you will use most frequently: :w and :q, which save and quit respectively. We can combine these two in one command that we’ll send to vim now:
:wq
Running Our Script
We’re almost able to run our script, but first we have to give the file execute permission:
$ chmod +x hello.sh
Now we can run our script:
$ ./hello.sh
Congratulations on writing your first Bash script!
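As a recap, here is a minimal sketch of the whole workflow without opening an editor. It is not the only way to do it: the file is written with a here-document (the lines between <<'EOF' and EOF), which this tutorial has not covered, and the names workdir and output are assumptions of this example. It also assumes Bash lives at /bin/bash, as in the script above:

```shell
# Work in a throwaway directory (workdir is a name chosen for this sketch only).
workdir=$(mktemp -d)
cd "$workdir"

# Create hello.sh; the quoted 'EOF' keeps the two lines exactly as written.
cat > hello.sh <<'EOF'
#!/bin/bash
echo "Hello World"
EOF

# A newly created file has no execute permission, so this check
# should take the "not executable yet" branch.
if [ -x hello.sh ]; then
    echo "already executable"
else
    echo "not executable yet"
fi

# Grant execute permission, then run the script from the current directory.
chmod +x hello.sh
output=$(./hello.sh)
echo "$output"
```

The leading ./ tells the shell to look for hello.sh in the current directory rather than searching the usual program locations.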
Will Kearns, NLM Bioinformatics Trainee
I live in Seattle, Washington and attend the University of Washington as part of the Biomedical and Health Informatics Ph.D. program. My research interest is in consumer-health question answering and machine comprehension for the design and testing of conversational agents.