Basic Unix Commands

There are many bash commands, but this short guide is meant to cover the essentials, either as an introduction for those unfamiliar with Unix-based terminals or as a refresher for those more experienced.

Basic Keystrokes and Characters

keystroke / character    function
up arrow / down arrow    scroll through recently used commands
tab                      autocomplete commands or file / directory names
ctrl-c                   kill the current process
.                        hidden file prefix or the current directory
~                        home directory
../                      up one directory
|                        pipe for chaining multiple commands
>                        output redirect
*                        wildcard character (e.g., $ ls *.py lists all Python scripts in the current directory)
$                        indicates a variable in bash (e.g., $HOME)
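Several of these characters can be combined in a single command. As a small sketch (assuming a directory that contains some Python scripts), the command below lists every .py file, pipes the listing to wc -l to count them, and redirects the count into a file:

$ ls *.py | wc -l > num_scripts.txt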

Foundations

To navigate around the terminal you will need to know some basic commands for directory manipulation, file interaction, and process monitoring. To find out more information about any command, take advantage of the manuals in Unix by entering man command_name.

  • file and navigation fundamentals:

    • cd: change directories (folders)
    • mkdir: create a directory
    • ls: list all files in a given directory
    • cp: copy a file from source to target
    • scp: same as cp except between two machines over ssh
    • rsync: more sophisticated, optimized (but also slightly more complex) way to copy files either locally or over a network (functionality of both cp and scp)
    • mv: move a file from source to destination
    • rm: delete a file, add -r to delete directories
    • pwd: print the current working directory (and its path from the root)
    • chmod: change file permissions
    • chgrp: change the group associated with a file or directory
  • probing file contents:

    • cat: print the entire contents of a file to the command line. This command is best used to view small files (only a few lines long) or as part of a longer, more complex command.
    • wc: count the words in a file, or the lines when using -l
    • head and tail: print the start or end of a file to the command line
    • less: an efficient way to view large files, as it loads only a small portion of the file at a time. Once open, use the arrow keys to scroll, G to jump to the end of the file, g to the beginning, / to search, and q to exit.
  • process basics:

    • ps: list currently running processes
    • kill: terminate a process
    • htop: opens a more sophisticated process manager - analogous to a text-based version of Windows task manager or Mac's Activity Monitor.
  • other useful commands:

    • du -h: check disk usage (the -h flag displays usage in human-readable format)
    • wget: download a file from a given URL
    • curl: very similar to wget with slightly more options for interacting with remote servers; see curl vs wget for more.
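As a quick sketch of how a few of these fit together (the directory and file names below are just placeholders):

$ mkdir analysis                  # create a new directory
$ cd analysis                     # move into it
$ cp ~/data/results.txt .         # copy a file into the current directory
$ wc -l results.txt               # count its lines
$ head -n 3 results.txt           # peek at the first three lines
$ chmod 644 results.txt           # owner read/write, everyone else read-only
$ ps aux | grep python            # look for running python processes
$ du -h .                         # check how much disk space this directory uses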

Advanced Commands

The following are slightly more powerful commands that require more explanation. We highlight the basic commands for a quick start.

screen

When a program starts running in a shell, it must complete before the shell is free to run another command. Opening another shell window solves this, but when working remotely on the servers there is more overhead to starting a new window, because this requires setting up another SSH connection. This is where the screen command comes in handy. Screens give you multiple virtual shells, so you can start a long process in one shell, let it run, and use another shell simultaneously. Screens also persist regardless of connection status: if an SSH connection dies midway through execution, a process started in a screen will keep running rather than terminate, and connections from other machines can reattach to any existing screen.

You can read more about the screen tool here, but the following are the most basic commands you will need to get started.

Start a new screen session with:

$ screen
Anything done in this shell window will now be in this new screen.

Exit a screen session by pressing ctrl-a followed by d.

To reload a screen you need to reattach it by giving it the id of the screen. See the list of currently running screens:

$ screen -ls
There is a screen on:
    447607.pts-3.risotto    (Detached)
1 Socket in /run/screen/S-username.

Then scan the output for the id and reattach using the -r flag:

$ screen -r 447607
and you can pick up where you left off.

If you only have one screen running, using

$ screen -r
directly will just reattach the one screen session.
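If you end up juggling several screens, it can help to name each session when you create it and reattach by that name; the session name below is just an example:

$ screen -S long_job
$ screen -r long_job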

Note

screen has basically all the functionality you need to run remote processes, but if you are interested, there is a related tool, tmux, that is more powerful (e.g., more functionality to handle multiple sessions, split window panes within the same terminal), but also has a slightly higher learning curve. You can read more about it here.

awk

Awk is a powerful, lightweight text processing tool that enables basic scripting straight from the command line. There are many potential use cases; the examples below are only a taste. Check out awk intro and handy one line awk for even more.

Awk is specifically designed to interact with tabular data, and by default splits on whitespace (tabs and spaces).

Note

You can override this behavior by setting a custom delimiter with the -F flag.
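For example, to print the first column of a comma-separated file (the file name here is hypothetical):

$ awk -F',' '{ print $1 }' yourdata.csv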

Column-based splitting makes it easy to isolate and print out particular columns of a file. For example, this command prints out the second and third columns of a file named yourfile.txt:

$ awk '{ print $2, $3 }' yourfile.txt

The $2 and $3 here are variables that refer to the 2nd and 3rd columns. You can extend this to reference any column in a file. Another useful built-in variable is NF, which gives the number of columns in a given row. With this, you can print the number of columns per line in a file:

$ awk '{ print NF }' yourfile.txt

Awk also supports conditionals and loops. Suppose you only wanted to see lines of your file where the number of columns deviates from what you expect. In this example, we expect our file (yourfile.txt) to have 5 columns per line.

$ awk 'NF!=5 { print $0 }' yourfile.txt

You can also chain other commands using the pipe character (|) to unleash even more functionality. For example, consider the case where we want to rename all files in the subdirectory, my_subdir, so that they end in .old.

$ ls my_subdir | awk '{print "mv "$1" "$1".old"}' | sh

Breaking down the command: ls my_subdir lists all the files in the subdirectory and sends the names to awk, which builds the command mv <filename> <filename>.old for every line / filename; the resulting commands are piped to the shell sh for execution.

sed

Sed is similar to awk, but focuses on text manipulation (whereas awk is more naturally suited to extracting data and reporting it). To really use sed, you first need to be comfortable with regular expressions. Regular expressions (regex for short) are a whole other topic worth investing in, as basic knowledge of them can save you lots of time. They are also handy outside of the command line, in editors like Sublime Text and Atom! This is a good tutorial to get started with basic regex syntax.

Once you are comfortable with regex you can use sed to do replacements in a file. The basic formula is s/regex-needle/replacement/{flags}.

$ sed -e 's/Axl/AXL/g' yourfile.txt

The above will replace all instances of the gene name Axl with AXL; the flag g stands for global, meaning replace all matches in the input.
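If you want to modify the file itself rather than print the result, GNU sed supports an in-place flag (note that on macOS/BSD sed, -i requires an argument such as -i ''):

$ sed -i 's/Axl/AXL/g' yourfile.txt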

For more about sed check out this sed tutorial or browse useful one liners.

grep

Grep is the one-stop shop for searching files and file names. You can use grep with or without regular expressions.

It is easy to search a file for a specific word:

$ grep search_term yourfile.txt

or even count the number of times a term occurs in a file:

$ grep -c search_term yourfile.txt

Grep is also super useful for searching all files in a directory: simply pass it the -r flag for a recursive search, as in the example below. Check out the manual man grep for a full listing of flags, and this tutorial for basic regex terms with grep.
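For instance, this searches every file under a subdirectory, ignoring case (-i) and printing the matching line numbers (-n); the pattern and directory name are placeholders:

$ grep -rin "search_term" my_subdir/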

sort

Unix sort uses a variant of merge sort to order its input. The sort command assumes the input is tabular and by default separates fields on whitespace. sort takes a -k flag to indicate the column to sort by, and sorts lexicographically unless given the -n flag for a numeric sort.

As a simple example, we use sort to order yourfile.txt alphabetically by the second column and save the result to another file called yourfilesorted.txt.

$ sort -k2 yourfile.txt > yourfilesorted.txt

We have found sort to be most useful when it is chained with other commands, like this one for sorting the first 5 lines of a file by column 2:

$ head -n 5 yourfile.txt | sort -k2

A useful command to combine with sort is uniq, which collapses contiguous duplicates; this lets you easily reduce a big list with repeating elements and examine its unique components.
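Because uniq only collapses adjacent duplicates, it is usually run on sorted input. As a sketch (reusing yourfile.txt from the earlier examples), this counts how many times each value appears in the second column and lists the most frequent first:

$ awk '{ print $2 }' yourfile.txt | sort | uniq -c | sort -rn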


Last update: 2020-06-11