Knowledge Base/Unix/General
A summary of some useful Linux/Unix commands
chown -R apache:apache /path/to/directory
Change ownership of /path/to/directory recursively to user apache, group apache (first user, then group).
tar czf /path/to/output.tar.gz /path/to/input/directory/
Create a *.tar.gz archive file from a directory.
du -a /path/to/directory | sort -n -r | head -n 10
List the top 10 files and directories by size in a directory tree. By default, the units are 1024 bytes (1 K).
mysqladmin -u root -p status
MySQL server status. You will be prompted for the password (-p).
mysql -u root -p
mysql> show databases;
List the databases on a MySQL server. You will be prompted for the password (-p).
mysqldump -u root -p database_name > ~/path/to/output/database_name.sql
Back up a MySQL database database_name to a file. You will be prompted for a password (-p).
Finding out which platform / configuration you are using
Use
uname --all
This will "print certain system information". For example,
CYGWIN_NT-6.0-WOW64 PB-MacBookPro 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
Finding out which shell you are using
Inspect the SHELL environment variable. The command
echo $SHELL
will display its value. Look at the last part of the pathname:
- /bin/sh - Bourne shell
- /bin/bash - Bourne Again SHell
- /bin/csh - C shell
- /bin/ksh - Korn shell
- /bin/tcsh - TC shell
- /bin/zsh - Z shell
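The lookup above can be sketched as a small POSIX shell function. This is only an illustration: the name describe_shell is made up here, and note that SHELL holds your login shell, which is not necessarily the shell you are typing into right now.

```shell
# describe_shell is a hypothetical helper, not a standard command.
# It maps the last component of a shell path to the names listed above.
describe_shell() {
    case "$(basename "$1")" in
        sh)   echo "Bourne shell" ;;
        bash) echo "Bourne Again SHell" ;;
        csh)  echo "C shell" ;;
        ksh)  echo "Korn shell" ;;
        tcsh) echo "TC shell" ;;
        zsh)  echo "Z shell" ;;
        *)    echo "unknown shell: $1" ;;
    esac
}

# Fall back to /bin/sh if SHELL is not set.
describe_shell "${SHELL:-/bin/sh}"
```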
Which shell is best for me?
The following has been reproduced from unix-faq/shell/shell-differences ver. 1.17:
| Feature | sh | csh | ksh | bash | tcsh | zsh | rc | es |
|---|---|---|---|---|---|---|---|---|
| Job control | N | Y | Y | Y | Y | Y | N | N |
| Aliases | N | Y | Y | Y | Y | Y | N | N |
| Shell functions | Y(1) | N | Y | Y | N | Y | Y | Y |
| "Sensible" Input/Output redirection | Y | N | Y | Y | N | Y | Y | Y |
| Directory stack | N | Y | Y | Y | Y | Y | F | F |
| Command history | N | Y | Y | Y | Y | Y | L | L |
| Command line editing | N | N | Y | Y | Y | Y | L | L |
| Vi Command line editing | N | N | Y | Y | Y(3) | Y | L | L |
| Emacs Command line editing | N | N | Y | Y | Y | Y | L | L |
| Rebindable Command line editing | N | N | N | Y | Y | Y | L | L |
| User name look up | N | Y | Y | Y | Y | Y | L | L |
| Login/Logout watching | N | N | N | N | Y | Y | F | F |
| Filename completion | N | Y(1) | Y | Y | Y | Y | L | L |
| Username completion | N | Y(2) | Y | Y | Y | Y | L | L |
| Hostname completion | N | Y(2) | Y | Y | Y | Y | L | L |
| History completion | N | N | N | Y | Y | Y | L | L |
| Fully programmable Completion | N | N | N | N | Y | Y | N | N |
| Mh Mailbox completion | N | N | N | N(4) | N(6) | N(6) | N | N |
| Co Processes | N | N | Y | N | N | Y | N | N |
| Builtin arithmetic evaluation | N | Y | Y | Y | Y | Y | N | N |
| Can follow symbolic links invisibly | N | N | Y | Y | Y | Y | N | N |
| Periodic command execution | N | N | N | N | Y | Y | N | N |
| Custom Prompt (easily) | N | N | Y | Y | Y | Y | Y | Y |
| Sun Keyboard Hack | N | N | N | N | N | Y | N | N |
| Spelling Correction | N | N | N | N | Y | Y | N | N |
| Process Substitution | N | N | N | Y(2) | N | Y | Y | Y |
| Underlying Syntax | sh | csh | sh | sh | csh | sh | rc | rc |
| Freely Available | N | N | N(5) | Y | Y | Y | Y | Y |
| Checks Mailbox | N | Y | Y | Y | Y | Y | F | F |
| Tty Sanity Checking | N | N | N | N | Y | Y | N | N |
| Can cope with large argument lists | Y | N | Y | Y | Y | Y | Y | Y |
| Has non-interactive startup file | N | Y | Y(7) | Y(7) | Y | Y | N | N |
| Has non-login startup file | N | Y | Y(7) | Y | Y | Y | N | N |
| Can avoid user startup files | N | Y | N | Y | N | Y | Y | Y |
| Can specify startup file | N | N | Y | Y | N | N | N | N |
| Low level command redefinition | N | N | N | N | N | N | N | Y |
| Has anonymous functions | N | N | N | N | N | N | Y | Y |
| List Variables | N | Y | Y | N | Y | Y | Y | Y |
| Full signal trap handling | Y | N | Y | Y | N | Y | Y | Y |
| File no clobber ability | N | Y | Y | Y | Y | Y | N | F |
| Local variables | N | N | Y | Y | N | Y | Y | Y |
| Lexically scoped variables | N | N | N | N | N | N | N | Y |
| Exceptions | N | N | N | N | N | N | N | Y |
Key to the table above
- Y - Feature can be done using this shell.
- N - Feature is not present in the shell.
- F - Feature can only be done by using the shell's function mechanism.
- L - The readline library must be linked into the shell to enable this feature.
Notes to the table above
1. This feature was not in the original version, but has since become almost standard.
2. This feature is fairly new and so is often not found on many versions of the shell; it is gradually making its way into standard distribution.
3. The Vi emulation of this shell is thought by many to be incomplete.
4. This feature is not standard, but unofficial patches exist to perform this.
5. A version called 'pdksh' is freely available, but does not have the full functionality of the AT&T version.
6. This can be done via the shell's programmable completion mechanism.
7. Only by specifying a file via the ENV environment variable.
Customising your working environment
When you log into the system, it runs the system profile file, /etc/profile. Usually this file is readable by all users, writable exclusively by the administrator.
After the system profile, the user profile file(s) is/are run from the user's home directory. These are shell-specific. They are essentially shell scripts that you can edit to customise your environment, e.g. set the environment variables. Here is a list of user profile scripts for different shells:
- Bourne shell: .profile
- Bourne Again SHell: .bash_profile
- C shell: .login and .cshrc
- Korn shell: .profile
- TC shell: .login, .tcshrc and .cshrc
- Z shell: .zlogin and .zshrc
What's the difference between the multiple user profile files?
.cshrc (.tcshrc, .zshrc) is executed when the user first logs into the system and each time a sub shell is spawned (a shell script is executed).
TC shell looks for .tcshrc first. If one is not found, it looks for and executes .cshrc.
.login (.zlogin) is executed when the user logs into the system immediately after .cshrc (.tcshrc, .zshrc). However, it is not executed when a sub shell is spawned.
.login (.zlogin) should contain commands which only need to happen at login time, not for each shell invocation. Thus it would be a bad idea to redefine your path variable in .login (.zlogin) and not in .cshrc (.tcshrc, .zshrc) because sub shells (including X Windows other than your initial login window) would not be aware of this new path.
In general, you will customise .cshrc for C shell, .tcshrc for TC shell, .zshrc for Z shell.
Notes
- C shell and TC shell will also run /etc/csh.cshrc and /etc/csh.login at login, and /etc/csh.logout at logout. These are likewise global (affect all users) and are edited by the administrator.
- Such "dot"-files are not normally displayed by ls. To view them, you should use
ls -la
Finding out which groups a given user belongs to
The command is groups [ user ... ]. To discover your own group membership, simply enter
groups
Checking how much space is occupied by a given directory tree
A good command to know is
du -k .
This outputs something like the following:
9940 ./2008.08.27/MarketDepth
9944 ./2008.08.27
9804 ./2008.08.28/MarketDepth
9808 ./2008.08.28
8452 ./2008.08.29/MarketDepth
8456 ./2008.08.29
8 ./2008.08.30/MarketDepth
12 ./2008.08.30
964 ./2008.08.31/MarketDepth
968 ./2008.08.31
58356 .
Where 58356 is the total size of the directory tree ".".
The parameter -k tells du to use units of 1 kilobyte (1024 bytes) in its output. A useful alternative is -h, which tells it to print sizes in human-readable format (e.g., 1K, 234M, 2G).
Finally, if your directory tree is very deep, it may be worth specifying --max-depth=N. This will print the total for a directory only if it is N or fewer levels below the command line argument.
The following form is particularly useful:
du -h --max-depth=1 .
This will print something like
279G ./Dir1
30G ./SomeOtherDir
69G ./Foo
624G ./Bar
Giving the totals for the first-level subdirectories.
Checking how much free space is available in a given directory (on a given mount)
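For the current directory, use
df -b .
On Solaris, this should give you something like
Filesystem avail
somesys123:/vol/apps_123/blahblahblah/user 319407932
Where 319407932 is the number of kilobytes free. The -b option is specific to Solaris; on Linux, use df -k . and read the Available column.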
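If you need the free-space figure inside a script, a portable way is to ask df for its POSIX output format and pick out the "Available" column. This is a minimal sketch; the function name free_kb is made up, and the parsing assumes that the file system name contains no spaces.

```shell
# free_kb is a hypothetical helper. -P requests the POSIX output format
# (one line per file system), -k requests units of 1024 bytes.
free_kb() {
    # Line 1 is the header; column 4 of line 2 is the available space.
    df -P -k "${1:-.}" | awk 'NR == 2 { print $4 }'
}

echo "Free space here: $(free_kb .) units of 1024 bytes"
```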
Creating a symbolic link
Somehow the syntax for this command is easily forgotten. It is:
ln -s [TARGET DIRECTORY OR FILE] ./[SHORTCUT]
For example:
ln -s /usr/local/apache/logs ./logs
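To see the argument order in action without touching real directories, here is a small sketch that works in a throwaway directory. All file and directory names in it are illustrative.

```shell
# Work in a throwaway directory; all names here are illustrative.
workdir=$(mktemp -d)
mkdir "$workdir/target_dir"
echo "hello" > "$workdir/target_dir/file.txt"

# The first argument is what the link points to; the second is the link itself.
ln -s "$workdir/target_dir" "$workdir/shortcut"

# readlink prints the target stored in the link.
readlink "$workdir/shortcut"

# Files are reachable through the link.
cat "$workdir/shortcut/file.txt"
```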
Checking if a process is listening on a specific port
E.g. if you are worried about port 19010, you can do the following:
netstat -ltnup | grep 19010
You may see something like this:
tcp 0 0 0.0.0.0:19010 0.0.0.0:* LISTEN 5060/q
Here 5060 is the process ID.
Listing the contents of a *.tar or *.tar.gz file
tar tvf filename.tar
tar tzvf filename.tar.gz
Extracting a *.tar or *.tar.gz file
gzip -d filename.tar.gz
tar xvf filename.tar
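The create, list and extract steps can be tried end to end in a scratch directory. The sketch below uses the z flag so that tar handles gzip itself, which avoids the separate gzip -d step; the directory and file names are made up for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/input/sub"
echo "alpha" > "$workdir/input/a.txt"
echo "beta"  > "$workdir/input/sub/b.txt"

# Create: -C changes into $workdir first, so the archive stores relative paths.
tar czf "$workdir/output.tar.gz" -C "$workdir" input

# List: z tells tar to read the archive through gzip.
tar tzf "$workdir/output.tar.gz"

# Extract in one step into a separate directory.
mkdir "$workdir/restored"
tar xzf "$workdir/output.tar.gz" -C "$workdir/restored"
```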
Using sort to make sense of the output of other commands, such as ps and ls
sort does exactly what it is supposed to do: sorts. Consider this example:
ps aml | sort -r -k 7 | more
Here the "process status" command (ps) lists all processes with memory information in long format, and standard output from this command is piped into sort, where it is reverse sorted (-r) on the seventh column (-k 7), and then is "piped" into more for easy viewing by the user.
Along the same lines,
ls -al | sort -g -r -k 5 | head -10
will report the 10 largest files, sorted by size, in your current directory. (-g tells it to use the general numeric sort rather than the dictionary order sort.)
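The effect of the numeric flag is easiest to see on a tiny, made-up data set. The sketch below uses -n (the POSIX numeric sort) instead of -g, which is enough for whole numbers; the sample function and its data are invented for the example.

```shell
# Made-up data: a name in column 1 and a size in column 2.
sample() {
    printf '%s\n' 'alpha 120' 'beta 9' 'gamma 1500' 'delta 42'
}

# Numeric, descending, keyed on column 2: gamma, alpha, delta, beta.
sample | sort -n -r -k 2

# Without -n the comparison is character by character,
# so "9" sorts above "1500".
sample | sort -r -k 2
```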
How can I find out the user ID of the currently logged in user? Is there a difference between who am i and whoami?
The answer to the first part of the question follows from the question itself: use who am i or whoami.
The answer to the second part of the question is, yes:
who am i
displays
bilokon pts/2 Jan 26 14:42 (foo.bar.baz.com)
whereas
whoami
displays simply
bilokon
The latter may be more useful in backticks:
echo I am `whoami`
displays
I am bilokon
I want to create a directory foo/bar/baz but I don't know whether foo and foo/bar exist
Use
mkdir -p foo/bar/baz
This will create foo, foo/bar, and then foo/bar/baz and do nothing (without complaining) if they already exist.
Obtaining shell script's full path
Can a shell script determine the full path to itself? Yes, as follows:
#!/bin/csh
echo "My path:"
echo "--------"
echo `readlink -f $0`
Or, if you want to save this path,
set DIR_THIS_SCRIPT=`readlink -f $0`
echo $DIR_THIS_SCRIPT
What if we want the path to the containing directory?
set PATH_THIS_SCRIPT=`readlink -f $0`
echo $PATH_THIS_SCRIPT
set PATH_THIS_SCRIPT_DIR=`dirname $PATH_THIS_SCRIPT`
echo $PATH_THIS_SCRIPT_DIR
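The examples above use C shell syntax. For Bourne-style shells (sh, bash, ksh, zsh) the same idea might look as follows. This is a sketch that assumes a readlink supporting -f, as GNU coreutils does; the variable names are simply lower-case versions of the ones above.

```shell
#!/bin/sh
# Assumes a readlink that supports -f (GNU coreutils does).
# The -- guards against a $0 that happens to start with a dash.
path_this_script=$(readlink -f -- "$0")
path_this_script_dir=$(dirname "$path_this_script")

echo "Script: $path_this_script"
echo "Directory: $path_this_script_dir"
```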
Finding all files in a given directory subtree that match a specified pattern
You can use something like this:
find -name "*.gz"
If you are in /foo, this will look inside /foo, /foo/bar, /foo/bar/baz, etc.
If you want to count the files, use
find -name "*.gz" | wc -l
Listing directory contents one file per line
ls -1
Finding out how much memory there is on a machine
vmstat -s -S k
This will display something like
16807600 k total memory
16743662 k used memory
1967300 k active memory
8966222 k inactive memory
63938 k free memory
117428 k buffer memory
8899985 k swap cache
18227212 k total swap
212 k used swap
18227000 k free swap
11707612 non-nice user cpu ticks
2363 nice user cpu ticks
20877755 system cpu ticks
6209150782 idle cpu ticks
6218002 IO-wait cpu ticks
136901 IRQ cpu ticks
797432 softirq cpu ticks
20214961 pages paged in
987134283 pages paged out
0 pages swapped in
52 pages swapped out
1313489012 interrupts
3283339599 CPU context switches
1217830213 boot time
99998296 forks
where the memory and swap figures are in units of 1000 (NOT 1024!) bytes.
Finding out how much memory is used by a process
Assuming that you know the PID (process ID), which you can obtain by running ps aux, you can use
pmap 12570
where you should replace 12570 with your process ID.
Finding all files whose name matches a particular pattern and which contain a given text string
You can use the following:
find . -name "*.sh" -exec grep -q "FOO" '{}' \; -print
This will list the paths of all text files whose name matches "*.sh" and which contain the string "FOO". The output of the command looks as follows:
./file1.sh
./file2.sh
./somedir/yet_another_file.sh
It's obvious from the presence of somedir in the output above that this command will process the current directory and all its subdirectories. If you want to process the current directory alone, use
find . -maxdepth 1 -name "*.sh" -exec grep -q "FOO" '{}' \; -print
If you don't care about the file name but only care about the requirement that the file contain the string "FOO" you should omit -name "*.sh":
find . -exec grep -q "FOO" '{}' \; -print
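The behaviour of the find and grep combination can be checked on a small scratch tree. The sketch below also shows an alternative that lets grep walk the tree itself; that variant assumes GNU grep, since --include is a GNU option. All file names are made up.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/somedir"
echo 'echo FOO' > "$workdir/file1.sh"
echo 'echo bar' > "$workdir/file2.sh"
echo 'echo FOO' > "$workdir/somedir/file3.sh"
echo 'FOO'      > "$workdir/notes.txt"

# find runs grep once per file; -q suppresses grep's output and -print
# lists the path only when grep found a match.
find "$workdir" -name '*.sh' -exec grep -q 'FOO' '{}' \; -print | sort

# GNU grep can do the same walk itself (assumption: --include is available).
grep -r -l --include='*.sh' 'FOO' "$workdir" | sort
```

Both commands should list file1.sh and somedir/file3.sh, but not file2.sh (no match) or notes.txt (wrong name).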
Finding files that contain both pattern 1 and pattern 2
If your pattern 1 is "FOO" and your pattern 2 is "BAR", then you can use this:
grep -l FOO *.sh | xargs grep -l BAR
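Here is the same pipeline run against three made-up files, so that the filtering is visible. Note that this simple form assumes file names without spaces, because xargs splits its input on whitespace.

```shell
workdir=$(mktemp -d)
printf 'FOO\nBAR\n' > "$workdir/both.sh"
printf 'FOO\n'      > "$workdir/only_foo.sh"
printf 'BAR\n'      > "$workdir/only_bar.sh"

# The first grep lists the files containing FOO; xargs passes that list to
# the second grep, which keeps only those that also contain BAR.
# The subshell keeps the cd from affecting the rest of the session.
( cd "$workdir" && grep -l FOO *.sh | xargs grep -l BAR )
```

Only both.sh is printed.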
Inspecting open files
Note: lsof may be under /usr/sbin, so you may have to invoke it as /usr/sbin/lsof.
Listing all open files
lsof
Listing files open for a particular user
lsof -u somelogin
where somelogin is the user's login.
Listing the users that have a particular file open
lsof /path/to/file
Nailing a process to a particular CPU
There is a useful command called taskset which retrieves or sets a process's CPU affinity.
CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honour the given CPU affinity and the process will not run on any other CPUs. Note that the Linux scheduler also supports natural CPU affinity: the scheduler attempts to keep processes on the same CPU as long as practical for performance reasons.
The CPU affinity is represented as a bitmask, with the lowest order bit corresponding to the first logical CPU and the highest order bit corresponding to the last logical CPU. Not all CPUs may exist on a given system but a mask may specify more CPUs than are present. For example,
- 0x00000001 is processor #0
- 0x00000003 is processors #0 and #1
- 0xFFFFFFFF is all processors (#0 through #31)
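If you prefer to think in CPU numbers rather than bitmasks, the mask can be computed with shell arithmetic. This is a minimal sketch; the function name cpu_mask is made up for illustration.

```shell
# cpu_mask is a hypothetical helper: it takes CPU numbers and prints the
# corresponding affinity mask in hexadecimal.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        # Set bit number $cpu: 1 shifted left by $cpu places.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%08X\n' "$mask"
}

cpu_mask 0        # prints 0x00000001 (processor #0)
cpu_mask 0 1      # prints 0x00000003 (processors #0 and #1)
```

taskset also has a -c option that accepts a list of CPU numbers directly, which avoids the mask altogether.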
To find out a process's (say PID 4211) current affinity mask:
taskset -p 4211
This will return something like
pid 4211's current affinity mask: f
(On a machine with four logical CPUs, f covers all of them, meaning that no restriction has been set.)
To set it:
taskset -p 0x00000003 4211
You can also use taskset with a mask to launch a new command:
taskset 0x00000003 du -k
You should understand that although taskset will nail your process to particular cores it doesn't do anything to block other tasks from using those cores.
You may want to look at all the heavy-hitter processes on the box and taskset them appropriately too.
Learning more about a particular process by its PID
First, you can discover the PID of a process by running something like
ps aux
or
top
Once you have the PID, it may be a good idea to explore the /proc hierarchy. It is a virtual file system, sometimes referred to as a process information pseudo-file system. It doesn't contain "real" files but runtime information. Many system utilities are simply calls to files in this directory. E.g. lsmod is the same as cat /proc/modules.
Anyway, assuming your PID is 4211,
cd /proc/4211
ls
You will see a number of entries:
attr cwd fd mem numa_maps stat task auxv environ loginuid mounts root statm wchan cmdline exe maps mountstats smaps status
The most important of these are:
- /proc/PID/cmdline — command line arguments.
- /proc/PID/cwd — link to the current working directory.
- /proc/PID/environ — values of environment variables.
- /proc/PID/exe — link to the executable of this process.
- /proc/PID/fd — directory, which contains all file descriptors.
- /proc/PID/mem — memory held by this process.
- /proc/PID/status — process status in human readable form.
So you can simply
cat /proc/4211/status
And you will see something like this:
Name: runRemoteScript
State: R (running)
SleepAVG: 0%
Tgid: 4211
Pid: 4211
PPid: 1
TracerPid: 0
Uid: 59873 59873 59873 59873
Gid: 2198 2198 2198 2198
FDSize: 256
Groups: 2198
VmSize: 19484 kB
VmLck: 0 kB
VmRSS: 11228 kB
VmData: 12244 kB
VmStk: 28 kB
VmExe: 768 kB
VmLib: 10684 kB
StaBrk: 08112000 kB
Brk: 088e1000 kB
StaStk: ffffd620 kB
Threads: 1
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000006
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
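When you only need one field from the status file, awk can pick it out. The following is a sketch that assumes a Linux /proc file system; the function name proc_field is made up for illustration.

```shell
# proc_field is a hypothetical helper; it assumes a Linux /proc file system.
# Usage: proc_field PID FIELD, e.g. proc_field 4211 VmRSS
proc_field() {
    # Match the line whose first field is "FIELD:", drop that label,
    # trim the leading space left behind and print the rest.
    awk -v field="$2:" '$1 == field { $1 = ""; sub(/^ +/, ""); print }' "/proc/$1/status"
}

proc_field $$ Name     # name of the current shell
proc_field $$ VmRSS    # resident memory of the current shell
```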
Basic user and group management
To list all users:
cat /etc/passwd
You will see "proper" users, like
paul:x:500:10::/home/paul:/bin/bash
saeed:x:501:10::/home/saeed:/bin/bash
thomas:x:502:10::/home/thomas:/bin/bash
but also "system" users:
apache:x:48:48:Apache:/var/www:/sbin/nologin
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash
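To list the groups that you (the current user) belong to:
groups
You will see something like
wheel thalesians_team
The full list of groups defined on the system is kept in /etc/group, which you can view with cat /etc/group.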
Create a new user:
sudo /usr/sbin/adduser nikolai
We have just created a user named nikolai.
Set the user's password:
sudo passwd nikolai
Adding an (existing) user to an (existing) group:
sudo /usr/sbin/usermod -a -G thalesians_team nikolai
The user nikolai has been added to the group named thalesians_team.
To list the user's groups:
groups nikolai
You may see something like this:
nikolai : nikolai thalesians_team
Notice that by default nikolai's primary group is nikolai.
To change the user's primary group:
sudo /usr/sbin/usermod -g thalesians_team nikolai
Now if you issue the command
groups nikolai
you will see
nikolai : thalesians_team
The first group listed is the primary group.
To delete a user:
sudo /usr/sbin/userdel -r nikolai
-r tells userdel to remove the home directory and mail spool, too.
Replacing a string in a binary file
If you need to replace pattern with its replacement, whether in a text or binary file, you can use perl:
perl -pi -e 's/pattern/replacement/g' file1
Output readability tricks
We shall list information about the files in the current directory, sorting them by modification time. This is done by
ls -t
Chances are this has produced too much output if there are too many files. We may need to filter the results.
If we want to display the first 10 results, we apply
ls -t | head -n 10
This will give us the 10 files that were modified most recently.
If we want to display the last 10 results, we apply
ls -t | tail -n 10
This will give us the 10 files that were modified least recently.
Returning the context of grep's match
Suppose that we have a file example.txt:
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
Let's see what happens if we grep it for Thu:
$ grep Thu example.txt
Thursday
We have found Thursday, but this may not be enough for us. Suppose that we want to see the context too, i.e. a line or two that surround the matching line.
$ grep -C 1 Thu example.txt
Wednesday
Thursday
Friday
This has returned the matching line as well as the one before and the one that follows. Similarly
$ grep -C 2 Thu example.txt
Tuesday
Wednesday
Thursday
Friday
Saturday
What does the output look like if we are processing multiple files? In order to test this, we create a copy of example.txt:
$ cp example.txt example_copy.txt
Then
$ grep -C 1 Thu example*.txt
produces this output:
example_copy.txt-Wednesday
example_copy.txt:Thursday
example_copy.txt-Friday
--
example.txt-Wednesday
example.txt:Thursday
example.txt-Friday
As you can see, the filename appears on each line and a line containing "--" is placed between contiguous groups of matches.
Manipulating the output of Unix commands
Suppose you would like to add up the sizes of several directories:
$ du -k ./2009.02.26/*
which produces the output
12 ./2009.02.26/Foo
12 ./2009.02.26/Bar
8 ./2009.02.26/Baz
We need to add up the numbers in the first column. But first, how do we get the first column? We can use cut:
$ du -k ./2009.02.26/* | cut -f 1
which gives
12
12
8
We can now use awk (a simple pattern scanning and processing language) to add these up:
$ du -k ./2009.02.26/* | cut -f 1 | awk '{a+=$1}END{print a}'
This prints
32
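Since awk splits each input line into fields by itself, the cut step is optional. The sketch below reproduces the sum on made-up input that imitates the du output shown above; the sample function exists only for the example.

```shell
# Made-up du-style input: a size, a tab, then a path.
sample() {
    printf '12\t./2009.02.26/Foo\n12\t./2009.02.26/Bar\n8\t./2009.02.26/Baz\n'
}

# awk splits each line into fields itself, so $1 is already the first column.
sample | awk '{ total += $1 } END { print total }'
```

This prints 32, the same total as the longer pipeline.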
su, sudo and sudoers
su enables users to run a shell with substitute user and group IDs. In other words, it enables the users to work as someone else:
su
and I magically become root (but I'll have to enter the root's password).
su -l matthew
And I start working as matthew:
[paul@Thalesians ~]$ su -l matthew
Password:
[matthew@Thalesians ~]$
until I do
[matthew@Thalesians ~]$ exit
[paul@Thalesians ~]$
sudo also offers a lot of power. I can remain paul while issuing root commands. All I need to do is prefix them with sudo.
E.g. I would need to be logged in as root to do this:
/usr/sbin/userdel -r nikolai
However, I could remain paul and do
sudo /usr/sbin/userdel -r nikolai
But that's provided that I'm allowed, i.e. that I'm a sudoer. Not everyone is a sudoer. Also, some sudoers will have to enter a password anyway (by default their own, not root's), others won't. This is all determined by the sudoers file, which you can edit safely with
sudo /usr/sbin/visudo
Adding a password to Apache
sudo htpasswd -b /etc/httpd/passwd/passwords jayson xxxxxxxx
You will see
Adding password for user jayson
Killing a process
There is more than one way to kill a process.
Some users are accustomed to
kill -9 pid
where pid is the process ID. However, this may not be a great idea. Here is why.
kill sends a signal to a process. In *nix there are many different types of signals that are sent to processes. They serve various purposes. The process may choose to ignore the signal or trap it and execute a signal handler. Untrapped signals have a default action which may be "do nothing" or "exit". There are two signals that are untrappable: SIGKILL and SIGSTOP.
You can examine the list of all the signals by issuing
kill -l
This will show you something like
HUP INT QUIT ILL TRAP ABRT BUS FPE KILL USR1 SEGV USR2 PIPE ALRM TERM STKFLT CHLD CONT STOP TSTP TTIN TTOU URG XCPU XFSZ VTALRM PROF WINCH POLL PWR SYS RTMIN RTMIN+1 RTMIN+2 RTMIN+3 RTMAX-3 RTMAX-2 RTMAX-1 RTMAX
The most commonly used signals are:
- SIGTERM (15, terminate)
- SIGINT (2, interrupt)
- SIGKILL (9, kill, untrappable!)
Most programs perform various cleanup steps on exit (e.g. delete temporary files, shut down sockets, remove shared memory segments, close open files). They typically trap SIGTERM and SIGINT to do their cleanup duties. However, they cannot trap SIGKILL.
kill -9 on MySQL may corrupt your database. Similarly, your sockets probably won't be properly shut down.
To avoid data corruption, you should follow the following routine to shut down a process:
Check that the process is running and get its PID:
ps aux | grep processname
where processname is the name of the process.
kill pid
where pid is the process ID, will send SIGTERM. Wait 5 seconds.
kill pid
try again. Wait 5 seconds.
kill -INT pid
will send SIGINT. Wait 5 seconds.
kill -INT pid
try again. Wait 5 seconds.
kill -KILL pid
(this is the same as kill -9 pid). If this doesn't work, something is very, very wrong. You may try
kill -KILL pid
again.
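The routine above can be wrapped in a small function so that the escalation is not done by hand each time. This is a sketch, not a definitive implementation: the name stop_process and its optional second argument (seconds to wait, defaulting to 5) are made up for illustration.

```shell
# stop_process is a hypothetical helper that walks through the routine above.
# Usage: stop_process PID [SECONDS_TO_WAIT]
stop_process() {
    pid=$1
    delay=${2:-5}
    for sig in TERM TERM INT INT KILL KILL; do
        # kill -0 sends no signal; it only tests whether the process can be
        # signalled. If it cannot (gone, or not ours), there is nothing to do.
        kill -0 "$pid" 2>/dev/null || return 0
        kill -s "$sig" "$pid" 2>/dev/null
        sleep "$delay"
    done
    # Report failure if the process is somehow still there.
    ! kill -0 "$pid" 2>/dev/null
}

# Example (not run here): stop_process 4211
```

The function returns as soon as the process has gone, so a well-behaved program that exits on the first SIGTERM is never sent the harsher signals.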
Acknowledgement: This advice is based on http://speculation.org/garrick/kill-9.html by Garrick Staples.
Colours in the shell prompt
It is possible to use colours in your shell prompt.
Here is a quick recipe. For more information, please see http://tldp.org/HOWTO/Bash-Prompt-HOWTO/x405.html and http://www.cyberciti.biz/faq/bash-shell-change-the-color-of-my-shell-prompt-under-linux-or-unix/
echo "* Configuring shell prompt"
BLUE=`tput setf 1`
GREEN=`tput setf 2`
CYAN=`tput setf 3`
RED=`tput setf 4`
MAGENTA=`tput setf 5`
YELLOW=`tput setf 6`
WHITE=`tput setf 7`
PS1="\[$MAGENTA\]\@ \[$RED\]\u@\h \[$CYAN\]\w \[$YELLOW\]\! \[$WHITE\]$ "
Your shell prompt will look like this:
- 09:36 AM paul@host01 ~/dev 963 $ emacs