tar
The tar command stands for tape archive. It was
originally intended for creating backup tapes on a magnetic tape drive.
Creating a tar file
Let’s look at some text files which we would like to tar:
[linux@localhost ~]$ ls -l *.txt
-rw-r--r-- 1 linux linux 17632 Jul 1 10:34 amendments.txt
-rw-r--r-- 1 linux linux 4261 Jul 1 10:35 bill_of_rights.txt
-rw-r--r-- 1 linux linux 27217 Jul 1 10:34 constitution.txt
-rw-r--r-- 1 linux linux 9241 Jul 1 10:31 declaration.txt
You give the tar command these options:
-c, which stands for create a new file.

-v, verbose, which provides detailed output of what is going on. You don’t need this option, but it is good to use when you have a small number of files and want to see what is happening.

-f, which says that the name of the output file follows. If you do not give this option, tar writes to its default destination, which on many Linux systems is standard output.
These options are followed by the name of the output file, and then
the names of the files you want to put into the tar file:
[linux@localhost ~]$ tar -c -v -f documents.tar *.txt
amendments.txt
bill_of_rights.txt
constitution.txt
declaration.txt
Note: as with most Linux commands, you can combine the options into a single group:
tar -cvf documents.tar *.txt
Let’s take a look at the resulting file:
[linux@localhost ~]$ ls -l documents.tar
-rw-r--r-- 1 linux linux 71680 Jul 1 10:41 documents.tar
If you add up the sizes of all the individual files, you will see that
they add up to only 58,351 bytes. The documents.tar file
is larger than that because tar has to add extra information
about each file (its size, modification time, and so forth) so that it can
extract the files later, and because it pads the data to fixed-size blocks.
As you can see, this overhead is significant when
combining a few small files.
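The 71,680-byte figure can be reproduced with a little arithmetic. The sketch below assumes the usual tar layout: one 512-byte header per file, file data rounded up to whole 512-byte blocks, two empty blocks at the end of the archive, and a total padded to a multiple of 10,240 bytes (the default record size in GNU tar). These layout details are assumptions about how this particular archive was written; the file sizes come from the listing above.

```shell
#!/bin/sh
# File sizes taken from the ls -l listing of the four text files.
sizes="17632 4261 27217 9241"
block=512       # tar block size
record=10240    # assumed default record size (20 blocks)

total=0
for size in $sizes; do
    # One header block, plus the data rounded up to whole blocks.
    data_blocks=$(( (size + block - 1) / block ))
    total=$(( total + block + data_blocks * block ))
done

# Two zero-filled blocks mark the end of the archive.
total=$(( total + 2 * block ))

# Pad the archive to a whole number of records.
archive_size=$(( (total + record - 1) / record * record ))
echo "$archive_size"
```

Under those assumptions the script prints 71680, which matches the size that ls reported for documents.tar.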
tar also lets you compress the resulting file with either
compress,
gzip, or bzip2. You could do this with several
commands or with a pipe, but it is easy to do by adding an option.
| To use... | Use option | File name ends with |
|---|---|---|
| compress | -Z | .tar.Z |
| gzip | -z | .tar.gz or .tgz |
| bzip2 | -j | .tar.bz2 |
Here is the result of using the three forms of compression, and the
resulting file sizes. To avoid repetitious output,
we did not use the -v option.
[linux@localhost ~]$ tar -cZf documents.tar.Z *.txt
[linux@localhost ~]$ tar -czf documents.tar.gz *.txt
[linux@localhost ~]$ tar -cjf documents.tar.bz2 *.txt
[linux@localhost ~]$ ls -l documents.*
-rw-r--r-- 1 linux linux 71680 Jul 1 10:41 documents.tar
-rw-r--r-- 1 linux linux 15753 Jul 1 11:02 documents.tar.bz2
-rw-r--r-- 1 linux linux 18420 Jul 1 11:01 documents.tar.gz
-rw-r--r-- 1 linux linux 24685 Jul 1 11:01 documents.tar.Z
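For comparison, here is a rough sketch of the pipe approach mentioned earlier, next to the single-option form. It works in a scratch directory with two small stand-in files; the file names and contents are made up for illustration. Giving - as the file name after -f tells tar to write the archive to standard output, where gzip can pick it up.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'We the People\n' > constitution.txt
printf 'When in the Course of human events\n' > declaration.txt

# One step: let tar run gzip itself.
tar -czf one_step.tar.gz *.txt

# Pipe: write the archive to standard output (-f -) and compress it there.
tar -cf - *.txt | gzip > piped.tar.gz

# Both archives should list the same members.
one_step_list=$(tar -tzf one_step.tar.gz)
piped_list=$(tar -tzf piped.tar.gz)
echo "$one_step_list"
echo "$piped_list"
```

The two archives hold the same files; the -z form simply saves you from typing the pipe.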
Listing the contents of a tar file
Use the -t option to see what is inside a tar
file without having to extract the files:
[linux@localhost ~]$ tar -tvzf documents.tar.gz
-rw-r--r-- linux/linux 17632 2008-07-01 10:34 amendments.txt
-rw-r--r-- linux/linux 4261 2008-07-01 10:35 bill_of_rights.txt
-rw-r--r-- linux/linux 27217 2008-07-01 10:34 constitution.txt
-rw-r--r-- linux/linux 9241 2008-07-01 10:31 declaration.txt
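Without -v, the -t option prints only the member names, one per line, which makes it easy to check whether a particular file is in an archive before you extract anything. The following is a minimal sketch of that idea; the scratch directory and file contents are stand-ins created only for the example.

```shell
#!/bin/sh
# Build a small archive in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Amendment I\n' > amendments.txt
printf 'We the People\n' > constitution.txt
tar -czf documents.tar.gz amendments.txt constitution.txt

# -t lists member names, one per line; nothing is extracted.
members=$(tar -tzf documents.tar.gz)
echo "$members"

# grep -qxF succeeds only if a whole line equals the name exactly.
if echo "$members" | grep -qxF 'constitution.txt'; then
    found=yes
else
    found=no
fi
echo "constitution.txt in archive: $found"
```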
Extracting a tar file
Use the -x option to extract the files. Beware: tar will
overwrite existing files without asking you.
Let’s take another look at documents.tar.gz; the
.gz at the end
tells you that it was compressed using gzip. To unpack the
files, you would use this command:
[linux@localhost ~]$ tar -xvzf documents.tar.gz
The options are: x (extract), v (verbose; give
me lots of output), z (use gunzip), f (a file
name follows).
What if someone sent you a file named
reports.tar.bz2? The bz2 would tell you that
the file had been compressed using bzip2, so you would have
to extract the files with a command like this:
[linux@localhost ~]$ tar -xvjf reports.tar.bz2
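The rule of thumb in the last two examples (look at the end of the file name, then pick the matching option) can be written down as a small shell function. This is only an illustrative sketch: tar_flag_for is a hypothetical helper name, not part of tar, and the suffixes are the ones from the table of compression options.

```shell
#!/bin/sh
# Hypothetical helper: print the tar compression option implied by a
# file name, following the suffix table for compress, gzip, and bzip2.
tar_flag_for() {
    case "$1" in
        *.tar.gz|*.tgz) printf '%s\n' "-z" ;;   # gzip
        *.tar.bz2)      printf '%s\n' "-j" ;;   # bzip2
        *.tar.Z)        printf '%s\n' "-Z" ;;   # compress
        *.tar)          printf '\n'        ;;   # plain tar file, no option
        *)              return 1           ;;   # unrecognized suffix
    esac
}

tar_flag_for reports.tar.bz2
tar_flag_for documents.tar.gz
```

The two calls print -j and -z, the same options used in the commands above.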
You can use the -k
option to keep existing files when extracting. If tar sees a file
that already exists, it will not overwrite it. Here is what happens if you use that
option:
[linux@localhost ~]$ tar -xvkzf documents.tar.gz
amendments.txt
tar: amendments.txt: Cannot open: File exists
tar: Skipping to next header
bill_of_rights.txt
tar: bill_of_rights.txt: Cannot open: File exists
tar: Skipping to next header
etc
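To see the effect of -k for yourself, you can try something like the following sketch. It archives a file, edits the file afterwards, and then extracts with -k. The file contents are made up for the example. Because some versions of tar report the skipped files as errors, the script does not treat a non-zero exit status as fatal.

```shell
#!/bin/sh
# Archive one file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'original\n' > amendments.txt
tar -czf documents.tar.gz amendments.txt

# Change the file after archiving it.
printf 'edited after archiving\n' > amendments.txt

# With -k, tar leaves the existing file alone. It may complain that the
# file exists and return a non-zero status, so that case is handled here.
tar -xkzf documents.tar.gz 2>/dev/null || echo "tar skipped existing files"

# The edited version is still on disk.
kept=$(cat amendments.txt)
echo "$kept"
```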
The -k option is “all or nothing.” Let’s say you
have this scenario:
You make a tar of files a.txt, b.txt, and
c.txt:
tar -cvzf collection.tgz a.txt b.txt c.txt
You add some new material to b.txt, and then later decide you want
to un-tar collection.tgz. If you don’t use any option, you will
overwrite the newer b.txt; if you use -k, none of the files
will be overwritten. The solution to this problem is the
--keep-newer-files option. It will overwrite a.txt and
c.txt, but it will not overwrite b.txt, because it is
newer than the one in the collection.tgz tar file.
In short, the --keep-newer-files option will overwrite existing
files unless they are newer than the ones in the tar file.
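The scenario can be tried out with a short script like the one below. It is a sketch that assumes a tar that supports --keep-newer-files (GNU tar does). The file contents are invented, the touch command gives the files an old timestamp so that the later edit to b.txt is clearly newer, and a.txt is deleted to show that files missing from disk are still restored.

```shell
#!/bin/sh
# Create the three files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a v1\n' > a.txt
printf 'b v1\n' > b.txt
printf 'c v1\n' > c.txt

# Give the files an old timestamp (1 July 2008, 10:34), then archive them.
touch -t 200807011034 a.txt b.txt c.txt
tar -czf collection.tgz a.txt b.txt c.txt

# New material goes into b.txt; its modification time becomes "now".
printf 'b v2, new material\n' >> b.txt

# a.txt is deleted by accident.
rm a.txt

# Extract, but leave alone any file on disk that is newer than its archived copy.
tar --keep-newer-files -xzf collection.tgz || echo "tar exited with status $?"

restored_a=$(cat a.txt)
current_b=$(cat b.txt)
echo "$restored_a"
echo "$current_b"
```

After the script runs, a.txt is back with its archived contents, and b.txt still has the new material.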