Tag Archives: bzip2

Running cat on a number of files and want to know which file it is working on?

In the following example, I have 21 files that came from a raw partition image that I split at 10GB intervals. I concatenate them with cat, pipe the stream through parallel bzip2 (pbzip2) to decompress it, and write the result to a raw partition (logical volume).

cat /mnt/DBADEV1/DBADEV1.disk.bz2.00 /mnt/DBADEV1/DBADEV1.disk.bz2.01 \
/mnt/DBADEV1/DBADEV1.disk.bz2.02 /mnt/DBADEV1/DBADEV1.disk.bz2.03 \
/mnt/DBADEV1/DBADEV1.disk.bz2.04 /mnt/DBADEV1/DBADEV1.disk.bz2.05 \
/mnt/DBADEV1/DBADEV1.disk.bz2.06 /mnt/DBADEV1/DBADEV1.disk.bz2.07 \
/mnt/DBADEV1/DBADEV1.disk.bz2.08 /mnt/DBADEV1/DBADEV1.disk.bz2.09 \
/mnt/DBADEV1/DBADEV1.disk.bz2.10 /mnt/DBADEV1/DBADEV1.disk.bz2.11 \
/mnt/DBADEV1/DBADEV1.disk.bz2.12 /mnt/DBADEV1/DBADEV1.disk.bz2.13 \
/mnt/DBADEV1/DBADEV1.disk.bz2.14 /mnt/DBADEV1/DBADEV1.disk.bz2.15 \
/mnt/DBADEV1/DBADEV1.disk.bz2.16 /mnt/DBADEV1/DBADEV1.disk.bz2.17 \
/mnt/DBADEV1/DBADEV1.disk.bz2.18 /mnt/DBADEV1/DBADEV1.disk.bz2.19 \
/mnt/DBADEV1/DBADEV1.disk.bz2.20 | pbzip2 -dcv -p4 > /dev/mapper/VG_VMH1-LV_DBADEV1

Output:

Parallel BZIP2 v1.0.5 - by: Jeff Gilchrist [http://compression.ca]
[Jan. 08, 2009]  (uses libbzip2 by Julian Seward)

         # CPUs: 4
-------------------------------------------
         File #: 1 of 1
     Input Name: 
    Output Name: 

 BWT Block Size: 900k
Decompressing data (no threads)...
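
The long file list in the cat command above can usually be shortened with a glob, since the shell expands globs in sorted order and the piece suffixes are zero-padded. The following is a minimal sketch that demonstrates the ordering on throwaway files; the scratch directory and the "piece NN" contents are made up for the demo and are not part of the original setup.

```shell
# Hypothetical scratch directory; the real pieces live in /mnt/DBADEV1.
demo_dir=$(mktemp -d)
for i in 00 01 02 10 20; do
    printf '%s\n' "piece $i" > "$demo_dir/DBADEV1.disk.bz2.$i"
done

# The glob expands in sorted order, so zero-padded suffixes stay in sequence.
joined=$(cat "$demo_dir"/DBADEV1.disk.bz2.??)
printf '%s\n' "$joined"

# With the real files, the equivalent of the command above would be:
#   cat /mnt/DBADEV1/DBADEV1.disk.bz2.?? | pbzip2 -dcv -p4 > /dev/mapper/VG_VMH1-LV_DBADEV1
```

This only works because every suffix has the same width; pieces named .1, .2, ... .10 would not sort correctly.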

This will take a while, so let’s determine which file it is currently working on:

lsof|grep DBADEV1
cat 24137 root 3r REG 8,1 10737418240 80 /mnt/DBADEV1/DBADEV1.disk.bz2.07

Oh boy, it’s only on file #8 (suffix 07, counting from 00). Oh well, we can keep an eye on it more easily with the “watch” command, set to rerun the lsof command every 5 seconds:

watch -n5 'lsof|grep DBADEV1'
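
To turn the lsof line into a rough progress figure, the piece suffix can be pulled out with shell parameter expansion. This is only a sketch: the sample line is copied from the lsof output above, the total of 21 comes from the text, and the variable names are made up here.

```shell
# Sample lsof line copied from the output above.
line='cat 24137 root 3r REG 8,1 10737418240 80 /mnt/DBADEV1/DBADEV1.disk.bz2.07'
total=21                          # number of pieces, from the text

suffix=${line##*.}                # everything after the last dot: "07"
index=${suffix#0}                 # drop one leading zero so it is not read as octal
current=$((index + 1))            # pieces are numbered from 00
percent=$((index * 100 / total))  # share of pieces already fully read

echo "cat is on piece $current of $total ($percent% of pieces finished)"
```

The percentage only counts whole pieces, so it lags behind the real position by up to one piece.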

Splitting a GNU tar archive across multiple files

Create tar archive files no larger than 31 GBytes each:

tar -cv -M -L 32505856 -f backup.tar ~jfroebe
  • -c create tar archive
  • -v verbose output
  • -M enable multi-volume handling (multiple tapes or files)
  • -L maximum size of each file, in units of 1024 bytes (32505856 = 31 x 1024 x 1024)
  • -f name of first tar archive file
  • ~jfroebe directory that I wish to backup
  • when 31 GBytes is reached, tar will ask you to insert the next file; answer with:
  • n backup.tar2 <ENTER>Y<ENTER>
  • n: the next ‘word’ is the name of the new file (or tape drive)
  • Y<ENTER>: tar will ask you to change to the file ‘backup.tar2’ and confirm that it should proceed. Since it is a new file, go right ahead and tell it to proceed
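
The value passed to -L is just the volume size expressed in units of 1024 bytes. A quick way to work it out for a given size in GBytes (the variable names are only for illustration):

```shell
# Volume size in GBytes (binary, 1 GByte = 1024 * 1024 units of 1024 bytes).
gib=31
tape_length=$((gib * 1024 * 1024))

# This is the number to pass to tar -L.
echo "$tape_length"
```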

Restore the contents of a multi-volume tar file:

tar -xv -M -f backup.tar -f backup.tar2
  • tar accepts multiple files in the restore, the only requirement being that they are listed in order. For example, tar won’t be able to restore all of the data if you do:
tar -xv -M -f backup.tar2 -f backup.tar
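
GNU tar also accepts several -f options when creating a multi-volume archive and uses them in sequence, which avoids the interactive prompt as long as enough names are supplied. Below is a small sketch of a full create and restore round trip, assuming GNU tar; the scratch directory, the file names and the tiny 20 KByte volume size are chosen only to keep the demo small.

```shell
# Demo only: tiny volumes in a scratch directory (all names are hypothetical).
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
head -c 25600 /dev/urandom > "$work/data/blob.bin"   # 25 KBytes of test data

# Create: -L 20 caps each volume at 20 KBytes; the -f names are used in order.
# Three names are given although two volumes should be enough here.
tar -c -M -L 20 -f "$work/vol1.tar" -f "$work/vol2.tar" -f "$work/vol3.tar" \
    -C "$work" data < /dev/null

# Restore: list the volumes in the same order they were written.
tar -x -M -f "$work/vol1.tar" -f "$work/vol2.tar" \
    -C "$work/restore" < /dev/null

# Compare the restored file with the original.
cmp "$work/data/blob.bin" "$work/restore/data/blob.bin" && echo "restore matches"
```

If tar runs out of -f names it falls back to prompting, so it is worth supplying more names than you expect to need.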

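You can also use compression, but not together with -M: current GNU tar releases refuse that combination with the error “Cannot use multi-volume compressed archives”, so a compressed archive has to be a single file:

  • Example:
  • tar -cv -z -f backup.tar.gz ~jfroebe
    tar -xv -z -f backup.tar.gz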
  • There are three methods of compressing with GNU tar; each requires that the corresponding program is installed
    • bzip2
      • -j parameter
      • bzip2 software package (bzip2)
    • gzip
      • -z parameter
      • gzip software package (gzip/gunzip)
    • compress
      • -Z parameter
      • ncompress software package (compress/uncompress)
  • While bzip2 typically provides the best compression, gzip is far more common in corporate environments.
  • ‘compress’ typically provides the weakest compression of the three, but it is available on virtually all commercial Unix boxes
  • GNU tar is also available for Windows and works just the same as the Unix versions. Using tar is a great way of transferring a large file or a whole bunch of files to/from Windows without having to worry about changes to the file names, which can sometimes happen when they contain Unicode and/or extended characters.
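
One way to get compressed, size-limited pieces without relying on -M is to pipe tar's compressed output through split and later rejoin the pieces with cat, much like the split partition image in the first post. A minimal sketch, assuming tar, gzip and split are installed; the scratch directory, file names and the 4096-byte piece size are made up for the demo.

```shell
# Demo only: scratch directory with some incompressible test data.
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
head -c 20000 /dev/urandom > "$work/data/blob.bin"

# Create: write the gzip-compressed archive to stdout and cut it into pieces.
# split names the pieces backup.tar.gz.aa, backup.tar.gz.ab, and so on.
tar -c -z -f - -C "$work" data | split -b 4096 - "$work/backup.tar.gz."
pieces=$(ls "$work"/backup.tar.gz.* | wc -l)
echo "wrote $pieces pieces"

# Restore: the glob returns the pieces in sorted order, so cat rebuilds the stream.
cat "$work"/backup.tar.gz.* | tar -x -z -f - -C "$work/restore"

cmp "$work/data/blob.bin" "$work/restore/data/blob.bin" && echo "restore matches"
```

For real backups the piece size would be something like the 31 GBytes used earlier rather than 4096 bytes.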
