combining gunzip and tar commands in Solaris and AIX - unix

I am running the below command to untar a file in Solaris and AIX:
# gunzip /opt/myfile.tar.gz | tar -xvf-
but I'm getting this error:
tar: Unexpected end-of-file while reading from the storage media.
What do I need to fix?

Why should this work? By default, gunzip unpacks the file in place, replacing the compressed file with the uncompressed one, and you didn't specify the option needed to send the uncompressed data stream to stdout. So the tar command receives nothing through the pipe to process, which is why you get the error message you have seen.
This will work:
gunzip -c ../myfile.tar.gz | tar -xfv -
This command line was tested on Solaris 11.3. Older variants of Solaris may need the options in a different order, like
gunzip -c ../myfile.tar.gz | tar -xvf -

I think something like this should work, but I don't have a Solaris system to test it on:
gzip -dc /opt/myfile.tar.gz | tar xvf -
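The difference between the failing and the working pipeline can be reproduced on any system that has gzip and tar. The sketch below is not taken from the answers above: it builds a throwaway archive in a temporary directory (the names demo, hello.txt and myfile.tar.gz are made up for illustration, and mktemp -d is assumed to be available) and extracts it with gzip -dc so that the decompressed stream actually reaches tar.

```shell
#!/bin/sh
# Sketch: decompress to stdout and let tar read the stream from the pipe.
# All file and directory names here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"

# Build a small sample archive.
mkdir demo
echo "hello" > demo/hello.txt
tar cf myfile.tar demo
gzip myfile.tar            # replaces myfile.tar with myfile.tar.gz
rm -r demo

# -d decompresses, -c writes the result to stdout and keeps the .gz file.
gzip -dc myfile.tar.gz | tar xf -

extracted=$(cat demo/hello.txt)
echo "extracted: $extracted"
```

Because -c leaves myfile.tar.gz in place, the same command can be run again, which is not the case after a plain gunzip.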

pdftk, copying files without taking comments and annotations

I have many PDF files that contain comments and annotations made with Adobe Acrobat Reader. Removing the comments manually while copying these files would take many hours.
Does PDFtk provide commands to copy files without taking comments and annotations?
You can do this with:
cpdf -remove-annotations in.pdf -o out.pdf
One helpful solution is:
$ LC_CTYPE=C && LANG=C
$ pdftk in.pdf output - uncompress | sed '/^\/Annots/d' | pdftk - output out.pdf compress
The out.pdf has no comments and annotations.
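To process a whole directory with bash on macOS:
# Export the locale so that sed inherits it and does not choke on binary bytes.
export LC_CTYPE=C LANG=C
paperList=papers.txt
rm -f "${paperList}"
# List only the PDFs, so that papers.txt itself does not end up in the list.
ls -- *.pdf > "${paperList}"
saveDir=../temp_without_annon
mkdir -p "${saveDir}"
while IFS= read -r line
do
pdftk "${line}" output - uncompress | sed '/^\/Annots/d' | pdftk - output "${saveDir}/${line}" compress
done < "${paperList}"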
References
How to install pdftk on Mac OS X
https://stackoverflow.com/a/49614525/5046896

Input output redirection UNIX

Say I have a file called package.tar.gz
Then I do:
cat package.tar.gz | gzip -d | tar tvf -
and it shows me the list of files in my tar archive.
However if I do:
gzip -d package.tar.gz | tar tvf -
It says tar: This does not look like a tar archive
I don't understand why that is. If the result of gzip -d in the first case returns output which can be interpreted as a tar archive, why won't it work in the second case?
I have seen Autotools - tar This does not look like a tar archive but I'm not convinced that it's an issue with tar in my case since the first command works...
It looks to me like you're not passing the -d option in the second case. From the manpage:
Compressed files can be restored to their original form using gzip -d or gunzip or zcat.
What's probably most appropriate for that style is zcat which is just what it sounds like - gunzip + cat.
GNU tar will decompress the file directly:
tar -xf package.tar.gz
It automatically detects which decompressor to use (gzip, bzip2, xz, lzip, etc).
If your tar won't handle the decompression, then gzip -cd decompresses to standard output:
gzip -cd package.tar.gz | tar -xf -
The -c option means write to standard output and leave the original file untouched; the -d option means decompress. You could also use gunzip -c in place of gzip -cd. This is 'standard' behaviour for compression programs.
Is this what you want to do?
gunzip -c package.tar.gz | tar xvf -
Or
gzip -cd package.tar.gz | tar xvf -
Basically,
gzip -d package.tar.gz
will not output to standard out, which
tar tvf -
expects. The result of
gzip -d package.tar.gz
is that the file is decompressed in place as a side effect, and nothing is written to standard output. You need to use
gzip -dc package.tar.gz | tar tvf -
to get the desired effect.
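The first command in the question works for a related reason: when gzip is given no file argument it reads standard input and, having no file to replace, writes the result to standard output. A minimal sketch of that behaviour, assuming a scratch directory created with mktemp -d and made-up file names:

```shell
#!/bin/sh
# Sketch: compare what reaches stdout in each invocation style.
set -eu
work=$(mktemp -d)
cd "$work"
echo "some data" > notes.txt
tar cf package.tar notes.txt
gzip package.tar                     # leaves only package.tar.gz

# Reading from stdin: gzip has no file to replace, so it writes to stdout.
from_stdin=$(gzip -d < package.tar.gz | tar tf -)

# With a file argument and -c: same stream, original file kept.
from_file=$(gzip -dc package.tar.gz | tar tf -)

# With a file argument and no -c: nothing goes to stdout; the file is
# replaced by the decompressed one on disk. Work on a copy here.
cp package.tar.gz copy.tar.gz
bytes=$(gzip -d copy.tar.gz | wc -c | tr -d ' ')

echo "stdin: $from_stdin, -c: $from_file, in-place stdout bytes: $bytes"
```

The redirect form (gzip -d < file) is equivalent to the cat pipeline in the question, without the extra process.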

Installing Pear, what did I do by entering these commands on my terminal

I'm trying to figure out how to install Pear on my Mac (10.6.6).
Not understanding what they're telling me at pear.php.net, I got some code from http://clickontyler.com/blog/2008/01/how-to-install-pear-in-mac-os-x-leopard/
First, I entered curl http://pear.php.net/go-pear > go-pear.php in my terminal.
It resulted in this output
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 88004  100 88004    0     0  47537      0  0:00:01  0:00:01 --:--:-- 59744
What does that all mean? Am I on the right track?
Next, I entered sudo php -q go-pear.php
and it gave me the long output below. In short I have no idea where I am in the installation process. However, I'm pretty sure that I'm not where I'm supposed to be at following the tutorial at http://clickontyler.com/blog/2008/01/how-to-install-pear-in-mac-os-x-leopard/
because the tutorial tells me to select all the default choices, and I don't see any options to select.
The next line of code is asking me to modify the php.ini files and it requires a password so I'm worried about doing it...Can anyone tell me if I'm on the right track?
sudo cp /etc/php.ini.default /etc/php.ini
Usage: php [options] [-f] <file> [--] [args...]
php [options] -r <code> [--] [args...]
php [options] [-B <begin_code>] -R <code> [-E <end_code>] [--] [args...]
php [options] [-B <begin_code>] -F <file> [-E <end_code>] [--] [args...]
php [options] -- [args...]
php [options] -a
-a Run interactively
-c <path>|<file> Look for php.ini file in this directory
-n No php.ini file will be used
-d foo[=bar] Define INI entry foo with value 'bar'
-e Generate extended information for debugger/profiler
-f <file> Parse and execute <file>.
-h This help
-i PHP information
-l Syntax check only (lint)
-m Show compiled in modules
-r <code> Run PHP <code> without using script tags <?..?>
-B <begin_code> Run PHP <begin_code> before processing input lines
-R <code> Run PHP <code> for every input line
-F <file> Parse and execute <file> for every input line
-E <end_code> Run PHP <end_code> after processing all input lines
-H Hide any passed arguments from external tools.
-s Output HTML syntax highlighted source.
-v Version number
-w Output source with stripped comments and whitespace.
-z <file> Load Zend extension <file>.
args... Arguments passed to script. Use -- args when first argument
starts with - or script is read from stdin
--ini Show configuration file names
--rf <name> Show information about function <name>.
--rc <name> Show information about class <name>.
--re <name> Show information about extension <name>.
--ri <name> Show configuration for extension <name>.
php does not have an argument -q. It's also mentioned in go-pear.php (http://pear.php.net/go-pear) itself, but I don't know what it is supposed to tell me. However, try
sudo php go-pear.php
and then follow the instructions.
Update:
-q was used to start the interpreter in "quiet" mode. It seems that this option no longer exists, because php always starts "quiet", but it should not cause an error anyway. Now make sure you are in the same directory as the file go-pear.php before you call php go-pear.php.
The first part shows that you successfully downloaded the file to go-pear.php.
The second part is showing that -q isn't a valid option. The third part is asking for the root password, since you're doing 'sudo'.
I used this, though I wasn't installing on Mac:
Getting and installing the PEAR package manager

Can PowerShell (or script) on Windows / Mac / Ubuntu list file / directory structure easily?

Can PowerShell on Windows by itself or using simple shell script, list files and directory this way: (or using Mac OS X or Ubuntu's shell script)
audio
  mp3
    song1.mp3
    some other song.mp3
  audio books
    7 habits.mp3
video
  samples
    up.mov
    cars.mov
Unix's ls -R or ls -lR can't seem to list it in a tree structure unfortunately.
You can use tree.com to get an indented listing like the one shown above. Note that tree.com only works with the filesystem. If you ever need to display the structure of other providers, such as WSMan or the Registry, you can use the Show-Tree function that comes with the PowerShell Community Extensions.
In Linux, you can use:
ls -R directory | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'
or for the current directory:
ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'
You can put this "small" command in a script: look here
You can use Unix's tree command or, if you are on Windows, the GNU Windows tree.
Windows has a tree command:
C:\folder>tree . /F
Folder PATH listing for volume sys
Volume serial number is F275-CBCA
C:\FOLDER.
│ file01.txt
│
├───Sub folder
│ chart-0001.png
│ chart-0002.png
└───────chart-0004.png
The /F parameter is what tells it to show files. You can execute this from PowerShell.
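This is probably what you're looking for:
ls -R | tree
(tree walks the directory itself and ignores its standard input, so plain tree gives the same result.) tree is not installed by default on Ubuntu. To install it: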
sudo apt-get install tree
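Where tree is not available, an indented listing that includes files can be approximated with find and sed alone. This is a sketch rather than one of the answers above: it replaces every leading path component with two spaces, so depth becomes indentation. The sample directory names are borrowed from the question purely for illustration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Sketch: indented directory listing using only find and sed.
set -eu
root=$(mktemp -d)
mkdir -p "$root/audio/mp3" "$root/video/samples"
touch "$root/audio/mp3/song1.mp3" "$root/video/samples/up.mov"

cd "$root"
# find prints each directory before its contents; sed turns every
# "component/" prefix into two spaces, so depth becomes indentation.
listing=$(find . -print | sed -e 's|[^/]*/|  |g')
printf '%s\n' "$listing"
```

The order of siblings follows whatever order find returns, so it is not necessarily alphabetical; the nesting is still correct because each directory is printed immediately before its contents.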

Performing grep operation in tar files without extracting

I have a list of files which contain particular patterns, but those files have been tarred. Now I want to search for the pattern in the tar file, and to know which files contain the pattern without extracting the files.
Any idea...?
The tar command has a -O switch that extracts files to standard output, so you can pipe that output to grep or awk:
tar xvf test.tar -O | awk '/pattern/{print}'
tar xvf test.tar -O | grep "pattern"
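For example, to print the name of each file in which the pattern is found:
# List the members of the same archive that is searched below.
tar tf test.tar | while read -r FILE
do
# Quote the member name so that names containing spaces still work.
if tar xf test.tar "$FILE" -O | grep "pattern" ;then
echo "found pattern in : $FILE"
fi
done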
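A self-contained variant of the loop above, as a sketch: it builds a small sample archive first (the archive and member names are invented for the demo), skips directory entries, and uses grep -q so that only the names of matching members are collected.

```shell
#!/bin/sh
# Sketch: report which archive members contain a pattern, without
# unpacking anything to disk.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir logs
echo "all good" > logs/a.log
echo "ERROR: disk full" > logs/b.log
tar cf test.tar logs
rm -r logs

matches=$(
  tar tf test.tar | while IFS= read -r member
  do
    case $member in
      */) continue ;;          # directory entry, nothing to search
    esac
    # O sends the member's contents to stdout instead of the filesystem.
    if tar xOf test.tar "$member" | grep -q "ERROR"; then
      echo "$member"
    fi
  done
)
echo "found pattern in: $matches"
```

Each iteration rereads the archive from the start, so this is only reasonable for small archives.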
The command zgrep should do exactly what you want, directly.
for example
zgrep "mypattern" *.gz
http://linux.about.com/library/cmd/blcmdl1_zgrep.htm
GNU tar has --to-command. With it you can have tar pipe each file from the archive into the given command. For the case where you just want the lines that match, that command can be a simple grep. To know the filenames you need to take advantage of tar setting certain variables in the command's environment; for example,
tar xaf thing.tar.xz --to-command="awk -e '/thing.to.match/ {print ENVIRON[\"TAR_FILENAME\"] \":\", \$0}'"
Because I find myself using this often, I have this:
#!/bin/sh
set -eu
if [ $# -lt 2 ]; then
echo "Usage: $(basename "$0") <pattern> <tarfile>"
exit 1
fi
if [ -t 1 ]; then
h="$(tput setf 4)"
m="$(tput setf 5)"
f="$(tput sgr0)"
else
h=""
m=""
f=""
fi
tar xaf "$2" --to-command="awk -e '/$1/{gsub(\"$1\", \"$m&$f\"); print \"$h\" ENVIRON[\"TAR_FILENAME\"] \"$f:\", \$0}'"
This can be done with tar --to-command and grep --label:
tar xaf archive.tar.gz --to-command 'egrep -Hn --label="$TAR_FILENAME" your_pattern_here || true'
--label gives grep the filename
-H tells grep to display the filename, and -n the line number
|| true because otherwise grep will exit with an error if the pattern is not found, and tar will complain about that.
xaf means to extract, and automagically decompress based off the file extension
--to-command has tar pass each file in the tarfile to a separate invocation of grep, and sets various environment variables with info about the file. See the manpage for more info.
Pretty heavily based off of Chipaca's answer (and Daniel H's comment), but this should be a bit easier to use and just uses tar and grep.
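To try the --to-command approach without a real archive at hand, the sketch below creates one first. It assumes GNU tar and GNU grep (--to-command and --label are GNU extensions) and uses made-up file names. grep -E stands in for egrep, which recent GNU grep releases flag as obsolescent.

```shell
#!/bin/sh
# Sketch: let GNU tar feed each member to grep; requires GNU tar and GNU grep.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src
printf 'alpha\nneedle here\n' > src/one.txt
printf 'nothing to see\n' > src/two.txt
tar czf archive.tar.gz src
rm -r src

# tar runs the command through sh once per regular file and exports
# TAR_FILENAME; --label makes grep print that name instead of
# "(standard input)". "|| true" hides grep's non-zero status on no match.
result=$(tar xaf archive.tar.gz \
  --to-command 'grep -EHn --label="$TAR_FILENAME" needle || true')
echo "$result"
```

Nothing is written to disk during the search: tar pipes each member straight into the command.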
Python's tarfile module along with TarFile.extractfile() will allow you to inspect the tarball's contents without extracting it to disk.
The easiest way is probably to use avfs. I've used this before for such tasks.
Basically, the syntax is:
avfsd ~/.avfs # Sets up an avfs virtual filesystem
rgrep pattern ~/.avfs/path/to/file.tar#/
/path/to/file.tar is the path to the actual tar file.
Prepending ~/.avfs/ (the mount point) and appending # lets avfs expose the tar file as a directory.
That's actually very easy with ugrep option -z:
-z, --decompress
Decompress files to search, when compressed. Archives (.cpio,
.pax, .tar, and .zip) and compressed archives (e.g. .taz, .tgz,
.tpz, .tbz, .tbz2, .tb2, .tz2, .tlz, and .txz) are searched and
matching pathnames of files in archives are output in braces. If
-g, -O, -M, or -t is specified, searches files within archives
whose name matches globs, matches file name extensions, matches
file signature magic bytes, or matches file types, respectively.
Supported compression formats: gzip (.gz), compress (.Z), zip,
bzip2 (requires suffix .bz, .bz2, .bzip2, .tbz, .tbz2, .tb2, .tz2),
lzma and xz (requires suffix .lzma, .tlz, .xz, .txz).
For example:
ugrep -z PATTERN archive.tgz
This greps each of the archived files to display PATTERN matches with the archived filenames. Archived filenames are shown in braces to distinguish them from ordinary filenames. Everything else is the same as grep (ugrep has the same options and produces the same output). For example:
$ ugrep -z "Hello" archive.tgz
{Hello.bat}:echo "Hello World!"
Binary file archive.tgz{Hello.class} matches
{Hello.java}:public class Hello // prints a Hello World! greeting
{Hello.java}: { System.out.println("Hello World!");
{Hello.pdf}:(Hello)
{Hello.sh}:echo "Hello World!"
{Hello.txt}:Hello
If you just want the file names, use option -l (--files-with-matches) and customize the filename output with option --format="%z%~" to get rid of the braces:
$ ugrep -z Hello -l --format="%z%~" archive.tgz
Hello.bat
Hello.class
Hello.java
Hello.pdf
Hello.sh
Hello.txt
Tarballs (.tar.gz/.tgz, .tar.bz2/.tbz, .tar.xz/.txz, .tar.lzma/.tlz) are searched as well as .zip archives.
You can mount the TAR archive with ratarmount and then simply search for the pattern in the mounted view:
pip install --user ratarmount
ratarmount large-archive.tar mountpoint
grep -r '<pattern>' mountpoint/
This should be much faster than iterating over each file and printing it to stdout, especially for compressed TARs.
Here is a simple comparison benchmark:
function checkFilesWithRatarmount()
{
local pattern=$1
local archive=$2
ratarmount "$archive" "$archive.mountpoint"
'grep' -r -l "$pattern" "$archive.mountpoint/"
}
function checkEachFileViaStdOut()
{
local pattern=$1
local archive=$2
tar --list --file "$archive" | while read -r file; do
if tar -x --file "$archive" -O -- "$file" | grep -q "$pattern"; then
echo "Found pattern in: $file"
fi
done
}
function createSampleTar()
{
for i in $( seq 40 ); do
head -c $(( 1024 * 1024 )) /dev/urandom | base64 > $i.dat
done
tar -czf "$1" [0-9]*.dat
}
createSampleTar myarchive.tar.gz
time checkEachFileViaStdOut ABCD myarchive.tar.gz
time checkFilesWithRatarmount ABCD myarchive.tar.gz
sleep 0.5s
fusermount -u myarchive.tar.gz.mountpoint
Results in seconds for a 55 MiB uncompressed and 42 MiB compressed TAR archive containing 40 files:
Compression   Ratarmount      Bash Loop over tar -O
none          0.31 +- 0.01    0.55 +- 0.02
gzip          1.1 +- 0.1      13.5 +- 0.1
bzip2         1.2 +- 0.1      97.8 +- 0.2
Of course, these results are highly dependent on the archive size and how many files the archive contains. These test examples are pretty small because I didn't want to wait too long but they already show the problem. The more files there are, the longer it takes for tar -O to jump to the correct file. And for compressed archives, it will be quadratically slower the larger the archive size is because everything before the requested file has to be decompressed and each file is requested separately. Both of these problems are solved by ratarmount.
