GNU Make: delegate target creation to external Makefile

I am using Makefiles in a project to automate workflows. In order to keep them readable, I have one Makefile in each subdirectory. The structure is like the following:
+ a
|- Makefile # with input.txt -> output.txt
|- input.txt
|- output.txt
+ b
|- Makefile # with input.txt -> output.txt
|- input.txt
Now, there is a complication: The rule that converts input.txt to output.txt in b also needs output.txt from a.
I would like to tell my b/Makefile to use a/Makefile to build a/output.txt. Is there any way to do this?
Here is what I tried, and which issues arose:
Adding a rule in b/Makefile that runs make -C ../a output.txt: I can't express this as a pattern rule like %.txt: ../a/%.txt, because that rule then also matches b's own targets and creates a recursive definition mess
Marking a/output.txt as .PHONY in b/Makefile: then the target in b is rebuilt every time (it should be rebuilt only if a actually rebuilds its output)

You can use a pattern rule that is anchored to the other directory, so it cannot match your local targets, and that delegates to the other Makefile.
In a/Makefile:
output.txt: input.txt ../b/output.txt
	cp $< $@
../b/%:
	cd ../b/ && $(MAKE) $*
In b/Makefile:
output.txt: input.txt ../a/output.txt
	cp $< $@
../a/%:
	cd ../a/ && $(MAKE) $*
That being said, why don't you have a Makefile at the top level (.. in this example) that just handles the whole project properly? This would make it a lot more robust, keep you from repeating yourself, and allow parallel execution and other fancy tricks that make provides.
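A minimal sketch of such a top-level Makefile, reusing the names from the question (the cp recipes stand in for whatever conversion you actually run):
# Makefile in the parent directory of a/ and b/
.PHONY: all
all: b/output.txt

a/output.txt: a/input.txt
	cp $< $@

# b's output needs its own input plus a's output
b/output.txt: b/input.txt a/output.txt
	cp $< $@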


How to get non-recursive list of directory contents with their paths in Makefile?

My project's directory structure is as follows:
dir1/
	file2.junk
	dir2/
		file1.junk
dir3/
	file3.junk
My Makefile looks like this:
dir1_contents = $(shell ls dir1/*)
dir3_contents = $(shell ls dir3/*)
all: clean_dir1 clean_dir3
clean_dir1:
	echo 'dir1_contents = $(dir1_contents)'
clean_dir3:
	echo 'dir3_contents = $(dir3_contents)'
When I run make, this is what I get:
$ pwd
make-test
$ make -s
dir1_contents = dir1/file2.junk dir1/dir2: file1.junk
dir3_contents = dir3/file3.junk
I want to get the contents of dir1 in dir1_contents. But I don't want the recursive contents. Just the contents that are immediately below the dir1 directory. How can I do it?
If I remove the /* from the first two lines of the Makefile, I get the contents I want, but then they are missing the paths, which I also need:
$ pwd
make-test
$ make -s
dir1_contents = dir2 file2.junk
dir3_contents = file3.junk
How can I get the contents I want, with the paths that I also need?
The problem you're having is related to the way ls works; it has nothing to do with GNU make. If you run ls dir1/* then the shell expands the wildcard before invoking ls, so this is the same as running:
ls dir1/dir2 dir1/file2.junk
And when you ls a directory it shows the contents of the directory by default, so the output is:
$ ls dir1/dir2 dir1/file2.junk
dir1/file2.junk
dir1/dir2:
file1.junk
(Go ahead and try this at the command prompt.) Since that's what the command prints, that's the result you get back.
If you want the directory entries themselves to be listed, rather than their contents, you can add the -d option to the ls command:
$ ls -d dir1/dir2 dir1/file2.junk
dir1/file2.junk dir1/dir2
Or, in the makefile:
dir1_contents = $(shell ls -d dir1/*)
dir3_contents = $(shell ls -d dir3/*)
However, I think this is not a great way to do it anyway. Why not use GNU make's built-in wildcard function instead? In addition to being simpler to understand it's a LOT more portable and much more efficient (the above version requires invoking a shell and the ls program). I also recommend you use := not =, so that you only perform the wildcard operation one time:
dir1_contents := $(wildcard dir1/*)
dir3_contents := $(wildcard dir3/*)
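As a small illustration of that last point (the variable names here are just for demonstration):
slow = $(shell ls -d dir1/*)    # '=' re-runs the shell command on every expansion
fast := $(wildcard dir1/*)      # ':=' evaluates once, at assignment, and needs no shell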

How to create directories for dist files?

Here's my Makefile:
dist/%.js: src/%.js node_modules
	$(NM)/babel $< -o $@
build: $(patsubst src/%,dist/%,$(wildcard src/**/*.js))
It runs a command like this:
node_modules/.bin/babel src/deep/foo.js -o dist/deep/foo.js
The problem is that if dist/deep doesn't exist, it errors:
Error: ENOENT: no such file or directory, open 'dist/deep/foo.js'
So what I want to do is add an extra dependency on the directory, which I was hoping I could do with something like this:
dist/%.js: src/%.js $(dir dist/%) node_modules
	$(NM)/babel $< -o $@
dist/%/:
	mkdir -p $@
build: $(patsubst src/%,dist/%,$(wildcard src/**/*.js))
But it doesn't work. $(dir dist/%) isn't filling in that % like I hoped. Running make --trace yields:
Makefile:10: update target 'dist/deep/foo.js' due to: src/deep/foo.js dist/ node_modules
i.e., you can see it has a dependency on dist/, but I was hoping it'd depend on dist/deep/ so that I could mkdir -p it.
Is there a way to achieve what I want?
First, a subsidiary snag. Judging from:
$(wildcard src/**/*.js)
it seems you want this function to perform recursive globbing, returning all *.js files that exist in src or any subdirectory thereof.
I don't know what shell you've got, but they don't all do that by default. The Linux bash shell doesn't, though as of bash 4.0 it will do it if the shell option globstar is set.
And anyway, $(wildcard ...) won't do it (unless, possibly, the operative shell does it by default, which I'm not in a position to check out). So you can't dependably use $(wildcard ...) for that purpose. You need make to invoke a shell in which recursive ** globbing is enabled, and then call:
$(shell ls src/**/*.js)
So that's what I'll do now in showing how to solve your problem with a simple example. I've got:
src/
	one.js
	a/
		two.js
		c/
			four.js
	b/
		three.js
and I just want each *.js file copied from beneath src to the same relative name under dist, ensuring that dist and all necessary subdirectories exist when required. (Of course, this could all be done at once with cp.) Here is a makefile:
SHELL := /bin/bash
.SHELLFLAGS := -O globstar -c
SRCS := $(shell ls src/**/*.js)
DISTS := $(patsubst src/%,dist/%,$(SRCS))
DESTDIRS := $(dir $(DISTS))
.PHONY: all clean
all: $(DISTS)
dist/%.js: src/%.js | $(DESTDIRS)
	cp $< $@
$(DESTDIRS):
	mkdir -p $@
clean:
	rm -fr dist
which runs like:
$ make
mkdir -p dist/a/c/
mkdir -p dist/b/
cp src/a/c/four.js dist/a/c/four.js
cp src/a/two.js dist/a/two.js
cp src/b/three.js dist/b/three.js
cp src/one.js dist/one.js
In that makefile,
| $(DESTDIRS)
makes each of the $(DESTDIRS) an order-only prerequisite of any target dist/%.js. An order-only prerequisite is not considered in determining whether its target shall be made, but if it is determined that the target shall be made, then the order-only prerequisite will be made first.
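As an aside, not part of the answer above: if you would rather not compute the directory list at all, a common alternative is to create each target's directory inside the recipe using make's $(@D) automatic variable (the directory part of the target). A minimal sketch:
dist/%.js: src/%.js
	mkdir -p $(@D)
	cp $< $@
The trade-off is one extra mkdir invocation per file, but there is no separate directory rule to maintain.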

Recursive javap and save results to files with the same name

I want to decompile .class files in many directories and then save the output for every file to a file with the same name (with a different extension, of course). I tried to set the classpath, but I get errors that one of the directories wasn't found, which makes no sense, so I think I am doing something wrong (javap -classpath path/to/files/ -c *).
I want to do it using javap; I don't want to use extra libraries, programs, etc. Greets.
This is the solution; jar -tf lists the archive's entries, grep keeps the class files, and sed strips the .class suffix so javap receives class names:
javap -classpath yourjar.jar -c $(jar -tf yourjar.jar | grep class | sed 's/\.class//g')
To save to separate files:
for i in $(jar -tf yourjar.jar | grep class | sed 's/\.class//g') ; do mkdir -p $(dirname $i) ; javap -cp yourjar.jar -c $i > $i.javap ; done
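The question also mentions plain directories of .class files rather than a jar. A hedged sketch for that case, assuming bash (path/to/classes is a placeholder for your classpath root):
for f in $(find path/to/classes -name '*.class'); do
    cls=${f#path/to/classes/}    # strip the classpath root
    cls=${cls%.class}            # drop the extension
    cls=${cls//\//.}             # turn path separators into package dots
    javap -classpath path/to/classes -c "$cls" > "${f%.class}.javap"
done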

Performing grep operation in tar files without extracting

I have list of files which contain particular patterns, but those files have been tarred. Now I want to search for the pattern in the tar file, and to know which files contain the pattern without extracting the files.
Any idea...?
The tar command has a -O switch to extract your files to standard output, so you can pipe that output to grep/awk:
tar xvf test.tar -O | awk '/pattern/{print}'
tar xvf test.tar -O | grep "pattern"
e.g., to return the file name when the pattern is found:
tar tf myarchive.tar | while read -r FILE
do
	if tar xf myarchive.tar "$FILE" -O | grep "pattern" ;then
		echo "found pattern in : $FILE"
	fi
done
The command zgrep should do exactly what you want, directly.
for example
zgrep "mypattern" *.gz
http://linux.about.com/library/cmd/blcmdl1_zgrep.htm
GNU tar has --to-command. With it you can have tar pipe each file from the archive into the given command. For the case where you just want the lines that match, that command can be a simple grep. To know the filenames you need to take advantage of tar setting certain variables in the command's environment; for example,
tar xaf thing.tar.xz --to-command="awk -e '/thing.to.match/ {print ENVIRON[\"TAR_FILENAME\"] \":\", \$0}'"
Because I find myself using this often, I have this:
#!/bin/sh
set -eu
if [ $# -lt 2 ]; then
	echo "Usage: $(basename "$0") <pattern> <tarfile>"
	exit 1
fi
if [ -t 1 ]; then
	h="$(tput setf 4)"
	m="$(tput setf 5)"
	f="$(tput sgr0)"
else
	h=""
	m=""
	f=""
fi
tar xaf "$2" --to-command="awk -e '/$1/{gsub(\"$1\", \"$m&$f\"); print \"$h\" ENVIRON[\"TAR_FILENAME\"] \"$f:\", \$0}'"
This can be done with tar --to-command and grep --label:
tar xaf archive.tar.gz --to-command 'egrep -Hn --label="$TAR_FILENAME" your_pattern_here || true'
--label gives grep the filename
-H tells grep to display the filename, and -n the line number
|| true because otherwise grep will exit with an error if the pattern is not found, and tar will complain about that.
xaf means to extract, and automagically decompress based off the file extension
--to-command has tar pass each file in the tarfile to a separate invocation of grep, and sets various environment variables with info about the file. See the manpage for more info.
Pretty heavily based off of Chipaca's answer (and Daniel H's comment), but this should be a bit easier to use and just uses tar and grep.
Python's tarfile module, along with TarFile.extractfile(), will allow you to inspect the tarball's contents without extracting it to disk.
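A minimal sketch of that approach (the archive name and pattern are placeholders):
import tarfile

pattern = b"pattern"
with tarfile.open("archive.tar.gz", "r:*") as tar:  # "r:*" auto-detects compression
    for member in tar.getmembers():
        if not member.isfile():
            continue
        f = tar.extractfile(member)  # file-like object; nothing is written to disk
        if f is not None and pattern in f.read():
            print("found pattern in:", member.name)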
The easiest way is probably to use avfs. I've used this before for such tasks.
Basically, the syntax is:
avfsd ~/.avfs # Sets up an avfs virtual filesystem
rgrep pattern ~/.avfs/path/to/file.tar#/
/path/to/file.tar is the path to the actual tar file.
Pre-pending ~/.avfs/ (the mount point) and appending # lets avfs expose the tar file as a directory.
That's actually very easy with ugrep option -z:
-z, --decompress
Decompress files to search, when compressed. Archives (.cpio,
.pax, .tar, and .zip) and compressed archives (e.g. .taz, .tgz,
.tpz, .tbz, .tbz2, .tb2, .tz2, .tlz, and .txz) are searched and
matching pathnames of files in archives are output in braces. If
-g, -O, -M, or -t is specified, searches files within archives
whose name matches globs, matches file name extensions, matches
file signature magic bytes, or matches file types, respectively.
Supported compression formats: gzip (.gz), compress (.Z), zip,
bzip2 (requires suffix .bz, .bz2, .bzip2, .tbz, .tbz2, .tb2, .tz2),
lzma and xz (requires suffix .lzma, .tlz, .xz, .txz).
For example:
ugrep -z PATTERN archive.tgz
This greps each of the archived files to display PATTERN matches with the archived filenames. Archived filenames are shown in braces to distinguish them from ordinary filenames. Everything else is the same as grep (ugrep has the same options and produces the same output). For example:
$ ugrep -z "Hello" archive.tgz
{Hello.bat}:echo "Hello World!"
Binary file archive.tgz{Hello.class} matches
{Hello.java}:public class Hello // prints a Hello World! greeting
{Hello.java}: { System.out.println("Hello World!");
{Hello.pdf}:(Hello)
{Hello.sh}:echo "Hello World!"
{Hello.txt}:Hello
If you just want the file names, use option -l (--files-with-matches) and customize the filename output with option --format="%z%~" to get rid of the braces:
$ ugrep -z Hello -l --format="%z%~" archive.tgz
Hello.bat
Hello.class
Hello.java
Hello.pdf
Hello.sh
Hello.txt
Tarballs (.tar.gz/.tgz, .tar.bz2/.tbz, .tar.xz/.txz, .tar.lzma/.tlz) are searched as well as .zip archives.
You can mount the TAR archive with ratarmount and then simply search for the pattern in the mounted view:
pip install --user ratarmount
ratarmount large-archive.tar mountpoint
grep -r '<pattern>' mountpoint/
This should be much faster than iterating over each file and printing it to stdout, especially for compressed TARs.
Here is a simple comparison benchmark:
function checkFilesWithRatarmount()
{
    local pattern=$1
    local archive=$2
    ratarmount "$archive" "$archive.mountpoint"
    'grep' -r -l "$pattern" "$archive.mountpoint/"
}

function checkEachFileViaStdOut()
{
    local pattern=$1
    local archive=$2
    tar --list --file "$archive" | while read -r file; do
        if tar -x --file "$archive" -O -- "$file" | grep -q "$pattern"; then
            echo "Found pattern in: $file"
        fi
    done
}

function createSampleTar()
{
    for i in $( seq 40 ); do
        head -c $(( 1024 * 1024 )) /dev/urandom | base64 > $i.dat
    done
    tar -czf "$1" [0-9]*.dat
}

createSampleTar myarchive.tar.gz
time checkEachFileViaStdOut ABCD myarchive.tar.gz
time checkFilesWithRatarmount ABCD myarchive.tar.gz
sleep 0.5s
fusermount -u myarchive.tar.gz.mountpoint
Results in seconds for a 55 MiB uncompressed and 42 MiB compressed TAR archive containing 40 files:
Compression   Ratarmount     Bash loop over tar -O
none          0.31 +- 0.01   0.55 +- 0.02
gzip          1.1  +- 0.1    13.5 +- 0.1
bzip2         1.2  +- 0.1    97.8 +- 0.2
Of course, these results are highly dependent on the archive size and how many files the archive contains. These test examples are pretty small because I didn't want to wait too long but they already show the problem. The more files there are, the longer it takes for tar -O to jump to the correct file. And for compressed archives, it will be quadratically slower the larger the archive size is because everything before the requested file has to be decompressed and each file is requested separately. Both of these problems are solved by ratarmount.

Unix shell file copy flattening folder structure

On the UNIX bash shell (specifically Mac OS X Leopard) what would be the simplest way to copy every file having a specific extension from a folder hierarchy (including subdirectories) to the same destination folder (without subfolders)?
Obviously there is the problem of having duplicates in the source hierarchy. I wouldn't mind if they are overwritten.
Example: I need to copy every .txt file in the following hierarchy
/foo/a.txt
/foo/x.jpg
/foo/bar/a.txt
/foo/bar/c.jpg
/foo/bar/b.txt
To a folder named 'dest' and get:
/dest/a.txt
/dest/b.txt
In bash:
find /foo -iname '*.txt' -exec cp \{\} /dest/ \;
find will find all the files under the path /foo matching the wildcard *.txt, case insensitively (That's what -iname means). For each file, find will execute cp {} /dest/, with the found file in place of {}.
The only problem with Magnus' solution is that it forks off a new "cp" process for every file, which is not terribly efficient especially if there is a large number of files.
On Linux (or other systems with GNU coreutils) you can do:
find /foo -iname '*.txt' -print0 | xargs -0 cp -t /dest
(The -0 allows it to work when your filenames have weird characters -- like spaces -- in them.)
Unfortunately I think Macs come with BSD-style tools. Anyone know a "standard" equivalent to the "-t" switch?
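One portable workaround, not from the original answers: have find hand batches of file names to a tiny inline shell that puts the destination last, which works with both GNU and BSD cp:
find /foo -iname '*.txt' -exec sh -c 'cp "$@" /dest/' _ {} +
The _ fills $0 of the inline shell, so "$@" expands to just the found files.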
The answers above don't allow for name collisions as the asker didn't mind files being over-written.
I do mind files being over-written, so I came up with a different approach. Replacing each / in the path with - keeps the hierarchy in the names and puts all the files in one flat folder.
We use find to get the list of all files, then awk to create a mv command with the original filename and the modified filename then pass those to bash to be executed.
find ./from -type f | awk '{ str=$0; sub(/\.\//, "", str); gsub(/\//, "-", str); print "mv " $0 " ./to/" str }' | bash
where ./from and ./to are directories to mv from and to.
If you really want to run just one command, why not cons one up and run it? Like so:
$ find /foo -name '*.txt' | xargs echo | sed -e 's/^/cp /' -e 's|$| /dest|' | bash -sx
But that won't matter too much performance-wise unless you do this a lot or have a ton of files. Be careful of name collisions, however. I noticed in testing that GNU cp at least warns of collisions:
cp: will not overwrite just-created `/dest/tubguide.tex' with `./texmf/tex/plain/tugboat/tubguide.tex'
I think the cleanest is:
$ find /foo -name '*.txt' | xargs -i cp {} /dest
Less syntax to remember than the -exec option.
As far as the man page for cp on a FreeBSD box goes, there's no need for a -t switch. cp will assume the last argument on the command line to be the target directory if more than two names are passed.
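For instance, using the question's files:
cp /foo/a.txt /foo/bar/b.txt /dest/
copies both files into /dest/ with no -t switch required.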
