I am trying to get the target directory for modules in a multimodule project. The challenge I have is that SBT's logging makes it hard to consume in a script.
Here is what I have at the moment:
function sbt-target {
  sbt -Dsbt.log.noformat=true "project $1" 'show target' |
    tail -n1 |
    cut -c8-
}
I think this is very hackish as it knows about the [INFO] prefix (the cut -c8-) of each output line from SBT and about the fact that SBT's last line is the output I need (the tail -n1).
More problematic is that each invocation of sbt-target takes almost 11 seconds, so invoking it once per module across the large number of modules in this project dominates the total time.
How do I get the target directory in a script?
I can't speak to SBT. In terms of bash best-practices, you might consider something more akin to the following:
sbt_target() {
  # declare locals as such
  local line version
  # iterate through all lines; later lines overwrite the variable set by prior ones
  while read -r line; do
    version=${line#"[INFO] "}  # strip undesired prefix if present
  done < <(sbt -Dsbt.log.noformat=true "project $1" 'show target')
  # emit result to stdout
  printf '%s\n' "$version"
}
Unlike the version relying on tail and cut, this does everything in-process within bash, and is thus more efficient (presuming that sbt's show target emits a relatively small amount of output).
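For example, you might capture the result via command substitution (my-module is a hypothetical module name):
target_dir=$(sbt_target my-module)
echo "$target_dir"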
Related
I am trying to replicate the bash command mv `ls | head -5` ./subfolder1/ in Rust.
This is meant to move the first five files in a directory and it works fine in the shell. I am using the Command process builder but the following code fails at runtime:
Command::new("mv")
.current_dir("newdir")
.args(&["`ls | head -5`", "newdir"])
.env("PATH", "/bin")
.spawn()
Output:
mv: cannot stat '`ls | head -5`': No such file or directory
As with pretty much all such interfaces, Command is a frontend to fork followed by the exec* family, meaning it executes exactly one command: it is not a subshell, and it does not delegate to a shell.
If you want to chain multiple commands you will have to run them individually and wire them by hand, though there exist libraries to provide a shell-style interface (with the danger and inefficiencies that implies).
Can't rightly see why you'd bother here, though; all of this seems reasonably easy to do with std::fs (and possibly a smattering of std::env), e.g.
use std::fs;
use std::path::Path;

let dest = Path::new("./subfolder1"); // destination; note read_dir order is unspecified, unlike sorted ls
for entry in fs::read_dir(".")?.take(5) {
    let entry = entry?;
    fs::rename(entry.path(), dest.join(entry.file_name()))?;
}
I am trying to run the following function
foo () {
  sleep 1
  echo "outside inotify"
  (inotifywait . -e create |
    while read path action file; do
      echo "test"
      sleep 1
    done)
  echo "end"
}
Up to the inotifywait call it runs correctly; I see:
>> foo
outside inotify
Setting up watches.
Watches established.
However as soon as I create a file, I get
>> foo
outside inotify
Setting up watches.
Watches established.
test
foo:6: command not found: sleep
end
Any idea why? Also, do I need to spawn the subshell ( ) around inotifywait? What are the benefits?
Thank you.
Edit
I realized I am running zsh.
The read path is messing you up, because unlike POSIX-compliant shells -- which guarantee that only modifying variables with all-uppercase names can have unwanted side effects on the shell itself -- zsh also has special-cased behavior for several lower-case names, including path.
In particular, zsh presents path as an array corresponding to the values in PATH. Assigning a string to this array will overwrite your PATH as well.
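A minimal fix is to pick a variable name that zsh does not treat specially, e.g. replace read path action file with something like this (watched_dir is an arbitrary name of my choosing):
while read -r watched_dir action file; do  # no clash with zsh's special $path array
  echo "test"
  sleep 1
done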
I have a query regarding the execution of a complex command in the makefile of the current system.
I am currently using the shell command in the makefile to execute the command. However, my command fails, as it is a combination of many commands whose execution collects a huge amount of data. The makefile content is something like this:
variable=$(shell ls -lart | grep name | cut -d/ -f2- )
However the make execution fails with execvp failure, since the file listing is huge and I need to parse all of them.
Please suggest me any ways to overcome this issue. Basically I would like to execute a complex command and assign that output to a makefile variable which I want to use later in the program.
(This may take a few iterations.)
This looks like a limitation of the architecture, not a Make limitation. There are several ways to address it, but you must show us how you use variable, otherwise even if you succeed in constructing it, you might not be able to use it as you intend. Please show us the exact operations you intend to perform on variable.
For now I suggest you do a couple of experiments and tell us the results. First, try the assignment with a short list of files (e.g. three) to verify that the assignment does what you intend. Second, in the directory with many files, try:
variable=$(shell ls -lart | grep name)
to see whether the problem is in grep or cut.
Rather than storing the list of files in a variable, you can often use shell functionality directly to get the same result. It's a bit odd that you're flattening a recursive ls just to get the leaves, and then running mkdir -p, which is really only useful if the parent directory doesn't exist; but if you know which depths you want to cover (for example the current directory and all subdirectories one level down) you can do something like this:
directories:
	for path in ./*name* ./*/*name*; do \
		mkdir "/some/path/$$(basename "$$path")" || exit 1; \
	done
or even
	find . -name '*name*' -exec sh -c 'mkdir "/some/path/$$(basename "$$1")"' _ {} \;
I have a bunch of commands I would like to execute in parallel. The commands are nearly identical. They can be expected to take about the same time, and can run completely independently. They may look like:
command -n 1 > log.1
command -n 2 > log.2
command -n 3 > log.3
...
command -n 4096 > log.4096
I could launch all of them in parallel in a shell script, but then the system would run more processes than are strictly necessary to keep the CPU(s) busy (each task takes 100% of one core until it has finished). This would cause the disk to thrash and make the whole thing slower than a less greedy approach to execution.
The best approach is probably to keep about n tasks executing, where n is the number of available cores.
I am keen not to reinvent the wheel. This problem has already been solved in the Unix make program (when used with the -j n option). I was wondering if perhaps it was possible to write generic Makefile rules for the above, so as to avoid the linear-size Makefile that would look like:
all: log.1 log.2 ...
log.1:
	command -n 1 > log.1
log.2:
	command -n 2 > log.2
...
If the best solution is not to use make but another program/utility, I am open to that as long as the dependencies are reasonable (make was very good in this regard).
Here is more portable shell code that does not depend on brace expansion:
LOGS := $(shell seq 1 1024)
Note the use of := to define a more efficient variable: the simply expanded "flavor".
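For example, the numbers from seq can be turned into target names with addprefix (a minimal sketch, using 4096 to match the question), which then plugs into the all: $(LOGS) and log.% pattern rule shown below:
LOGS := $(addprefix log.,$(shell seq 1 4096))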
See pattern rules
Another way, if this is the only reason you need make, is to use the -n and -P options of xargs.
First the easy part. As Roman Cheplyaka points out, pattern rules are very useful:
LOGS = log.1 log.2 ... log.4096
all: $(LOGS)
log.%:
	command -n $* > log.$*
The tricky part is creating that list, LOGS. Make isn't very good at handling numbers. The best way is probably to call on the shell. (You may have to adjust this script for your shell; shell scripting isn't my strongest subject.)
NUM_LOGS = 4096
LOGS = $(shell for ((i=1 ; i<=$(NUM_LOGS) ; ++i)) ; do echo log.$$i ; done)
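With the list and the pattern rule in place, the parallelism itself comes from make's -j option, e.g. (assuming GNU make and the coreutils nproc command):
make -j"$(nproc)" all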
xargs -P is the "standard" way to do this.
Note that depending on disk I/O you may want to limit to the number of spindles rather than cores.
If you do want to limit to cores, note the new nproc command in recent coreutils.
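A hedged sketch of that approach, with command standing in for the real program as in the question:
seq 1 4096 | xargs -P "$(nproc)" -I{} sh -c 'command -n "$1" > "log.$1"' _ {}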
With GNU Parallel you would write:
parallel command -n {} ">" log.{} ::: {1..4096}
10 second installation:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
Learn more: http://www.gnu.org/software/parallel/parallel_tutorial.html https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Is there any command ... or context where cat file | ... behaves differently than ... <file?
When reading from a regular file, cat is in charge of reading the data, performs the reads as it pleases, and may constrain the data in the way it writes it to the pipeline. Obviously, the contents themselves are preserved, but anything else could be altered. For example: block size and data arrival timing. Additionally, the pipe in itself isn't always neutral: it serves as an additional buffer between the input and ....
Quick and easy way to make the block size issue apparent:
$ cat large-file | pv >/dev/null
5,44GB 0:00:14 [ 393MB/s] [ <=> ]
$ pv <large-file >/dev/null
5,44GB 0:00:03 [1,72GB/s] [=================================>] 100%
Besides what the other answers have pointed out, when using input redirection from a file, standard input is the file itself, but when piping the output of cat to the input, standard input is a stream with the contents of the file. When standard input is the file, the program is able to seek within it, but a pipe does not allow seeking. You can see this by taking a zip file and running the following commands:
zipinfo /dev/stdin < thezipfile.zip
and
cat thezipfile.zip | zipinfo /dev/stdin
The first command will show the contents of the zipfile while the second will show an error, though it is a misleading error because zipinfo does not check the result of the seek call and errors later on.
A useless use of cat is always to be avoided. It's like driving with the handbrake on. It wastes CPU cycles for nothing, the OS constantly context switching between the cat process and the next in the pipe. If all the world's useless cats were gone and stopped being invented, reinvented, passed on from father to son, we wouldn't have global warming because we could easily live with 1.21 Gigawatts of power saved.
Thanks. I feel better now. Please join me in my crusade to stamp out useless use of cat on stackoverflow. This site is, as far as I perceive it, a major contribution to the proliferation of useless cats. I don't blame the newbies, but I do want to teach them. Workers and newbies of the world, loosen the handbrakes and save the planet!!!1!
cat will allow you to pipe multiple files in sequentially. Otherwise, < redirection and cat file | produce the same side effects.
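For example (hypothetical file names):
cat part1.log part2.log | grep ERROR   # both files are concatenated into one stream
grep ERROR < part1.log                 # a single redirection reads only one file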
Pipes cause a subshell to be invoked for the command on the right. This interferes with shell variables set inside that subshell.
cat foo | while read line
do
...
done
echo "$line"
versus
while read line
do
...
done < foo
echo "$line"
One further difference is behavior on a blocking open() of the input file.
For example, assuming input is a FIFO with no writers, one invocation will not spawn any child programs until the input file is opened, while the other will spawn two processes:
prog ... < a_fifo # 'prog' not launched until shell can open file
cat a_fifo | prog ... # 'prog' and 'cat' are running (latter may block on open)
In practice this rarely matters except in contrived circumstances. prog might periodically log or do some cleanup work while waiting for input, for example, which you might want to happen even if no input is available. (Why wouldn't prog be sophisticated enough to open its own input fifo nonblocking?)
cat file | starts up another program (cat) that doesn't have to start when you use redirection. It also makes things more confusing if you want to use here documents. Otherwise, it should behave the same.