UNIX: How to print contents of variable with formatting

I just want to print ls -l the same way it looks from the command line (each file on a new line). I have looked everywhere for a solution and know my solution should work, but for some reason it doesn't. I have tried:
#!/bin/csh
set list = `ls -l`
echo "$list"
and:
#!/bin/csh
set list = "`ls -l`"
echo "$list"
with no luck. What I really want to do is use grep on ls -l later (so maybe I'm going about this wrong), but I can't because it prints list as one long line.
(and yes, I have to use csh)

I don't know how to get around csh's behaviour of joining words together when you echo, but you may be able to use array-like functionality and a loop. For example:
#!/bin/csh
set list=( "`printf 'a\nb\nc\n'`" )
echo "count=$#list"
echo "2 = $list[2]"
echo
set n=0
while ( $n < $#list )
@ n += 1
echo "$n : $list[$n]"
end
Which for me produces the output:
count=3
2 = b
1 : a
2 : b
3 : c
Note that I'm using tcsh on FreeBSD. Your csh may behave differently; you haven't mentioned your platform.
To bring this back to your list of files question, you can replicate the output you're looking for with a similar loop:
#!/bin/csh
set list=( "`ls -l`" )
set n=0
while ( $n < $#list )
@ n += 1
echo "$list[$n]"
end
The important consideration here is that output from command substitution in bare backquotes (`...`) is split into words on any whitespace, whereas when the backquotes are wrapped in double quotes ("`...`"), the output is split only at newlines, so each line becomes one word.
That said...
What I really want to do is use grep on ls -l later (so maybe I'm going about this wrong), but I can't because it prints list as one long line.
Entirely possible! :-) But without a full understanding of the underlying problem you're trying to solve, helping you achieve your solution is the best we can do. Beware the dreaded XY Problem.
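If the end goal is to grep the listing, it may be simpler to skip the variable and pipe ls -l straight into grep, so each line stays intact. This is only a minimal sketch: the directory and file names are made up for illustration, and the setup lines use POSIX sh syntax (the `ls -l | grep` pipeline itself is written the same way in csh).

```shell
# Hypothetical scratch directory with two sample files.
workdir=$(mktemp -d)
touch "$workdir/notes.txt" "$workdir/data.csv"

# Pipe the listing straight into grep; nothing is stored in a variable,
# so there are no joined lines to work around.
matches=$(ls -l "$workdir" | grep '\.txt$')
printf '%s\n' "$matches"

rm -r "$workdir"
```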

Related

Parsing variable in loop incorrectly [duplicate]

I want to run certain actions on a group of lexicographically named files (01-09 before 10). I have to use a rather old version of FreeBSD (7.3), so I can't use yummies like echo {01..30} or seq -w 1 30.
The only working solution I found is printf "%02d " {1..30}. However, I can't figure out why I can't use $1 and $2 instead of 1 and 30. When I run my script (bash ~/myscript.sh 1 30), printf says {1..30}: invalid number.
AFAIK, variables in bash are typeless, so why can't printf accept an integer argument as an integer?
Bash supports C-style for loops:
s=1
e=30
for ((i=s; i<=e; i++)); do printf "%02d " "$i"; done
The syntax you attempted doesn't work because brace expansion happens before parameter expansion, so when the shell tries to expand {$1..$2}, it's still literally {$1..$2}, not {1..30}.
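The ordering can be seen directly with a small experiment. This sketch assumes bash (other shells, zsh in particular, order these expansions differently), and the variable names are only illustrative.

```shell
#!/bin/bash
s=1
e=3

# Brace expansion runs first and sees the literal text {$s..$e}, which is
# not a valid range, so it is left alone; parameter expansion then fills
# in the numbers, leaving the braces in place.
from_vars=$(echo {$s..$e})

# With literal numbers the range is valid and is expanded.
from_literals=$(echo {1..3})

printf '%s\n' "$from_vars" "$from_literals"
```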
The answer given by @Kent works because eval goes back to the beginning of the parsing process. I tend to suggest avoiding habitual use of it, as eval can introduce hard-to-recognize bugs -- if your command were whitelisted to be run by sudo and $1 were, say, '$(rm -rf /; echo 1)', the C-style-for-loop example would safely fail, and the eval example... not so much.
Granted, 95% of the scripts you write may not be accessible to folks executing privilege escalation attacks, but the remaining 5% can really ruin one's day; following good practices 100% of the time avoids being in sloppy habits.
Thus, if one really wants to pass a range of numbers to a single command, the safe thing is to collect them in an array:
a=( )
for ((i=s; i<=e; i++)); do a+=( "$i" ); done
printf "%02d " "${a[@]}"
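One way to keep hostile values such as the '$(rm -rf /; echo 1)' example away from any loop or eval is to check first that the bounds are plain integers. A minimal sketch follows; is_uint and check_bounds are hypothetical helper names, not part of the answers here.

```shell
# Succeeds only if the argument is a non-empty string of digits.
is_uint() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# $1: start of the range, $2: end of the range
check_bounds() {
    if is_uint "$1" && is_uint "$2"; then
        echo ok
    else
        echo 'bounds must be non-negative integers'
    fi
}

check_bounds 1 30
```

Because the check uses only a case pattern match, the suspicious string is never evaluated.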
I guess you are looking for this trick:
#!/bin/bash
s=1
e=30
printf "%02d " $(eval echo {$s..$e})
Ok, I finally got it!
#!/bin/bash
#BSD-only iteration method
#for day in `jot $1 $2`
for ((day=$1; day<$2; day++))
do
echo $(printf %02d $day)
done
I initially wanted to use the cycle iterator as a "day" in file names, but now I see that in my exact case it's easier to iterate through normal numbers (1,2,3 etc.) and process them into lexicographical ones inside the loop. While using jot, remember that $1 is the numbers amount, and the $2 is the starting point.
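The same loop-and-pad idea can also be written without any bash-specific syntax, which may help on older systems. This is a sketch using only POSIX sh features; pad_range is a hypothetical name, and unlike the loop above it includes the upper bound.

```shell
# Print every integer from $1 to $2 inclusive, zero-padded to two digits.
pad_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '%02d\n' "$i"
        i=$((i + 1))
    done
}

pad_range 8 11
```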

ZSH subString extraction

Goal
In ZSH script, for a given args, I want to obtain the first string and the rest.
For instance, when the script is named test
sh test hello
supposed to extract h and ello.
ZSH manual
http://zsh.sourceforge.net/Doc/zsh_a4.pdf
says:
Subscripting may also be performed on non-array values, in which case the subscripts specify a
substring to be extracted. For example, if FOO is set to ‘foobar’, then ‘echo $FOO[2,5]’ prints
‘ooba’.
Q1
So, I wrote a shell script in a file named test
echo $1
echo $1[1,1]
terminal:
$ sh test hello
hello
hello[1,1]
the result fails. What's wrong with the code?
Q2
Also I don't know how to extract subString from n to the last. Perhaps do I have to use Array split by regex?
EDIT: Q3
This may be another question, so if it's proper to start new Thread, I will do so.
Thanks to @skishore, here is the further code:
#! /bin/zsh
echo $1
ARG_FIRST=`echo $1 | cut -c1`
ARG_REST=`echo $1 | cut -c2-`
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
if $ARG_FIRST = ""; then
echo nullArgs
else
if $ARG_FIRST = "#"; then
echo #Args
else
echo regularArgs
fi
fi
I'm not sure how to compare string variables to a string, but for a given arg hello
result:
command not found: h
What's wrong with the code?
EDIT2:
What I've found right
#! /bin/zsh
echo $1
ARG_FIRST=`echo $1 | cut -c1`
ARG_REST=`echo $1 | cut -c2-`
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
if [ $ARG_FIRST ]; then
if [ $ARG_FIRST = "#" ]; then
echo #Args
else
echo regularArgs
fi
else
echo nullArgs
fi
EDIT3:
As the result of whole, this is what I've done with this question.
https://github.com/kenokabe/GitSnapShot
GitSnapShot is a ZSH thin wrapper for Git commands for easier and simpler usage
A1
As others have said, you need to wrap it in curly braces. Also, use a command interpreter (#!...), mark the file as executable, and call it directly.
#!/bin/zsh
echo $1
echo ${1[1,1]}
A2
The easiest way to extract a substring from a parameter (zsh calls variables parameters) is to use parameter expansion. Using the square brackets tells zsh to treat the scalar (i.e. string) parameter as an array. For a single character, this makes sense. For the rest of the string, you can use the simpler ${parameter:start:length} notation instead. If you omit the :length part (as we will here), then it will give you the rest of the scalar.
File test:
#!/bin/zsh
echo ${1[1]}
echo ${1:1}
Terminal:
$ ./test Hello
H
ello
A3
As others have said, you need (preferably double) square brackets to test. Also, to test if a string is NULL use -z, and to test if it is not NULL use -n. You can just put a string in double brackets ([[ ... ]]), but it is preferable to make your intentions clear with -n.
if [[ -z "${ARG_FIRST}" ]]; then
...
fi
Also remove the space between #! and /bin/zsh.
And if you are checking for equality, use ==; if you are assigning a value, use =.
RE:EDIT2:
Declare all parameters to set the scope. If you do not, you may clobber or use a parameter inherited from the shell, which may cause unexpected behavior. Google's shell style guide is a good resource for stuff like this.
Use builtins over external commands.
Avoid backticks. Use $(...) instead.
Use single quotes when quoting a literal string. This prevents pattern matching.
Make use of elif or case to avoid nested ifs. case will be easier to read in your example here, but elif will probably be better for your actual code.
Using case:
#!/bin/zsh
typeset ARG_FIRST="${1[1]}"
typeset ARG_REST="${1:1}"
echo $1
echo 'ARG_FIRST='"${ARG_FIRST}"
echo 'ARG_REST='"${ARG_REST}"
case "${ARG_FIRST}" in
('') echo 'nullArgs' ;;
('#') echo '#Args' ;;
(*)
# Recommended formatting example with more than 1 sloc
echo 'regularArgs'
;;
esac
using elif:
#!/bin/zsh
typeset ARG_FIRST="${1[1]}"
typeset ARG_REST="${1:1}"
echo $1
echo 'ARG_FIRST='"${ARG_FIRST}"
echo 'ARG_REST='"${ARG_REST}"
if [[ -z "${ARG_FIRST}" ]]; then
echo nullArgs
elif [[ '#' == "${ARG_FIRST}" ]]; then
echo '#Args'
else
echo regularArgs
fi
RE:EDIT3
Use "$@" unless you really know what you are doing.
You can use the cut command:
echo $1 | cut -c1
echo $1 | cut -c2-
Use $() to assign these values to variables:
ARG_FIRST=$(echo $1 | cut -c1)
ARG_REST=$(echo $1 | cut -c2-)
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
You can also replace $() with backticks, but the former is recommended and the latter is somewhat deprecated due to nesting issues.
So, I wrote a shell script in a file named test
$ sh test hello
This isn't a zsh script: you're calling it with sh, which is (almost certainly) bash. If you've got the shebang (#!/bin/zsh), you can make it executable (chmod +x <script>) and run it: ./script. Alternatively, you can run it with zsh <script>.
the result fails. What's wrong with the code?
You can wrap in braces:
echo ${1} # This'll work with or without the braces.
echo ${1[3,5]} # This works in the braces.
echo $1[3,5] # This doesn't work.
Running this: ./test-script hello gives:
./test-script.zsh hello
hello
llo
./test-script.zsh:5: no matches found: hello[3,5]
Also I don't know how to extract subString from n to the last. Perhaps do I have to use Array split by regex?
Use the [n,last] notation, but wrap it in braces. We can determine how long our variable is with ${#1}, then use the length:
# Store the length of $1 in LENGTH.
LENGTH=${#1}
echo ${1[2,${LENGTH}]} # Display from `2` to `LENGTH`.
This'll produce ello (prints from the 2nd to the last character of hello).
Script to play with:
#!/usr/local/bin/zsh
echo ${1} # Print the input
echo ${1[3,5]} # Print from 3rd->5th characters of input
LENGTH=${#1}
echo ${1[2,${LENGTH}]} # Print from 2nd -> last characters of input.
You can use the cut command:
But that would be using extra baggage - zsh is quite capable of doing all this on its own without spawning multiple sub-shells for simplistic operations.
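If the script may end up being run by sh rather than zsh (as in the `sh test hello` call from the question), the split can also be done with POSIX parameter expansion alone, with no subscripts and no cut. This is a sketch under that assumption; first_and_rest is a hypothetical name.

```shell
# Split $1 into its first character and the remainder.
first_and_rest() {
    rest=${1#?}           # strip one leading character
    first=${1%"$rest"}    # strip the remainder, leaving the first character
    printf '%s %s\n' "$first" "$rest"
}

first_and_rest hello
```

Quoting "$rest" inside the second expansion keeps any pattern characters in the argument from being treated as wildcards.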

How to read 1 symbol in zsh?

I need to get exactly one character from console and not print it.
I've tried to use read -en 1 as I did using bash. But this doesn't work at all.
And vared doesn't seem to have such option.
How to read 1 symbol in zsh? (I'm using zsh v.4.3.11 and v.5.0.2)
read -sk
From the documentation:
-s
Don’t echo back characters if reading from the terminal. Currently does not work with the -q option.
-k [ num ]
Read only one (or num) characters. All are assigned to the first name, without word splitting. This flag is ignored when -q is present. Input is read from the terminal unless one of -u or -p is present. This option may also be used within zle widgets.
Note that despite the mnemonic ‘key’ this option does read full characters, which may consist of multiple bytes if the option MULTIBYTE is set.
If you want your script to be a bit more portable you can do something like this:
y=$(bash -c "read -n 1 c; echo \$c")
read reads from the terminal by default:
% date | read -sk1 "?Enter one char: "; echo $REPLY
Enter one char: X
Note above:
The output of date is discarded
The X is printed by the echo, not when the user enters it.
To read from a pipeline, use file descriptor 0:
% echo foobar | read -rk1 -u0; echo $REPLY
f
% echo $ZSH_VERSION
5.5.1
Try something like
read line
c=`echo $line | cut -c1`
echo $c
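Where neither zsh's read -k nor bash's read -n is available, a single byte can be taken from standard input with dd. This is a limited sketch: it reads one byte rather than one full multibyte character, it does nothing about terminal echo, and read_one_byte is a hypothetical name.

```shell
# Take exactly one byte from standard input.
read_one_byte() {
    dd bs=1 count=1 2>/dev/null
}

first=$(printf 'foobar' | read_one_byte)
printf '%s\n' "$first"
```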

grep -f maximum number of patterns?

I'd like to use grep on a text file with -f to match a long list (10,000) of patterns. Turns out that grep doesn't like this (who knew?). After a day, it didn't produce anything. Smaller lists work almost instantaneously.
I was thinking I might split my long list up and do it a few times. Any idea what a good maximum length for the pattern list might be?
Also, I'm rather new with unix. Alternative approaches are welcome. The list of patterns, or search terms, are in a plaintext file, one per line.
Thank you everyone for your guidance.
From comments, it appears that the patterns you are matching are fixed strings. If that is the case, you should definitely use -F. That will increase the speed of the matching considerably. (Using 479,000 strings to match on an input file with 3 lines using -F takes under 1.5 seconds on a moderately powered machine. Not using -F, that same machine is not yet finished after several minutes.)
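To make the difference concrete, here is a small sketch that runs the same pattern file with and without -F. The file names and sample lines are invented for illustration; only the -F and -f options themselves come from the answer above.

```shell
# Hypothetical pattern and text files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a.c' 'x+y' > "$workdir/patterns.txt"
printf '%s\n' 'abc' 'a.c' 'x+y=z' 'xxy' > "$workdir/text.txt"

# -F treats every line of the pattern file as a literal string.
fixed_matches=$(grep -F -f "$workdir/patterns.txt" "$workdir/text.txt")

# Without -F the same lines are regular expressions, so the "." in "a.c"
# matches any character and the line "abc" is selected as well.
regex_matches=$(grep -f "$workdir/patterns.txt" "$workdir/text.txt")

printf 'fixed:\n%s\nregex:\n%s\n' "$fixed_matches" "$regex_matches"
rm -r "$workdir"
```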
I had the same problem with approx. 4 million patterns to search for in a file with 9 million lines. It seems to be a RAM problem, so I came up with this neat little workaround, which might be slower than splitting and joining but needs just this one line:
while read -r line; do grep -- "$line" fileToSearchIn; done < patternFile
I needed the workaround since the -F flag is no solution for files that large...
EDIT: This seems to be really slow for large files. After some more research I found 'faSomeRecords' and other really awesome tools from Kent NGS-editing-Tools.
I tried it myself by extracting 2 million FASTA records from a file of 5.5 million records. It took approx. 30 sec.
Cheers
Here is a bash script you can run on your files (or if you would like, a subset of your files). It will split the key file into increasingly large blocks, and for each block attempt the grep operation. The operations are timed - right now I'm timing each grep operation, as well as the total time to process all the sub-expressions.
Output is in seconds - with some effort you can get ms, but with the problem you are having it's unlikely you need that granularity.
Run the script in a terminal window with a command of the form
./timeScript keyFile textFile 100 > outputFile
This will run the script, using keyFile as the file where the search keys are stored, and textFile as the file where you are looking for keys, and 100 as the initial block size. On each loop the block size will be doubled.
In a second terminal, run the command
tail -f outputFile
which will keep track of the output of your other process into the file outputFile
I recommend that you open a third terminal window, and that you run top in that window. You will be able to see how much memory and CPU your process is taking - again, if you see vast amounts of memory consumed it will give you a hint that things are not going well.
This should allow you to find out when things start to slow down - which is the answer to your question. I don't think there's a "magic number" - it probably depends on your machine, and in particular on the file size and the amount of memory you have.
You could take the output of the script and put it through a grep:
grep entire outputFile
You will end up with just the summaries - block size, and time taken, e.g.
Time for processing entire file with blocksize 800: 4 seconds
If you plot these numbers against each other (or simply inspect the numbers), you will see when the algorithm is optimal, and when it slows down.
Here is the code: I did not do extensive error checking but it seemed to work for me. Obviously in your ultimate solution you need to do something with the outputs of grep (instead of piping it to wc -l which I did just to see how many lines were matched)...
#!/bin/bash
# script to look at difference in timing
# when grepping a file with a large number of expressions
# assume first argument = name of file with list of expressions
# second argument = name of file to check
# optional third argument = initial block size (default 100)
#
# split f1 into chunks of 1, 2, 4, 8... expressions at a time
# and print out how long it took to process all the lines in f2
if (($# < 2 )); then
echo Warning: need at least two parameters.
echo Usage: timeScript keyFile searchFile [initial blocksize]
exit 0
fi
f1_linecount=`cat $1 | wc -l`
echo linecount of file1 is $f1_linecount
f2_linecount=`cat $2 | wc -l`
echo linecount of file2 is $f2_linecount
echo
if (($# < 3 )); then
blockLength=100
else
blockLength=$3
fi
while (($blockLength < f1_linecount))
do
echo Using blocks of $blockLength
#split is a built in command that splits the file
# -l tells it to break after $blockLength lines
# and the block$blockLength parameter is a prefix for the file
split -l $blockLength $1 block$blockLength
Tstart="$(date +%s)"
Tbefore=$Tstart
for fn in block*
do
echo "grep -f $fn $2 | wc -l"
echo number of lines matched: `grep -f $fn $2 | wc -l`
Tnow="$(($(date +%s)))"
echo Time taken: $(($Tnow - $Tbefore)) s
Tbefore=$Tnow
done
echo Time for processing entire file with blocksize $blockLength: $(($Tnow - $Tstart)) seconds
blockLength=$((2*$blockLength))
# remove the split files - no longer needed
rm block*
echo block length is now $blockLength and f1 linecount is $f1_linecount
done
exit 0
You could certainly give sed a try to see whether you get a better result, but it is a lot of work to do either way on a file of any size. You didn't provide any details on your problem, but if you have 10k patterns I would be trying to think about whether there is some way to generalize them into a smaller number of regular expressions.
Here is a perl script "match_many.pl" which addresses a very common subset of the "large number of keys vs. large number of records" problem. Keys are accepted one per line from stdin. The two command line parameters are the name of the file to search and the field (white space delimited) which must match a key. This subset of the original problem can be solved quickly since the location of the match (if any) in the record is known ahead of time and the key always corresponds to an entire field in the record. In one typical case it searched 9400265 records with 42899 keys, matching 42401 of the keys and emitting 1831944 records in 41s. The more general case, where the key may appear as a substring in any part of a record, is a more difficult problem that this script does not address. (If keys never include white space and always correspond to an entire word the script could be modified to handle that case by iterating over all fields per record, instead of just testing the one, at the cost of running M times slower, where M is the average field number where the matches are found.)
#!/usr/bin/perl -w
use strict;
use warnings;
my $kcount;
my ($infile,$test_field) = @ARGV;
if(!defined($infile) || "$infile" eq "" || !defined($test_field) || ($test_field <= 0)){
die "syntax: match_many.pl infile field"
}
my %keys; # hash of keys
$test_field--; # external range (1,N) to internal range (0,N-1)
$kcount=0;
while(<STDIN>) {
my $line = $_;
chomp($line);
$keys {$line} = 1;
$kcount++
}
print STDERR "keys read: $kcount\n";
my $records = 0;
my $emitted = 0;
open(INFILE, $infile ) or die "Could not open $infile";
while(<INFILE>) {
if(substr($_,0,1) =~ /#/){ #skip comment lines
next;
}
my $line = $_;
chomp($line);
$line =~ s/^\s+//;
my @fields = split(/\s+/, $line);
if(exists($keys{$fields[$test_field]})){
print STDOUT "$line\n";
$emitted++;
$keys{$fields[$test_field]}++;
}
$records++;
}
$kcount=0;
while( my( $key, $value ) = each %keys ){
if($value > 1){
$kcount++;
}
}
close(INFILE);
print STDERR "records read: $records, emitted: $emitted; keys matched: $kcount\n";
exit;
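The same hash-lookup idea can be sketched as an awk one-liner wrapped in a shell function. This is not a replacement for the Perl script (it does no comment skipping or statistics), match_field is a hypothetical name, and it takes the keys from a file rather than stdin. It also assumes the key file is not empty, since the NR == FNR test would otherwise treat the data file as keys.

```shell
# $1: key file (one key per line), $2: data file, $3: 1-based field number
match_field() {
    awk -v f="$3" 'NR == FNR { keys[$0] = 1; next } ($f in keys)' "$1" "$2"
}

# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'k2' 'k9' > "$workdir/keys.txt"
printf '%s\n' 'a k1 x' 'b k2 y' 'c k9 z' 'd k2k w' > "$workdir/records.txt"

matched=$(match_field "$workdir/keys.txt" "$workdir/records.txt" 2)
printf '%s\n' "$matched"
rm -r "$workdir"
```

As in the Perl version, a record is emitted only when the whole field equals a key, so the record with k2k in field 2 is not selected.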

UNIX command line argument referencing issues

I'm trying to tell unix to print out the command line arguments passed to a Bourne Shell script, but it's not working. I get the value of x at the echo statement, and not the command line argument at the desired location.
This is what I want:
./run a b c d
a
b
c
d
this is what I get:
1
2
3
4
What's going on? I know that UNIX is confused about what I'm referencing in the shell script (the variable x or the command line argument at the x'th position). How can I clarify what I mean?
#!/bin/sh
x=1
until [ $x -gt $# ]
do
echo $x
x=`expr $x + 1`
done
EDIT: Thank you all for the responses, but now I have another question; what if you wanted to start counting not at the first argument, but at the second, or third? So, what would I do to tell UNIX to process elements starting at the second position, and ignore the first?
echo $*
$x is not the xth argument. It's the variable x, and expr $x+1 is like x++ in other languages.
The simplest change to your script to make it do what you asked is this:
#!/bin/sh
x=1
until [ $x -gt $# ]
do
eval "echo \${$x}"
x=`expr $x + 1`
done
HOWEVER (and this is a big however), using eval (especially on user input) is a huge security problem. A better way is to use shift and the first positional argument variable like this:
#!/bin/sh
while [ $# -gt 0 ]; do
x=$1
shift
echo ${x}
done
If you want to start counting at the 2nd argument (bash only):
for i in "${@:2}"
do
echo $i
done
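Under a plain Bourne or POSIX sh, where bash's positional-parameter slicing is not available, shift gives the same effect. A minimal sketch; print_from_second is a hypothetical name.

```shell
# Print every argument except the first, one per line.
print_from_second() {
    # Guard the shift so calling with no arguments is not an error.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

print_from_second a b c d
```

To start at the third argument instead, shift twice (or use `shift 2` after checking that at least two arguments are present).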
A solution not using shift:
#!/bin/sh
for arg in "$@"; do
printf "%s " "$arg"
done
echo
