Is it possible to set variables equal to expressions in UNIX? - unix

In Unix, how would one do this?
#!/bin/sh
x=echo "Hello" | grep '^[A-Z]'
I want x to take the value "Hello", but this script does not seem to work. What would be the proper way of spelling something like the above out?

You can use command substitution as:
x=$(echo "Hello" | grep '^[A-Z]')
You could also use the outdated back-quote style as:
x=`echo "Hello" | grep '^[A-Z]'`
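As a small sketch of how this behaves in practice (the variable names status, y and y_status are only for illustration): the substitution captures the pipeline's output with trailing newlines removed, and $? right after the assignment holds the exit status of the last command in the pipeline, here grep.

```shell
# Capture the output of a pipeline in a variable.
x=$(echo "Hello" | grep '^[A-Z]')
status=$?        # exit status of grep: 0 means a line matched

# When nothing matches, the variable is empty and the status is non-zero.
y=$(echo "123" | grep '^[A-Z]')
y_status=$?

echo "x='$x' (status $status)"
echo "y='$y' (status $y_status)"
```

Checking the status lets a script tell "matched an empty line" apart from "no match at all".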

You can also use shell built-ins without calling external tools, e.g. case/esac:
str="Hello"
case "$str" in
[A-Z]* ) x=$str;;
esac
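If the test is needed more than once, the case statement can be wrapped in a function. This is only a sketch: starts_upper is a made-up name, and what [A-Z] matches in a shell pattern can depend on the locale.

```shell
# starts_upper STRING: succeed if STRING begins with a character in [A-Z].
starts_upper() {
    case "$1" in
        [A-Z]*) return 0 ;;
        *)      return 1 ;;
    esac
}

str="Hello"
if starts_upper "$str"; then
    x=$str
else
    x=""
fi
echo "x='$x'"
```

No external process is started, which matters when this runs inside a loop.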

Make sure your grep supports the regex syntax you expect; grep has many variants across Unix systems.

Related

how can I highlight just one item from the ls output

I'm a real beginner with Unix commands, so I'm not sure if the following is actually possible, but here goes.
Is it possible to highlight just one item in a ls output?
I.e.: in a directory I use the following
ls -l --color=auto
this lists 4 items in green
file1.xls
file2.xls
file3.xls
file4.xls
But I want to highlight a specific item, in this case file2.
Is this possible?
The ls program will not do this for you. But you could filter the results from ls through a custom script which modifies the text to highlight just one item. It would be simpler if no color was originally given; then you could match on the given filename (for example as the pattern in an awk script, or in a sed script) and modify just that one item, adding colors.
That is, certainly it is possible. Writing a sample script is a different question.
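For illustration only, such a filter could look roughly like the sketch below. The function name highlight is made up, the pattern is passed to sed as a basic regular expression and must not contain a slash, and the escape codes assume an ANSI-capable terminal.

```shell
# highlight PATTERN: read lines on stdin and wrap the first match of PATTERN
# on each line in ANSI "bold red" escape codes.
highlight() {
    esc=$(printf '\033')
    sed "s/$1/${esc}[1;31m&${esc}[0m/"
}

# Demonstration on a fixed list; with real files this would be:
#   ls -l | highlight file2
printf 'file1.xls\nfile2.xls\nfile3.xls\n' | highlight 'file2'
```

Only the matched text is colored; every other line passes through unchanged.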
How you approach the problem depends on what you want from the output. If that is (literally) the output from ls with a single filename in color, then a script would be the normal approach. You could use grep as suggested in the other answer, which raises a few issues:
Mentioning ls -l --color=auto makes it sound as if you are using GNU ls, and hence probably Linux. An appropriate tag for the question would be linux rather than unix. If you ask for unix, the answers should differ.
Suppose that you are using Linux. Then you likely have GNU grep, which can do colors. That would let you do something like this:
ls -l | grep --color=always file2 | less -R
However, there is a known bug in GNU grep's use of color (see xterm FAQ "grep --color" does not show the right output).
Using grep like this shows only the matching lines. For ls that might be a good choice. For matches in a manual page -- definitely not.
Alternatively, less (which is found more often on Unix systems than GNU grep) also can highlight matches (not in color) and would show the file you are looking for in context. You could do this:
ls -l | less -p file2
(Both grep and less use patterns aka regular expressions, but I left the example simple — read the documentation to learn more).
If you're a beginner I would strongly suggest you learn the grep command if you want to filter results. It is a Unix user's best friend (mine anyway).
Use grep to only display the list items you want to see...
ls -l | grep "file2"
NOTE: This is no different to typing ls -l file2 by the way but your pattern could be expanded based on what you actually want displayed on the screen.
So if you had a directory full of files ".txt", ".xls", ".doc" and you wanted to only see ".txt" files with the word "work" in the name (work1.txt) you could write:
ls -ls | grep "work" | grep "txt"
This would list work1.txt, work2.txt, work3.txt and so on.
This is a very basic example but I use grep extensively whilst in the unix shell and would advise using this to filter all results instead of colours.
A little side note: grep -v will show you everything but the pattern you give it.
ls -l | grep -v ".txt" will show everything BUT .txt files.
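One detail worth knowing when filtering like this: in a regular expression an unescaped . matches any character and the pattern may match anywhere in the line, so "txt" or ".txt" would also hit a name such as mytxtfile.doc. The sketch below uses a made-up file list and anchors the extension with \.txt$ instead.

```shell
# A made-up list of file names, one per line.
files='work1.txt
work2.doc
notes.txt
work3.txt
mytxtfile.doc'

# Names containing "work" that end in .txt
work_txt=$(printf '%s\n' "$files" | grep 'work' | grep '\.txt$')

# Everything that does NOT end in .txt
not_txt=$(printf '%s\n' "$files" | grep -v '\.txt$')

echo "$work_txt"
echo "$not_txt"
```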

Another grep advanced

Q1. I want to grep something like that:
grep -Ir --exclude-dir="some*dirs" "my-text" ~/somewhere
but I don't want to show the whole strings containing "my-text", I want to see only list of files.
Q2. I want to see list of files containing "my-text" but not containing "another-text". How to do that?
Sorry, but I could not find the answer in man grep, neither in google.
Q1. You mustn't have googled very hard on that one.
man grep
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match.
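Q2. Unless you expect both patterns to be on the same line, you'll need multiple invocations of grep. Use -L (--files-without-match) for the second one, because -v only inverts the match line by line. Something like:
$ grep -Irl "my-text" ~/somewhere | xargs grep -L "another-text"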
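To try the two options on something concrete, here is a sketch that builds a throwaway directory with made-up file names. It assumes a grep that supports -r and -L (GNU and BSD grep do; plain POSIX grep does not) and file names without whitespace, since xargs splits on it.

```shell
# Build a small throwaway tree to try the commands on.
dir=$(mktemp -d)
printf 'my-text only\n'          > "$dir/a.txt"
printf 'my-text\nanother-text\n' > "$dir/b.txt"
printf 'unrelated\n'             > "$dir/c.txt"

# Q1: names of the files that contain "my-text".
with_my=$(grep -rl 'my-text' "$dir" | sort)

# Q2: of those, keep the files in which "another-text" never appears.
only_my=$(grep -rl 'my-text' "$dir" | xargs grep -L 'another-text')

echo "$with_my"
echo "$only_my"
rm -r "$dir"
```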

List only certain files in a directory matching the word BOZO and ending with either '123' or '456'

I'm trying to figure out how to get a list of file names for a file named BOZO but ending with ONLY 123 OR 456.
Files are:
BOZO12389,
BOZOand3
BOZOand456
BOZOand5
BOZOhello123
So the command should only display 'BOZOhello123' and 'BOZOand456'
I can't figure it out. I've tried all forms of LS and GREP that I can think of. The funny thing is, we tried to do it in class for about 10mins and no one could get it (including the instructor).
I did the following and it worked
ls BOZO*456 BOZO*123
Using shell's globs:
ls BOZO*{123,456}
Use regular expressions to help you. The command egrep should help, because it will allow you to use regular expressions.
You're searching for files of the kind BOZO456 and BOZO123
A period . is a wildcard that stands for any single character. The * lets you repeat it 0 or more times. Placing 123 and 456 in round brackets, separated by |, gives you an OR, and a trailing $ anchors the match to the end of the name (otherwise BOZO12389 would match too).
Thus, you want any character repeated 0 or more times, followed by 123 or 456 at the end of the line.
Example:
egrep 'BOZO.*(456|123)$' data
Thank you to Nathan Fellman for the help and edits.
You could also use find command :
find . \( -name "BOZO*123" -o -name "BOZO*456" \)
$ ll | grep BOZO lists the BOZO files too (ll is commonly an alias for ls -l), but it does not filter on the 123 or 456 ending.
This shouldn't be that hard. The most naive way is to ls the directory and then grep for only what you want:
$ ls *BOZO* | grep -e '123$' -e '456$'
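The approaches above can be checked against the exact file names from the question. This sketch creates them in a temporary directory and anchors the pattern at both ends, so BOZO12389 is rejected.

```shell
dir=$(mktemp -d)
touch "$dir/BOZO12389" "$dir/BOZOand3" "$dir/BOZOand456" \
      "$dir/BOZOand5" "$dir/BOZOhello123"

# The name must start with BOZO and end with 123 or 456.
matches=$(ls "$dir" | grep -E '^BOZO.*(123|456)$')
echo "$matches"
rm -r "$dir"
```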

grep a tab in UNIX

How do I grep tab (\t) in files on the Unix platform?
If using GNU grep, you can use the Perl-style regexp:
grep -P '\t' *
The trick is to use $ sign before single quotes. It also works for cut and other tools.
grep $'\t' sample.txt
I never managed to make the '\t' metacharacter work with grep.
However I found two alternate solutions:
Using <Ctrl-V> <TAB> (hitting Ctrl-V then typing tab)
Using awk: foo | awk '/\t/'
From this answer on Ask Ubuntu:
Tell grep to use the regular expressions as defined by Perl (Perl has
\t as tab):
grep -P "\t" <file name>
Use the literal tab character:
grep "^V<tab>" <filename>
Use printf to print a tab character for you:
grep "$(printf '\t')" <filename>
One way is (this is with Bash)
grep -P '\t'
-P turns on Perl regular expressions so \t will work.
As user unwind says, it may be specific to GNU grep. The alternative is to literally insert a tab in there if the shell, editor or terminal will allow it.
Another way of inserting the tab literally inside the expression is using the lesser-known $'\t' quotation in Bash:
grep $'foo\tbar' # matches eg. 'foo<tab>bar'
(Note that if you're matching for fixed strings you can use this with -F mode.)
Sometimes using variables can make the notation a bit more readable and manageable:
tab=$'\t' # `tab=$(printf '\t')` in POSIX
id='[[:digit:]]\+'
name='[[:alpha:]_][[:alnum:]_-]*'
grep "$name$tab$id" # matches eg. `bob2<tab>323`
There are basically two ways to address it:
(Recommended) Use regular expression syntax supported by grep(1). Modern grep(1) supports two forms of POSIX 1003.2 regex syntax: basic (obsolete) REs, and modern REs. The syntax is described in detail in the re_format(7) and regex(7) man pages, which are part of BSD and Linux systems respectively. GNU grep(1) also supports Perl-compatible REs as provided by the pcre(3) library.
In regex language the tab symbol is usually encoded by the \t atom. The atom is supported by BSD extended regular expressions (egrep, grep -E on BSD-compatible systems), as well as Perl-compatible REs (pcregrep, GNU grep -P).
Both basic regular expressions and Linux extended REs apparently have no support for \t. Please consult each utility's man page to learn which regex language it supports (hence the difference between sed(1), awk(1), and pcregrep(1) regular expressions).
Therefore, on Linux:
$ grep -P '\t' FILE ...
On BSD-like systems:
$ egrep '\t' FILE ...
$ grep -E '\t' FILE ...
Pass the tab character into pattern. This is straightforward when you edit a script file:
# no tabs for Python please! (the quotes contain a literal tab character)
grep -q '	' *.py && exit 1
However, when working in an interactive shell you may need to rely on shell and terminal capabilities to type the proper symbol into the line. On most terminals this can be done through Ctrl+V key combination which instructs terminal to treat the next input character literally (the V is for "verbatim"):
$ grep '<Ctrl>+<V><TAB>' FILE ...
Some shells may offer advanced support for command typesetting. For example, in bash(1) words of the form $'string' are treated specially:
bash$ grep $'\t' FILE ...
Please note, though, that while this is nice on a command line, it may cause compatibility issues when the script is moved to another platform. Also, be careful with quotes when using the specials; please consult bash(1) for details.
For the Bourne shell (and others) the same behaviour may be emulated using command substitution augmented by printf(1) to construct the proper regex:
$ grep "`printf '\t'`" FILE ...
Use echo to insert the tab for you:
grep "$(echo -e \\t)"
grep "$(printf '\t')" worked for me on Mac OS X
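A sketch of the printf approach in a script, using only features that plain sh provides. The variable names and the sample input are made up for the demonstration.

```shell
# Build a literal tab once; printf is specified by POSIX, unlike $'\t' or echo -e.
tab=$(printf '\t')

# Sample input: only the first line has a tab between foo and bar.
sample=$(printf 'foo\tbar\nfoo bar\nfoobar')

matched=$(printf '%s\n' "$sample" | grep "foo${tab}bar")
count=$(printf '%s\n' "$sample" | grep -c "$tab")
echo "lines containing a tab: $count"
```

Command substitution strips only trailing newlines, so the tab survives the assignment.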
A good choice is to use sed (the \t escape is understood by GNU sed, but not necessarily by other sed implementations).
sed -n '/\t/p' file
Examples (works in bash, sh, ksh, csh,..):
[~]$ cat testfile
12 3
1 4 abc
xa c
a c\2
1 23
[~]$ sed -n '/\t/p' testfile
xa c
a c\2
[~]$ sed -n '/\ta\t/p' testfile
a c\2
(This answer has been edited following suggestions in comments. Thank you all)
Use gawk: set the field delimiter to tab (\t) and check the number of fields. If there is more than one field, the line contains at least one tab.
awk -F"\t" 'NF>1' file
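Because the number of fields is one more than the number of separators, the same idea can also report how many tabs each line has. A small sketch on made-up input:

```shell
# Print "line number: tab count" for every line that contains a tab.
report=$(printf 'a\tb\nplain\nc\td\te\n' |
    awk -F'\t' 'NF > 1 { print NR ": " (NF - 1) }')
echo "$report"
```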
One more way, which works in ksh, dash, etc.: use printf to insert the TAB:
grep "$(printf 'BEGIN\tEND')" testfile.txt
On ksh I used the following (^I is how a literal tab typed inside the brackets is displayed):
grep "[^I]" testfile
The answer is simpler: write your grep and type the tab key within the quotes. It works well, at least in ksh.
grep "	" *
Using the 'sed-as-grep' method, but replacing the tabs with a visible character of personal preference is my favourite method, as it clearly shows both which files contain the requested info, and also where it is placed within lines:
sed -n 's/\t/\*\*\*\*/gp' file_name
If you wish to make use of line/file info, or other grep options, but also want to see the visible replacement for the tab character, you can achieve this by
grep -[options] -P '\t' file_name | sed 's/\t/\*\*\*\*/g'
As an example:
$ printf 'A\tB\nfoo\tbar\n' > test
$ grep -inH -P '\t' test | sed 's/\t/\*\*\*\*/g'
test:1:A****B
test:2:foo****bar
EDIT: Obviously the above is only useful for viewing file contents to locate tabs --- if the objective is to handle tabs as part of a larger scripting session, this doesn't serve any useful purpose.
This works well for AIX. I am searching for lines containing JOINED<\t>ACTIVE
voradmin cluster status | grep JOINED$'\t'ACTIVE
vorudb201 1 MEMBER(g) JOINED ACTIVE
*vorucaf01 2 SECONDARY JOINED ACTIVE
You might want to use grep "$(echo -e '\t')"
The only requirement is that echo can interpret backslash escapes.
These alternative methods for identifying binary characters are totally functional, and I really like the ones using awk, as I couldn't quite remember the syntax for single binary chars. However, it should also be possible to assign a value to a shell variable in a POSIX-portable fashion (i.e. TAB=$(echo "#" | tr "\043" "\011")) and then use it everywhere, also in a POSIX-portable fashion (i.e. grep "$TAB" filename). While this solution works well with TAB, it will also work with other binary chars when another octal value is given to tr in the assignment instead of the value for the TAB character.
The $'\t' notation given in other answers is shell-specific -- it seems to work in bash and zsh but is not universal.
NOTE: The following is for the fish shell and does not work in bash:
In the fish shell, one can use an unquoted \t, for example:
grep \t foo.txt
Or one can use the hex or unicode notations e.g.:
grep \X09 foo.txt
grep \U0009 foo.txt
(these notations are useful for more esoteric characters)
Since these values must be unquoted, one can combine quoted and unquoted values by concatenation:
grep "foo"\t"bar"
You can also use a Perl one-liner instead of grep or grep -P:
perl -ne 'print if /\t/' FILENAME
You can type
grep \t foo
or
grep '\t' foo
to search for the tab character in the file foo. You can probably also do other escape codes, though I've only tested \n. Although it's rather time-consuming, and it is unclear why you would want to, in zsh you can also type the tab character first, move back to the beginning of the line, type grep, and enclose the tab in quotes.
Look for blank spaces many times [[:space:]]*
grep [[:space:]]*'.''.'
Will find something like this:
'the tab' ..
These are single quotations ('), and not double ("). This is how you make concatenation in grep. =-)

How to do a mass rename?

I need to rename files names like this
transform.php?dappName=Test&transformer=YAML&v_id=XXXXX
to just this
XXXXX.txt
How can I do it?
I understand that I need more than one mv command because there are at least 25000 files.
Easiest solution is to use "mmv"
You can write:
mmv "long_name*.txt" "short_#1.txt"
Where the "#1" is replaced by whatever is matched by the first wildcard.
Similarly #2 is replaced by the second, etc.
So you do something like
mmv "index*_type*.txt" "t#2_i#1.txt"
To rename index1_type9.txt to t9_i1.txt
mmv is not standard in many Linux distributions but is easily found on the net.
If you are using zsh you can also do this:
autoload zmv
zmv 'transform.php?dappName=Test&transformer=YAML&v_id=(*)' '$1.txt'
You write a fairly simple shell script in which the trickiest part is munging the name.
The outline of the script is easy (bash syntax here):
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv "$i" <modified name>
done
Modifying the name has many options. I think the easiest is probably an awk one-liner like
$(echo "$i" | awk -F'=' '{print $4}')
so...
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv "$i" "$(echo "$i" | awk -F'=' '{print $4}').txt"
done
update
Okay, as pointed out below, this won't necessarily work for a large enough list of files; the * will overrun the command line length limit. So, then you use:
$ find . -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' -prune -print |
while read -r reply
do
mv "$reply" "$(echo "$reply" | awk -F'=' '{print $4}').txt"
done
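The find-and-read loop can also be written without awk, using the shell's own ${var##pattern} expansion to cut the name at the last "=". The following is only a sketch: rename_all is a made-up function name, it assumes no file name contains a newline, and it is worth trying on a scratch directory first, as shown.

```shell
# rename_all DIR: rename every matching file under DIR to <id>.txt, where <id>
# is whatever follows the last "=" in the file name.
rename_all() {
    find "$1" -type f -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' |
    while IFS= read -r path; do
        id=${path##*=}      # strip everything up to the last "="
        mv -- "$path" "$(dirname "$path")/$id.txt"
    done
}

# Try it on a throwaway directory first.
demo=$(mktemp -d)
touch "$demo/transform.php?dappName=Test&transformer=YAML&v_id=12345" \
      "$demo/transform.php?dappName=Test&transformer=YAML&v_id=67890"
rename_all "$demo"
ls "$demo"
```

Each file stays in the directory where it was found, and only mv and dirname are run per file.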
Try the rename command
Or you could pipe the results of an ls into a perl regex.
You may use whatever you want to transform the name (perl, sed, awk, etc.). I'll use a python one-liner:
for file in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
mv "$file" "$(echo "$file" | python3 -c "print(input().split('=')[-1])").txt";
done
Here's the same script entirely in Python:
import glob, os
PATTERN = "transform.php?dappName=Test&transformer=YAML&v_id=*"
for filename in glob.iglob(PATTERN):
    newname = filename.split('=')[-1] + ".txt"
    print(filename, '==>', newname)
    os.rename(filename, newname)
Side note: you would have had an easier life saving the pages with the right name while grabbing them...
find -name '*v_id=*' | perl -lne'rename($_, qq($1.txt)) if /v_id=(\S+)/'
vimv lets you rename multiple files using Vim's text editing capabilities.
Entering vimv opens a Vim window which lists down all files and you can do pattern matching, visual select, etc to edit the names. After you exit Vim, the files will be renamed.
[Disclaimer: I'm the author of the tool]
I'd use ren-regexp, which is a Perl script that lets you mass-rename files very easily.
21:25:11 $ ls
transform.php?dappName=Test&transformer=YAML&v_id=12345
21:25:12 $ ren-regexp 's/transform.php.*v_id=(\d+)/$1.txt/' transform.php*
transform.php?dappName=Test&transformer=YAML&v_id=12345
1 12345.txt
21:26:33 $ ls
12345.txt
This should also work:
prfx='transform.php?dappName=Test&transformer=YAML&v_id='
ls "$prfx"* | sed "s/$prfx//" | xargs -Ipsx mv "$prfx"psx psx.txt
this renamer command would do it:
$ renamer --regex --find 'transform\.php\?dappName=Test&transformer=YAML&v_id=(\w+)' --replace '$1.txt' *
Ok, you need to be able to run a windows binary for this.
But if you can run Total Commander, do this:
Select all files with *, and hit ctrl-M
In the Search field, paste "transform.php?dappName=Test&transformer=YAML&v_id="
(Leave Replace empty)
Press Start
It doesn't get much simpler than that.
You can also rename using regular expressions via this dialog, and you see a realtime preview of how your files are going to be renamed.

Resources