tcsh reading columns of a text file into variables

I have a text file that looks like:
Jose Santiago:385-898-8357:385-555-5555:38 Fife Way, Abilene, TX 39673:1/5/58:95600
Tommy Savage:408-724-0140:408-777-0121:1222 Oxbow Court, Sunnyvale, CA 94087:5/19/66:34200
Yukio Takeshida:387-827-1095:387-888-1198:13 Uno Lane, Ashville, NC 23556:7/1/29:57000
Vinh Tranh:438-910-7449:438-999-0000:8235 Maple Street, Wilmington, VM 29085:9/23/63:68900
I am trying to write a tcsh script that will read the text file and assign each colon delimited field to a variable, with the exception of the name, which I want set to two separate variables. I have tried several things, but can't get this to work. I'm sorry, but I'm a novice. Thanks in advance for any help.

Assuming the name always consists of exactly two words separated by a single space, you can first replace that space with a : and then use cut to read each field into a variable:
sed 's/ /:/' <filename >new_file
set var1 = `cut -f1 -d ':' new_file`
set var2 = `cut -f2 -d ':' new_file`
set var3 = `cut -f3 -d ':' new_file`
etc... for each field in your file
PS: If you don't mind modifying your original file, you can do the replacement in place (a backup is kept as filename.bak):
sed -i.bak 's/ /:/' filename
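If tcsh is not a hard requirement, the same split can be done line by line. The following is a minimal sketch in POSIX sh, not tcsh. The function name read_fields and the variable names are made up for illustration, and it assumes exactly six colon-separated fields with a two-word name, as in the sample data:

```shell
# Hypothetical helper: prints first name, last name and salary for each record.
read_fields() {
    # $1: path to the colon-delimited file
    while IFS=: read -r name home work address birthday salary; do
        first=${name%% *}   # everything before the first space
        last=${name#* }     # everything after the first space
        printf '%s|%s|%s\n' "$first" "$last" "$salary"
    done < "$1"
}
```

Calling read_fields filename prints one line per record. Inside the loop, every field is available as an ordinary shell variable.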

Related

Change multiple filenames unix

I had to download 15GB of data and for some reason during the downloading process the filenames were messed up in a way so that instead of
test_file.txt
the filenames are doubled, so it's
test_file.txttest_file.txt
instead. My only idea is to count the characters and then rename each file by deleting the first or second half of the filename. Is there a way to do that? The filenames are not consistent, so the same folder might also contain files named
files_are_great.txtfiles_are_great.txt
so I'm struggling to find a way to loop over them.
Thanks a lot!
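The command sed 's/^\(.*\)\1$/\1/' replaces a line that consists of some string repeated twice with a single copy of that string. It does not depend on a particular part of the file name such as .txt, and it allows spaces in the string. The ^ and $ anchors matter: without them, any repeated substring would be collapsed, so a name like aab.txt would become ab.txt.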
Example:
echo 'abc defabc def' | sed 's/^\(.*\)\1$/\1/'
prints
abc def
Explanation of the sed command:
^ anchors the pattern to the beginning of the line
.* is 0 or more occurrences of any character
\(...\) captures what matches the pattern in between
\1 is a reference to the first capture group, i.e. the text that was found before
$ anchors the search pattern to the end of the line
This results in a search pattern that matches a whole line that consists of any text followed by the same text.
\1 in the replacement is the same reference to the matched text, i.e. a single occurrence of the duplicated text.
Any input that does not match the pattern will remain unchanged.
Assuming you want to rename all files in the current directory you can use it like this
for file in *
do
new=$(echo "$file" | sed 's/^\(.*\)\1$/\1/')
[ "$file" = "$new" ] || mv "$file" "$new"
done
As the sed command does not change non-matching input, $new will be the same as $file for file names that don't consist of a duplicated string. Calling mv with identical source and target would produce an error message, so the renaming is skipped in that case.
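Before renaming 15GB worth of files, it may be worth previewing what would change. The following is a minimal dry-run sketch of the loop above. The function name preview_renames is hypothetical, and it only prints the old and new names without calling mv:

```shell
# Hypothetical helper: lists the renames the loop would perform in directory $1.
preview_renames() {
    for file in "$1"/*; do
        [ -e "$file" ] || continue      # skip the literal pattern if the directory is empty
        old=${file##*/}                 # strip the directory part
        new=$(printf '%s\n' "$old" | sed 's/^\(.*\)\1$/\1/')
        # Only report names that actually consist of a doubled string.
        [ "$old" = "$new" ] || printf '%s -> %s\n' "$old" "$new"
    done
}
```

Running preview_renames . in the download folder lists only the files whose names are doubled. Names that are not an exact duplicate are left out of the listing.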
Using sed
sed 's#\(\.txt\)#& #g'
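Explanation: & in the replacement stands for the entire matched text, so this command appends a space after every .txt. The \( \) group is not strictly needed here; its contents could equally be referenced as \1.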
Demo:
echo "files_are_great.txtfiles_are_great.txt" | sed 's#\(\.txt\)#& #g'
files_are_great.txt files_are_great.txt
For renaming:
for file_name in *txt*txt
do
new_file_name=$(echo "$file_name" | sed 's#\(\.txt\)#& #g' | cut -d' ' -f1)
mv "$file_name" "$new_file_name"
done

How to read nth line and mth field of text file in unix

Suppose I have a |-delimited file:
Line1: 1|2|3|4
Line2: 5|6|7|8
Line3: 9|9|1|0
Now I need to read the 3rd field of the second line, which is 7 in the example above. How can I do that using the cut or sed command? I'm new to Unix, please help.
A job for awk:
awk -F '|' 'NR==2{print $3}' file
or
awk -F '|' -v row=2 -v col=3 'NR==row{print $col}' file
Output:
7
This should work:
sed -n '2p' file |awk -F '|' '{print $3}'
This might work for you (GNU sed):
sed -rn '2s/^(([^|]*)\|?){3}.*/\2/p' file
The -n option turns off automatic printing, and -r enables extended regular expression syntax. Pattern matching and back references replace the whole of the second line with its third field, and the p flag prints the result.
The address 2 restricts the substitution command to the second line.
The outer group matches a run of non-delimiter characters followed by an optional delimiter, and {3} repeats it three times. The inner (second) group captures only the non-delimiter characters. Each repetition overwrites the previous capture, so after the third repetition \2 holds the third field. The .* consumes the remainder of the line, so only the third field (the contents of the second group) is printed.
N.B. there is no delimiter after the final column, which is why the delimiter is optional: \|?
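Since the question asks specifically for cut or sed, here is a minimal sketch that combines the two: sed selects the line and cut selects the field. The function name field_at and its parameter order are hypothetical, and it assumes a | delimiter as in the example:

```shell
# Hypothetical helper: print field $3 of line $2 in file $1 (both 1-based).
field_at() {
    # sed -n "Np" prints only line N; cut then picks the requested field.
    sed -n "${2}p" "$1" | cut -d '|' -f "$3"
}
```

For the sample data, field_at file 2 3 would select the second line and then its third |-separated field.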

Use of cut command in unix

Suppose I have a string as follows:
Rajat;Harshit Srivastava;Mayank 123;5
Now I want the following result using the cut command:
Rajat
Harshit Srivastava
Mayank 123
5
I have tried, but cut does not work on strings containing spaces.
man cut would tell you:
-d, --delimiter=DELIM
use DELIM instead of TAB for field delimiter
--output-delimiter=STRING
use STRING as the output delimiter the default is to use the
input delimiter
If you insist on using cut for changing the delimiters:
$ echo "Rajat;Harshit Srivastava;Mayank 123;5" | cut -d \; --output-delimiter=\ -f 1-
Rajat Harshit Srivastava Mayank 123 5
but instead you should use sed or tr or awk for it. Try man tr, for example.
Try this
echo "Rajat;Harshit Srivastava;Mayank 123;5" | sed 's/;/ /g'
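The question asks for one field per line rather than a single space-separated line. A minimal sketch of the tr approach mentioned above, assuming ; is the delimiter; the function name split_fields is hypothetical:

```shell
# Hypothetical helper: print each ;-separated field of $1 on its own line.
split_fields() {
    # tr replaces every ; with a newline; spaces inside fields are untouched.
    printf '%s\n' "$1" | tr ';' '\n'
}

split_fields 'Rajat;Harshit Srivastava;Mayank 123;5'
```

Because tr only translates the delimiter character, fields that contain spaces, such as Harshit Srivastava, stay on a single line.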

Not able to assign sed output to a variable

I am not able to capture the output of sed and cut, used together, in a variable. Below is the relevant snippet of the script:
max=$(sed -n '1,${/$i/p;q;}' $file | cut -d "," -f2)
When I print the value of max, it is blank. But the same line works fine when I run it directly in the terminal, like this:
sed -n '1,${/$i/p;q;}' $file | cut -d "," -f2
I am not able to understand why the assignment is failing. Could anyone please help me out here?
Regards,
Sayantan
As stated in the comments (and worked out for OP):
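Inside single quotes the shell does not expand $i, so sed receives the two literal characters $i as the pattern instead of the variable's value. That pattern matches only a literal $i (or, if $ is read as an end-of-line anchor, nothing at all), so no output is produced.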
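A minimal sketch of the fix, assuming the intent was to print the second comma-separated field of the first line that matches the pattern. The function name and its parameters are hypothetical. The sed script is placed in double quotes so the shell expands the variable, and the braces are moved so that q only runs on a matching line:

```shell
# Hypothetical helper: $1 is the pattern (the role of $i above), $2 the file.
first_match_field2() {
    # Double quotes let the shell substitute $1 before sed sees the script.
    # {p;q;} prints the first matching line and then stops reading the file.
    sed -n "/$1/{p;q;}" "$2" | cut -d ',' -f 2
}
```

The result can then be captured as usual, for example max=$(first_match_field2 "$i" "$file"). Note that this sketch breaks if the pattern contains a /, which would need escaping or a different regex delimiter.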

Extract Middle Substring from a given String in Unix

I have strings like the following, with parts of varying length:
WATSON_AJAY_AB04_DOTHING.data
WATSON_NAVNEET_CK4_DOTHING.data
WATSON_PRASHANTH_KJ56_DOTHING.data
WATSON_ABHINAV_KD323_DOTHING.data
From the strings above, how can I extract
AB04,CK4,KJ56,KD323
in Unix?
echo "$string" | cut -d'_' -f3
You could use sed or grep for this task. But since the string is so simple, I don't think you will need to.
One method is to use the cut command. Below is an example run directly on the bash command line (the <<< here-string is a bash feature):
jimm#pi$ string='WATSON_AJAY_AB04_DOTHING.data'
jimm#pi$ cut -d '_' -f 3 <<< "$string"
AB04 <-- outputs the result directly
(edit: of course Lucas' answer above is also a quick 'one-liner' that does the same thing as above - he beat me to it) :)
The cut will take an _ character as the delimiter (the -d '_' part), then display the 3rd slice of the string (the -f 3 part).
Or, if you want to output that 3rd slice from a list of content (using your list above), you can write a simple shell script.
First, save the lines above ('WATSON...etc') into something like text.txt. Then open up your favorite text editor and type:
#!/bin/sh
cut -d '_' -f 3 < "$1"
Save that script to some useful name like slice.sh, and make sure it is executable with something like chmod 775 slice.sh.
Then at the command line you can execute the script against your text file, and immediately get an output of those parts of the file you want (in this case the third set of text, separated by the _ character):
$ ./slice.sh text.txt
AB04
CK4
KJ56
KD323
Hope that helps! Bear in mind that the commands above may vary a bit, depending on the flavor of *nix you are using, but it should at least point you in the right direction.
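If the comma-separated form shown in the question (AB04,CK4,KJ56,KD323) is what is wanted, the per-line output of cut can be joined with paste. This is a minimal sketch; the function name third_fields_csv is hypothetical, and it assumes one underscore-delimited string per line, as in text.txt above:

```shell
# Hypothetical helper: $1 is a file with one underscore-delimited string per line.
third_fields_csv() {
    # cut picks the third field of every line; paste -s joins the lines with commas.
    cut -d '_' -f 3 "$1" | paste -s -d ',' -
}
```

Here paste -s reads all input lines serially and -d ',' sets the character placed between them.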
