Function to create the array by reading the file - unix

I am creating scripts that store the contents of a pipe-delimited file. Each column is stored in a separate array; I then read the information from the arrays and process it. There are 20 pipe-delimited files, so I need to write 20 scripts. The processing that happens in each script after the information is stored in the arrays is different, and the number of columns in each file is different (but in no case would it be more than 9 columns). I need to do this array-loading step at the beginning of each script. The way I am doing it at present is given below. I would like help understanding how I can write a function to do this.
cat > example_file.txt <<End-of-message
some text first row|other text first row|some other text first row
some text nth row|other text nth row|some other text nth row
End-of-message
# Note that example_file.txt will be available; I have created it inside the script just to show you the format of the file
OIFS=$IFS
IFS='|'
i=0
while read -r first second third ignore
do
    first_arr[$i]=$first
    second_arr[$i]=$second
    third_arr[$i]=$third
    (( i=i+1 ))
done < example_file.txt
IFS=$OIFS

Here is a sort-of minimal change to your script that should get you further...
...
...
while read -r first second third ignore
do
    arr0[$i]=$first
    arr1[$i]=$second
    arr2[$i]=$third
    (( i=i+1 ))
done < example_file.txt
IFS=$OIFS
proc0 () {
    for j in "$@"; do
        echo proc0 : "$j"
    done
}
proc1 () {
    echo proc1
}
proc2 () {
    echo proc2
}
for i in 0 1 2; do
    # indirect expansion: t=arr0[@] makes "${!t}" expand like "${arr0[@]}"
    t=arr$i'[@]'
    proc$i "${!t}"
done
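To factor the loading step into a reusable function, here is a minimal sketch, assuming bash 4.1+ (for printf -v with an array subscript); load_columns and the array names are hypothetical. It reads the file once and fills one array per column name passed in, so the same function works for any of your 20 files with up to 9 columns:
load_columns () {
    local file=$1; shift
    local -a names=("$@") fields
    local i=0 j
    while IFS='|' read -r -a fields; do
        for j in "${!names[@]}"; do
            # assign field j to the i-th element of the j-th named array
            printf -v "${names[j]}[$i]" '%s' "${fields[j]}"
        done
        (( i = i + 1 ))
    done < "$file"
}
load_columns example_file.txt first_arr second_arr third_arr
echo "${second_arr[0]}"   # -> other text first row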


Need of awk command explanation

I want to know how the below command is working.
awk '/Conditional jump or move depends on uninitialised value/ {block=1} block {str=str sep $0; sep=RS} /^==.*== $/ {block=0; if (str!~/oracle/ && str!~/OCI/ && str!~/tuxedo1222/ && str!~/vprintf/ && str!~/vfprintf/ && str!~/vtrace/) { if (str!~/^$/){print str}} str=sep=""}' file_name.txt >> CondJump_val.txt
I'd also like to know how to check the texts Oracle, OCI, and so on from the second line only. 
The first step is to rewrite it so that it's easier to read:
awk '
/Conditional jump or move depends on uninitialised value/ {block=1}
block {
    str = str sep $0
    sep = RS
}
/^==.*== $/ {
    block = 0
    if (str !~ /oracle/ && str !~ /OCI/ && str !~ /tuxedo1222/ && str !~ /vprintf/ && str !~ /vfprintf/ && str !~ /vtrace/) {
        if (str !~ /^$/) {
            print str
        }
    }
    str = sep = ""
}
' file_name.txt >> CondJump_val.txt
It accumulates lines into the variable str, starting at a line matching "Conditional jump ..." and ending at a line matching "==...== ".
If the accumulated string does not match any of the excluded patterns, it is printed.
I'd also like to know how to check the texts Oracle, OCI, and so on from the second line only.
What does that mean? I assume you don't want to see the "Conditional jump..." line in the output. If that's the case then use the next command to jump to the next line of input.
/Conditional jump or move depends on uninitialised value/ {
block=1
next
}
Perhaps consolidate those regexes into a single alternation?
if (str !~ "oracle|OCI|tuxedo1222|v[f]?printf|vtrace") {
print str
}
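Putting the next and the consolidated regex together, a sketch of the whole command might look like this (same behavior as the original, except the "Conditional jump..." line itself is no longer included in the output):
awk '
/Conditional jump or move depends on uninitialised value/ {
    block = 1
    next                # start collecting from the second line only
}
block {
    str = str sep $0
    sep = RS
}
/^==.*== $/ {
    block = 0
    if (str != "" && str !~ "oracle|OCI|tuxedo1222|v[f]?printf|vtrace")
        print str
    str = sep = ""
}
' file_name.txt >> CondJump_val.txt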
There are two idiomatic awkisms to understand.
The first can be simplified to this:
$ seq 100 | awk '/^22$/{flag=1}
/^31$/{flag=0}
flag'
22
23
...
30
Why does this work? In awk, flag can be tested even when it has not yet been defined, which is what the standalone flag pattern is doing: an input line is only printed while flag is true, and flag=1 is only executed after the regex /^22$/ matches. flag stops being true once the regex /^31$/ matches in this simple example.
This is an awk idiom for executing code between two regex matches on different lines.
In your case, the two regexes are:
/Conditional jump or move depends on uninitialised value/ # start
# in-between, block is true and collect the input into str separated by RS
/^==.*== $/ # end
The other 'awkism' is this:
block {str=str sep $0; sep=RS}
When block is true, $0 is appended to str. sep is empty the first time through, so RS is only inserted between lines, never before the first one. The result is:
str="first lineRSsecond lineRSthird lineRS..."
Both idioms depend on awk being able to use an undefined variable, which it treats as empty (or zero), without error.
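You can see the effect of adding next to the start rule in the same toy example; the line that flips the flag on is no longer printed:
$ seq 100 | awk '/^22$/{flag=1; next}
/^31$/{flag=0}
flag'
23
24
...
30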

How to get the variable's name from a file using source command in UNIX?

I have a file named param1.txt which contains certain variables. I have another file, source1.txt, which contains placeholders. I want to replace the placeholders with the values of the variables that I get from the parameter file.
I have basically hard-coded the script so far, with the variable names in param1.txt known beforehand. I would like a dynamic solution where the variable names are not known beforehand. In other words, is there any way to find out the variable names in a file using the source command in UNIX?
Here is my script and the files.
Script:
#!/bin/bash
source /root/parameters/param1.txt
sed "s/{DB_NAME}/$DB_NAME/gI;
     s/{PLANT_NAME}/$PLANT_NAME/gI" \
    /root/sources/source1.txt > /root/parameters/Output.txt
param1.txt:
PLANT_NAME=abc
DB_NAME=gef
source1.txt:
kdashkdhkasdkj {PLANT_NAME}
jhdbjhasdjdhas kashdkahdk asdkhakdshk
hfkahfkajdfk ljsadjalsdj {PLANT_NAME}
{DB_NAME}
I cannot comment since I don't have enough points, but is this what you're looking for:
How to reference a file for variables using Bash?
Your problem statement isn't very clear to me. Perhaps you can simplify your problem and desired state.
I don't understand why you try to source param1.txt.
You can try this awk:
awk '
NR == FNR {
    a[$1] = $2
    next
}
{
    for ( i = 1 ; i <= NF ; i++ ) {
        b = $i
        gsub ( "^{|}$" , "" , b )
        if ( b in a )
            sub ( "{" b "}" , a[b] , $i )
    }
} 1' FS='=' param1.txt FS=" " source1.txt
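If you'd rather stay in the shell, here is a hedged sketch that builds the sed script dynamically from param1.txt, so the variable names don't need to be known beforehand. It assumes the values contain no characters that are special to sed, such as / or &:
#!/bin/bash
sed_script=""
while IFS='=' read -r name value; do
    [ -n "$name" ] || continue               # skip blank lines
    sed_script+="s/{$name}/$value/gI;"       # one substitution per variable
done < /root/parameters/param1.txt
sed "$sed_script" /root/sources/source1.txt > /root/parameters/Output.txt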

Write into tcl dictionary

I am relatively new to tcl dictionaries and haven't found good documentation on how to initialize an empty dictionary, loop over a log, and save data into it. Finally I want to print a table that looks like this:
- Table:
HEAD1
Step 1 Start Time End Time
Step 2 Start Time End Time
- Log:
HEAD1
Step1
Start Time : 10am
.
.
.
End Time: 11am
Step2
Start Time : 11am
.
.
End time : 12pm
HEAD2
Step3
Start Time : 12pm
.
.
.
End Time: 1pm
Step4
Start Time : 1pm
.
.
End time : 2pm
You really don't have to initialise an empty dictionary in Tcl - you can simply start using it and it will get populated as you go along. As mentioned already, the dict man page is the best place to start.
Additionally, I would suggest you check the regexp man page as you can use it nicely to parse your text file.
Not having anything better to do atm, I cobbled together a short sample code that should get you started. Use it as a starting tip, adjust it to your particular log layout and add some defensive measures to prevent errors from unexpected input.
# The following line is not strictly necessary as Tcl does not
# require you to first create an empty dictionary.
# You can simply start using 'dict set' commands below and the first
# one will create a dictionary for you.
# However, declaring something as a dict does add to code clarity.
set tableDict [dict create]
# Depending on your log sanity, you may want to declare some defaults
# so as to avoid errors in case the log file misses one of the expected
# lines (e.g. 'HEADx' is missing).
set headNumber {UNKNOWN}
set stepNumber {UNKNOWN}
set start {UNKNOWN}
set stop {UNKNOWN}
# Now read the file line by line and extract the interesting info.
# If the file indeed contains all of the required lines and exactly
# formatted as in your example, this should work.
# If there are discrepancies, adjust regex accordingly.
set log [open log.txt]
while {[gets $log line] != -1} {
    if {[regexp {HEAD([0-9]+)} $line all value]} {
        set headNumber $value
    }
    if {[regexp {Step([0-9]+)} $line all value]} {
        set stepNumber $value
    }
    if {[regexp {Start Time : ([0-9]+(?:am|pm))} $line all value]} {
        set start $value
    }
    # NOTE: I am assuming that your example was typed by hand and all
    # inconsistencies stem from there. Otherwise, you have to adjust
    # the regular expressions, as 'End Time' is written with varying
    # capitalization and with inconsistent white space around ':'
    if {[regexp {End Time : ([0-9]+(?:am|pm))} $line all value]} {
        set stop $value
        # NOTE: This short example relies heavily on the log file
        # being formatted exactly as described. Therefore, as soon
        # as we find the 'End Time' line, we assume that we already
        # have everything necessary for the next dictionary entry.
        dict set tableDict HEAD$headNumber Step$stepNumber StartTime $start
        dict set tableDict HEAD$headNumber Step$stepNumber EndTime $stop
    }
}
close $log
# You can now get your data from the dictionary and output your table
foreach head [dict keys $tableDict] {
    puts $head
    foreach step [dict keys [dict get $tableDict $head]] {
        set start [dict get $tableDict $head $step StartTime]
        set stop [dict get $tableDict $head $step EndTime]
        puts "$step $start $stop"
    }
}
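Assuming the regexes are adjusted for the spacing and capitalization inconsistencies noted above, the sample log should produce output along these lines:
HEAD1
Step1 10am 11am
Step2 11am 12pm
HEAD2
Step3 12pm 1pm
Step4 1pm 2pm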

SAS Input Statement

I have an autoexec file that encrypts my password when I'm connecting to different servers. The code looks as follows:
%global wspwd ewspwd hpwd ehpwd ;
/* Enter WORKSTATION Password Below */
filename ewspwdfl "/home/&sysuserid./ewspwd.txt" ;
proc pwencode in='XXXXXXXX' out=ewspwdfl ; run ;
data _null_ ;
  infile ewspwdfl obs=1 length=l ;
  input # ;
  input #1 line1 $varying1024. l ;
  call symput('ewspwd',cats(substr(line1,1,l))) ;
  call symput('wspwd',cats('XXXXXXXX')) ;
run ;
My question is: why is
input # ;
included, and why does
input #1 line1 $varying1024. l ;
not suffice on its own?
Whenever I have created datasets with SAS I have never had to include "input #;" in my statement. I simply write something along the lines of:
input #1 firstname $ #15 lastname $ #30 date mmddyy6.;
You don't need it for that data step. You could simplify it to:
data _null_ ;
  infile ewspwdfl obs=1 TRUNCOVER ;
  input line1 $CHAR1024. ;
  call symputx('ewspwd',line1) ;
  call symputx('wspwd','XXXXXXXX') ;
run ;
Using input # is a good way to create a program where you want to read different lines using different input statements. You could test the content of the _infile_ variable and execute different parts of the data step based on what is read.
It is also useful together with the EOV= option on the INFILE statement to detect when you are starting to read from a new file, since EOV is not set until you begin reading the new file. The INPUT # gets SAS to begin reading, so that the EOV variable is set, but holds the line so your real INPUT statement can read it later.
The #1 is useful if you want to read the same columns over again into different variables. For example, you might want to read the first few characters as a string, test them, and based on what you find re-read the line as a number or a date.
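As an illustration of that last point, here is a hedged sketch (the fileref rawfl and the record layout are hypothetical): input # holds the line, _infile_ is inspected, and the same columns are then re-read with the format the content calls for.
data example ;
  infile rawfl truncover ;
  input # ;                      /* load the line but hold it */
  if _infile_ =: 'DATE' then
    input #1 @6 dt yymmdd10. ;   /* re-read columns 6+ as a date */
  else
    input #1 @6 num 8. ;         /* or re-read them as a number */
run ;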

pattern matching and delete all the lines except the last occurrence

I have a txt file with 100+ lines. I want to search for a pattern and delete every matching line except the last occurrence.
Here are the lines from the txt file.
The patterns I search for are "string1=", "string2=", "string3=", "string4=" and "string5=":
string1=hi
string2=hello
string3=welcome
string3=welcome1
string3=
string4=hi
string5=hello
I want to go through each line, keep the empty "string3=" line in the file, and remove "string3=welcome" and "string3=welcome1".
Please help me.
For a single pattern, you can start with something like this:
grep "string3" input | tail -1
#!/usr/bin/perl
use strict;
use warnings;

my %h;
while (<STDIN>) {
    # later assignments overwrite earlier ones, so only the last
    # occurrence of each key survives
    my ($k, $v) = split /=/;
    $h{$k} = $v;
}
foreach my $k ( sort keys %h ) {
    print "$k=$h{$k}";
}
The perl script here takes your list on stdin and prints the output you describe. This assumes you want the keys (string*) sorted in the output.
If you only want the lines that start with string1-5, you can put a match at the beginning of the while loop, like so:
next if ! /^string[1-5]=/;
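Hypothetical usage, assuming the script above is saved as last_occurrence.pl and your sample lines are in input.txt:
$ perl last_occurrence.pl < input.txt
string1=hi
string2=hello
string3=
string4=hi
string5=hello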
