Want to change CSV file date column format - Unix

I'm a newbie to batch/shell scripting. I have a CSV file like this:
Id depId Name city Date prod
12345 52845 ken LA 08.08.2013 16:06:53 KLS22
25685 28725 Larry MA 09.03.2013 16:06:58 KLt35
58345 28545 ken LA 06.08.2013 16:06:53 KLS22
75885 98725 Gow CA 05.04.2013 16:06:58 KLt35
There are about 2000 records, and the columns are delimited by tabs. I would like to change the date column to the format:
DD_MM_YYYY_hh_mm_ss
I have tried something like this with awk:
awk -F '' '{ ("date -d \""$5"\" \"+%Y:%m/%d %T\"") | getline $5; print }' myfile.csv
but I get the wrong output.
I expect output like this:
Id depId Name city Date prod
58345 28545 ken LA 03_06_2013_23_00_00 KLS22
75885 98725 Gow CA 05_06_2013_23_00_00 KLt35
Please help out! Thanks!!

One way with awk:
$ awk 'NR>1{gsub(/\./,"_",$5);gsub(/:/,"_",$6);$5=$5"_"$6;$6=$NF;NF--}{$1=$1}1' OFS="\t" myfile.csv
Test:
$ cat temp
Id depId Name city Date prod
12345 52845 ken LA 8.8.2013 16:06:53 KLS22
25685 28725 Larry MA 9.3.2013 16:06:58 KLt35
58345 28545 ken LA 6.8.2013 16:06:53 KLS22
75885 98725 Gow CA 5.4.2013 16:06:58 KLt35
$ awk 'NR>1{gsub(/\./,"_",$5);gsub(/:/,"_",$6);$5=$5"_"$6;$6=$NF;NF--}{$1=$1}1' OFS="\t" temp
Id depId Name city Date prod
12345 52845 ken LA 8_8_2013_16_06_53 KLS22
25685 28725 Larry MA 9_3_2013_16_06_58 KLt35
58345 28545 ken LA 6_8_2013_16_06_53 KLS22
75885 98725 Gow CA 5_4_2013_16_06_58 KLt35
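For readability, the same logic can be written as a commented script (my expansion of the one-liner above; it also creates the sample file so the snippet is self-contained):

```shell
# Sample input, whitespace-separated as in the test above.
cat > myfile.csv <<'EOF'
Id depId Name city Date prod
12345 52845 ken LA 8.8.2013 16:06:53 KLS22
25685 28725 Larry MA 9.3.2013 16:06:58 KLt35
EOF

awk '
NR > 1 {
    gsub(/\./, "_", $5)   # 8.8.2013 -> 8_8_2013
    gsub(/:/,  "_", $6)   # 16:06:53 -> 16_06_53
    $5 = $5 "_" $6        # join date and time into one field
    $6 = $NF              # shift the product code left one field
    NF--                  # drop the now-duplicated last field
}
{ $1 = $1 }               # touch a field so awk rebuilds the record with OFS
1                         # print every record
' OFS="\t" myfile.csv
```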

A simpler approach, which does not check whether the string that looks like a date is really in the right column:
$ perl -pe 's/\t(\d)\.(\d)\.(\d\d\d\d) /sprintf("\t%04d-%02d-%02d ", $3, $2, $1) /e' t.csv
12345 52845 ken LA 2013-08-08 16:06:53 KLS22
25685 28725 Larry MA 2013-03-09 16:06:58 KLt35

Related

Trying to copy data from vertica table into csv using vsql

I was trying to export data from a Vertica table to a CSV file, but some values contain a comma ",", which pushes the rest of the row into the next column.
vsql -h server_address -p 5433 -U username -w password -F $',' -A -o sadumpfile_3.csv -c
"select full_name from company_name;" -P footer=off
Vertica table data and expected csv:
full_name
----------
Samsun pvt, ltd
Apple inc
abc, pvt ltd
Output sadumpfile_3.csv:
full_name
------------- ---------
Samsunpvt ltd
Apple inc
abc pvt ltd
Thanks in advance
Default behaviour (I have the four environment variables VSQL_USER, VSQL_PASSWORD, VSQL_HOST and VSQL_DATABASE set):
marco ~/1/Vertica/supp $ vsql -c "select full_name from company_name"
full_name
-----------------
Apple inc
Samsun pvt, ltd
abc, pvt ltd
(3 rows)
The simplest way to achieve what you were trying:
marco ~/1/Vertica/supp $ vsql -F ',' -A -c "select full_name from company_name;" -Pfooter
full_name
Apple inc
Samsun pvt, ltd
abc, pvt ltd
Note that the only commas are the ones already existing in the strings. If you only export one column, there's no field delimiter in the output.
I can only suppose that you want the output so that you can, for example, import it into Excel as CSV. If the field delimiter occurs inside a string, you need to enclose the string in (usually double) quotes.
Vertica has a function that encloses a string with double quotes: QUOTE_IDENT():
marco ~/1/Vertica/supp $ vsql -F ',' -A \
-c "select QUOTE_IDENT(full_name) AS full_name from company_name;" -Pfooter
full_name
"Apple inc"
"Samsun pvt, ltd"
"abc, pvt ltd"

Get Numeric Filename Suffix for Unix Split

The Unix split command produces filenames with suffixes aa to zz. I cannot change the split command itself; the only way is to rename the files afterwards.
I would like to change the suffixes aa to zz into 001, 002, ...
Could anyone help?
# ksh: split the file, then rename split's alphabetic suffixes
# (.aa, .ab, ...) to zero-padded numeric ones (.001, .002, ...).
typeset -i loopI
typeset -Z3 newSuf            # zero-fill the new suffix to 3 digits
typeset -i numFile
set -A suf aa ab ac ad ae af ag ah ai aj ak al am an ao ap aq ar as at au av aw ax ay az
numFile=${#suf[*]}            # number of suffixes we can handle
numFile=$numFile-1            # highest array index
fileName=foo
split -100 $fileName.txt $fileName.
loopI=0
cont=y
while [ $cont = "y" ]; do
    newSuf=$((loopI + 1))     # typeset -Z3 pads this to e.g. 001
    mv $fileName.${suf[$loopI]} $fileName.$newSuf
    loopI=$loopI+1
    if [ $loopI -le $numFile ]; then
        if [ ! -f $fileName.${suf[$loopI]} ]; then
            cont=n            # no more chunks left to rename
        fi
    else
        cont=n                # ran out of known suffixes
    fi
done
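If GNU coreutils is available, the loop is unnecessary: modern split can produce numeric suffixes directly (older versions lacked this, which is presumably why the rename script above exists):

```shell
# GNU split: -d gives numeric suffixes, -a 3 makes them three digits wide.
seq 1 250 > foo.txt
split -d -a 3 -l 100 foo.txt foo.
ls foo.[0-9]*        # creates foo.000, foo.001, foo.002
```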

Extract date from a text document in R

I am again here with an interesting problem.
I have a document like shown below:
"""UDAYA FILLING STATION ps\na MATTUPATTY ROAD oe\noe 4 MUNNAR Be:\nSeat 4 04865230318 Rat\nBree 4 ORIGINAL bepas e\n\noe: Han Die MC DE ER DC I se ek OO UO a Be ten\" % aot\n: ag 29-MAY-2019 14:02:23 [i\n— INVOICE NO: 292 hee fos\nae VEHICLE NO: NOT ENTERED Bea\nss NOZZLE NO : 1 ome\n- PRODUCT: PETROL ae\ne RATE : 75.01 INR/Ltr yee\n“| VOLUME: 1.33 Ltr ae\n~ 9 =6AMOUNT: 100.00 INR mae wae\nage, Ee pel Di EE I EE oe NE BE DO DC DE a De ee De ae Cate\notome S.1T. No : 27430268741C =. ver\nnes M.S.T. No: 27430268741V ae\n\nThank You! Visit Again\n""""
From the above document, I need to extract the date, which was highlighted in bold and italics.
I tried with the strptime function but did not get the desired results.
Any help will be greatly appreciated.
Thanks in advance.
Assuming you only want to capture a single date, you may use sub here:
text <- "UDAYA FILLING STATION ps\na MATTUPATTY ROAD oe\noe 4 MUNNAR Be:\nSeat 4 04865230318 Rat\nBree 4 ORIGINAL bepas e\n\noe: Han Die MC DE ER DC I se ek OO UO a Be ten\" % aot\n: ag 29-MAY-2019 14:02:23 [i\n— INVOICE NO: 292 hee fos\nae VEHICLE NO: NOT ENTERED Bea\nss NOZZLE NO : 1 ome\n- PRODUCT: PETROL ae\ne RATE : 75.01 INR/Ltr yee\n“| VOLUME: 1.33 Ltr ae\n~ 9 =6AMOUNT: 100.00 INR mae wae\nage, Ee pel Di EE I EE oe NE BE DO DC DE a De ee De ae Cate\notome S.1T. No : 27430268741C =. ver\nnes M.S.T. No: 27430268741V ae\n\nThank You! Visit Again\n"
date <- sub("^.*\\b(\\d{2}-[A-Z]+-\\d{4})\\b.*", "\\1", text)
date
[1] "29-MAY-2019"
If you need to match multiple such dates in your text, use regmatches along with gregexpr, which returns every match rather than just the first:
text <- "Hello World 29-MAY-2019 Goodbye World 01-JAN-2018"
regmatches(text, gregexpr("\\b\\d{2}-[A-Z]+-\\d{4}\\b", text))[[1]]
[1] "29-MAY-2019" "01-JAN-2018"

Sort column2 Unix

I have a problem with the sort command in Unix.
I have a text file that contains line like this:
de la (-0.167969404167593)
de l (-0.137148984295644)
la commission (0.0922090559997898)
à la (-0.115188946405936)
à l (-0.0936395578796088)
c est (0.130628584805583)
I want to sort these lines in descending order of the values in parentheses.
I tried this command:
sort 2fr -t"(" -k2r > 2frsort
But it does not sort correctly.
Any ideas, please?
Thanks
You can use numeric sort on the second column:
$ sort --numeric-sort --field-separator '(' --reverse --key 2 test.dat
c est (0.130628584805583)
la commission (0.0922090559997898)
à l (-0.0936395578796088)
à la (-0.115188946405936)
de l (-0.137148984295644)
de la (-0.167969404167593)
Are you using GNU sort? If so, the -g option handles exponential notation:
sort -t '(' -k 2 -g -r
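The difference shows up as soon as exponents appear: -n stops parsing at the first non-numeric character, while -g understands scientific notation. A quick illustration (assuming GNU sort):

```shell
printf '1e3\n5\n2e2\n' | sort -n   # 1e3, 2e2, 5  -- compares only the leading digits
printf '1e3\n5\n2e2\n' | sort -g   # 5, 2e2, 1e3  -- true numeric order
```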

Merging two files horizontally and formatting

I have two files as follows:
File_1
Austin
Los Angeles
York
San Ramon
File_2
Texas
California
New York
California
I want to merge them horizontally as follows:
Austin       Texas
Los Angeles  California
York         New York
San Ramon    California
I am able to merge them horizontally by using the paste command, but the formatting goes haywire:
Austin	Texas
Los Angeles	California
York	New York
San Ramon	California
I realize that paste is working as it is supposed to, but can someone point me in the right direction to get the formatting right.
Thanks.
paste uses a tab when 'merging' the files, so you may have to post-process the output and replace the tab with space padding:
paste File_1 File_2 | awk 'BEGIN { FS = "\t" } ; {printf("%-20s%s\n",$1,$2) }'
result:
Austin              Texas
Los Angeles         California
York                New York
San Ramon           California
First you have to check the number of characters in the longest line. Then you can pad each line from the first file to that length, and finish using paste.
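Measuring that width can itself be scripted; here is a sketch (my addition) that takes the longest line of File_1 as the printf field width:

```shell
# Recreate the sample files so the snippet is self-contained.
printf 'Austin\nLos Angeles\nYork\nSan Ramon\n' > File_1
printf 'Texas\nCalifornia\nNew York\nCalifornia\n' > File_2

# Width of the longest line in File_1 ("Los Angeles" -> 11).
w=$(awk '{ if (length($0) > m) m = length($0) } END { print m }' File_1)

# Pad the first column to that width, plus two spaces of gutter.
paste File_1 File_2 | awk -F'\t' -v w="$w" '{ printf "%-" w "s  %s\n", $1, $2 }'
```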
If you have an idea about the field width, you could do something like this:
IFS_BAK="$IFS"
IFS=$'\t'
paste file_1 file_2 \
| while read city state; do
printf "%-15s %-15s\n" "$city" "$state"
done
IFS="$IFS_BAK"
Or this shorter version:
paste file_1 file_2 | while IFS=$'\t' read city state; do
printf "%-15s %-15s\n" "$city" "$state"
done
Or use the column tool from bsdmainutils:
paste file_1 file_2 | column -s $'\t' -t
