How can I insert a column in numeric comma separated input? - unix

Hi, I have a text file as below.
Input:
326783,326784,402
326783,0326784,402
503534,503535,403
503534,0503535,403
429759,429758,404
429759,0429758,404
409626,409627,405
409626,0409627,405
369917,369916,402
369917,0369916,403
I want to convert it as shown below.
Conditions:
1) Columns 1 and 3 of the input file should be the same for 326784 and 0326784, and so on for each pair of lines; such pairs are merged into one line.
2) If they differ, as in the last pair of the input file above, the lines should be printed unchanged at the end.
The output should be:
326783,326784,0326784,402
503534,503535,0503535,403
429759,429758,0429758,404
409626,409627,0409627,405
369917,369916,402
369917,0369916,403
I am using the Solaris platform.
Please help me.

I don't understand the logic of your computation, but some general advice: the unix tool awk can do such computations. It understands comma-separated files and you can get it to output other comma-separated files, manipulated by your logic (which you'll have to express in awk syntax).
This is, as I understand it, the unix way to do it.
The way I'd do it (being a non-expert on awk and just mentioning it for completeness ;) would be to write a little python script.
you want to
open an input and an output file
get each line from the input file
parse the integers
perform your logic
write integers to your output file
unchecked python-like code:
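# the with-statement closes both files automatically
with open("input", "r") as f_in, open("output", "w") as f_out:
    for line in f_in:
        if not line.strip():
            continue  # skip blank lines
        # parse the comma-separated integers on this line
        ints = [int(x) for x in line.split(",")]
        # example logic: first value, second value, and their sum
        f_out.write("%d, %d, %d\n" % (ints[0], ints[1], ints[0] + ints[1]))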
Here, the logic is in the f_out.write(...) line (this example would output the first, the second and the sum of both input integers)
You can check if you have a Python interpreter at hand by simply typing python and seeing what happens. If you have, save your code into something.py and start it with "python something.py"
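For the pairing described in the question, a minimal sketch might look like the following. It rests on my reading of the conditions, which is an assumption: lines come in consecutive pairs, a pair is merged when columns 1 and 3 match, and pairs that do not match are written unchanged at the end. The function name merge_pairs is made up for illustration. Fields are kept as strings so that the leading zero in values like 0326784 is not lost, which int() would drop.

```python
def merge_pairs(lines):
    """Merge consecutive line pairs whose first and third fields agree.

    Assumption: input lines come in pairs; pairs that do not match
    are kept unchanged and moved to the end of the output.
    """
    merged, deferred = [], []
    # split each non-blank line into string fields (keeps leading zeros)
    rows = [ln.strip().split(",") for ln in lines if ln.strip()]
    for i in range(0, len(rows) - 1, 2):
        a, b = rows[i], rows[i + 1]
        if a[0] == b[0] and a[2] == b[2]:
            # columns 1 and 3 agree: emit col1, col2, col2 of second line, col3
            merged.append(",".join([a[0], a[1], b[1], a[2]]))
        else:
            deferred.extend([",".join(a), ",".join(b)])
    if len(rows) % 2:
        # an unpaired trailing line is passed through at the end
        deferred.append(",".join(rows[-1]))
    return merged + deferred


if __name__ == "__main__":
    sample = [
        "326783,326784,402",
        "326783,0326784,402",
        "369917,369916,402",
        "369917,0369916,403",
    ]
    for out_line in merge_pairs(sample):
        print(out_line)
```

To run it on a file, pass the open file object to merge_pairs and write the returned lines to the output file.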

Related

Optimized way to write a series strings to a text file without quotations

I am new to Julia so sorry if this question is obvious.
I am trying to use Julia to help me run a series of finite element models, which use a text input file to give instructions to the finite element solver. Basically, I would like to use Julia to read in the base input file, edit some parameters on some lines of the file and then write it as a new file. I am getting hung up on a couple things though.
Currently, I am reading in the file like this
mdl = "fullmodelSVTV"; #name of input file
A = readlines(mdl*".inp")
This reads each line of the file as a separate string in a vector, which I like because it makes it easier to edit the sections I want, but it also makes things more difficult when I try to write to a new file.
I am writing the file like this.
io = open("name.inp","w")
print(io,A)
close(io)
When I try to write to a new file, the output ends up looking like this:
["string at index 1","string at index 2","string at index 3"...].
What I would like is to write the output exactly as it was read in, with the string at each index of the vector on its own line. I would also like to remove the brackets and quotation marks from the file, as they might interfere with the finite element solver.
I think I have found a way to concatenate all of the strings at each index and separate them with a newline, as shown below.
conc = ""
for i in 1:length(A)
    conc = conc*"\n"*A[i]
end
The issue with this is that it takes a long time given the size of the input files I am working with, and I feel like there has to be a better way to achieve my goal.
I also cannot find a way to remove the brackets or quotation marks when writing the file.
So, I'm wondering if anyone has any advice for a better way to write these text files in terms of both concatenating all of the strings from the vector when outputting as well as outputting without the brackets and quotation marks.
Thanks, any advice is appreciated.
The issue with print(io,A) is that it is printing a representation of the vector, but in fact you want to print each element of the vector. To do so, you can simply print each line in a loop:
open("name.inp", "w") do io
    for line in A
        println(io, line)
    end
end
This avoids the overhead of string concatenation.

Remove line feed in CSV using Unix script

I have a CSV file and I want to remove all line feeds (LF or \n), but only those that occur between double quotes.
Can you please provide me with a Unix script to perform this task? I have given the input and expected output below.
Input :
No,Status,Date
1,"Success
Error",1/15/2018
2,"Success
Error
NA",2/15/2018
3,"Success
Error",3/15/2018
Expected output:
No,Status,Date
1,"Success Error",1/15/2018
2,"Success Error NA",2/15/2018
3,"Success Error",3/15/2018
I can't write everything for you, as I am not sure about your system or which bash version is running on it. But here are a few links that you might want to consider.
https://www.unix.com/shell-programming-and-scripting/31021-removing-line-breaks-shell-variable.html
https://www.unix.com/shell-programming-and-scripting/19484-remove-line-feeds.html
How to remove carriage return from a string in Bash
https://unix.stackexchange.com/questions/57124/remove-newline-from-unix-variable
Remove line breaks in Bourne Shell from variable
https://unix.stackexchange.com/questions/254644/how-do-i-remove-newline-character-at-the-end-of-file
https://serverfault.com/questions/391360/remove-line-break-using-awk
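As a rough sketch of one possible approach (not taken from the links above, and in Python rather than shell, assuming an interpreter is available on the system), you can walk through the text one character at a time, track whether you are inside double quotes, and replace a line feed with a space only while inside them. The function name join_quoted_newlines is made up for illustration. Doubled quotes ("") inside a field toggle the state twice, so they do not confuse the tracking.

```python
def join_quoted_newlines(text):
    """Replace line feeds that occur inside double quotes with a space."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes  # entering or leaving a quoted field
        if ch == "\n" and in_quotes:
            out.append(" ")  # line feed inside quotes becomes a space
        elif ch == "\r" and in_quotes:
            continue  # drop carriage returns inside quotes (CRLF files)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = 'No,Status,Date\n1,"Success\nError",1/15/2018\n'
    print(join_quoted_newlines(sample), end="")
```

For a real file, read its whole contents, pass them through the function, and write the result to a new file.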

readcsv fails to read # character in Julia

I've been using asd=readcsv(filename) to read a csv file in Julia.
The first row of the csv file contains strings which describe the column contents; the rest of the data is a mix of integers and floats. readcsv reads the numbers just fine, but only reads the first 4+1/2 string entries.
After that, it renders "". If I ask the REPL to display asd[1,:], it tells me it is 1x65 Array{Any,2}.
The fifth column in the first row of the csv file (this seems to be the entry it chokes on) is APP #1 bias voltage [V]; but asd[1,5] is just APP . So it looks to me as though readcsv has choked on the "#" character.
I tried using "quotes=false" keyword in readcsv, but it didn't help.
I used to use xlsread in Matlab and it worked fine.
Has anybody out there seen this sort of thing before?
The comment character in Julia is #, and this also applies when reading delimited text files: everything after a # on a line is treated as a comment and dropped.
But luckily, the readcsv() and readdlm() functions have an optional argument to help in these situations.
You should try readcsv(filename; comment_char = '/').
Of course, the example above assumes that you don't have any / characters in your first line. If you do, then you'll have to change that / above to something else.

.ksh paste user input value into dataset

Good morning.
First things first: I know next to nothing about shell scripting in Unix, so please pardon my naivety.
Here's what I'd like to do, and I think it's relatively simple: I would like to create a .ksh file to do two things: 1) take a user-provided numerical value (argument) and paste it into a new column at the end of a dataset (a separate .txt file), and 2) execute a different .ksh script.
I envision calling this script at the Unix prompt, with the input value added thereafter. Something like, "paste_and_run.ksh 58", where 58 would populate a new, final (un-headered) column in an existing dataset (specifically, it'd populate the 77th column).
To be perfectly honest, I'm not even sure where to start with this, so any input would be very appreciated. Apologies for the lack of code within the question. Please let me know if I can offer any more detail, and thank you for taking a look.
I have found the answer: the "nawk" command.
TheNumber=$3
PE_Infile=$1
Where the above variables correspond to the third and first arguments from the command line, respectively. "PE_Infile" represents the file (with full path) to be manipulated, and "TheNumber" represents the number to populate the final column. Then:
nawk -F"|" -v TheNewNumber="$TheNumber" '{print $0 "|" TheNewNumber/10000}' "$PE_Infile" > "$BinFolder/Temp_Input.txt"
Here, -F"|" sets the field delimiter, and -v passes the shell value into nawk as the variable TheNewNumber. Declaring this new variable is necessary because the nawk program sits inside single quotes, so the shell does not expand $TheNumber there. print $0 prints the whole line, followed by the "|" symbol and the command-line value divided by 10000. Finally, we have the input file and the output file (Temp_Input.txt, within the path held by the $BinFolder variable).
Running the desired script afterward was as simple as typing out the script name (with path), and adding corresponding arguments ($2 $3) afterward as needed, each separated by a space.
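For comparison, here is a small Python sketch of the same append-a-column step. It is only an illustration and not part of the ksh solution above; the function name append_column and its parameters are made up. It appends the given number divided by 10000 to every line, using "|" as the delimiter and formatting the value with "g", which gives up to six significant digits, as awk does by default.

```python
def append_column(lines, number, delimiter="|", divisor=10000):
    """Append number/divisor as a new final column on every line."""
    # format once; "g" gives up to 6 significant digits
    value = format(number / divisor, "g")
    return [line.rstrip("\n") + delimiter + value for line in lines]


if __name__ == "__main__":
    for row in append_column(["a|b|c\n", "d|e|f\n"], 58):
        print(row)
```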

Exporting SAS DataSet on to UNIX as a text file....with delimiter '~|~'

I'm trying to export a SAS data set to a UNIX folder as a text file with '~|~' as the delimiter.
Here is the code I'm using....
PROC EXPORT DATA=Exp_TXT
OUTFILE="/fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt"
DBMS=DLM REPLACE;
DELIMITER="~|~";
PUTNAMES=YES;
RUN;
Here is the output I'm getting on UNIX. Part of the delimiter is missing in the data rows, but the whole delimiter appears in the variable names:
Num~|~Name~|~Age
1~A~10
2~B~11
3~C~12
Any idea why I'm getting only part of the delimiter in the data?
Thanks,
Sam.
My guess is that PROC EXPORT does not support using multiple character delimiters. Normally, column delimiters are just a single character. So, you will probably need to write your own code to do this.
PROC EXPORT for delimited files generates plain old SAS code that is then executed. You should see the code in the SAS log, from where you can grab it and alter it as needed.
Please see my answer to this other question for a SAS macro that might help you. You cannot use it exactly as written, but it should help you create a version that meets your needs.
The problem is referenced on the SAS manual page for the FILE statement
http://support.sas.com/documentation/cdl/en/lestmtsref/63323/HTML/default/viewer.htm#n15o12lpyoe4gfn1y1vcp6xs6966.htm
Restriction: Even though a character string or character variable is accepted, only the first character of the string or variable is used as the output delimiter. The FILE DLM= processing differs from INFILE DELIMITER= processing.
However, there is (as of some version, anyhow) a newer option on the FILE statement, DLMSTR=, which uses the whole string as the delimiter. Unfortunately you can't use DLMSTR in PROC EXPORT, but if you can't easily write the variables out yourself, you can take the code that PROC EXPORT writes to the log, paste it into your program, and change DELIMITER to DLMSTR. You could even do so dynamically: use PROC PRINTTO to send the log to a file, then read in that file, strip out the line numbers and the non-code, change DELIMITER to DLMSTR, and %include the code.
Since you are using unix, why not make use of unix tools to fix this?
You can call the unix command from your sas program with the X statement:
http://support.sas.com/documentation/cdl/en/hostunx/61879/HTML/default/viewer.htm#xcomm.htm
After your export, use sed to fix the file:
PROC EXPORT DATA=Exp_TXT
OUTFILE="/fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt"
DBMS=DLM REPLACE;
DELIMITER="~";
PUTNAMES=YES;
RUN;
X sed 's/~/~|~/g' /fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt > /fbrms01/dev/projects/tadis003/Export_txt_OF_New_v2.txt ;
It might take tweaking depending on your unix, but this works on AIX. Some versions of sed can use the -i flag to edit in place so you don't have to type out the filename twice.
It is a much simpler and easier single-line solution than a big macro.
