Remove line feed in CSV using Unix script

I have a CSV file and I want to remove all line feeds (LF or \n) that appear between double quotes.
Can you please provide a Unix script to perform this task? I have given the input and expected output below.
Input:
No,Status,Date
1,"Success
Error",1/15/2018
2,"Success
Error
NA",2/15/2018
3,"Success
Error",3/15/2018
Expected output:
No,Status,Date
1,"Success Error",1/15/2018
2,"Success Error NA",2/15/2018
3,"Success Error",3/15/2018

I can't write everything for you, as I am not sure about your system or which bash version is running on it. But here are a couple of suggestions that you might want to consider, plus a rough sketch after the links.
https://www.unix.com/shell-programming-and-scripting/31021-removing-line-breaks-shell-variable.html
https://www.unix.com/shell-programming-and-scripting/19484-remove-line-feeds.html
How to remove carriage return from a string in Bash
https://unix.stackexchange.com/questions/57124/remove-newline-from-unix-variable
Remove line breaks in Bourne Shell from variable
https://unix.stackexchange.com/questions/254644/how-do-i-remove-newline-character-at-the-end-of-file
https://serverfault.com/questions/391360/remove-line-break-using-awk
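As a starting point, here is a minimal awk sketch (untested; "input.csv" is a placeholder filename) that joins physical lines until the accumulated record contains an even number of double quotes. It is not a full CSV parser and assumes fields never contain escaped quotes ("").
awk '{
    buf = (buf == "" ? $0 : buf " " $0)    # replace the embedded LF with a space
    if (gsub(/"/, "&", buf) % 2 == 0) {    # quotes balanced: the record is complete
        print buf
        buf = ""
    }
}' input.csv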

Related

How to parse @{TEST TAGS} into only the Tags, eliminating current formatting?

Situation: I have two tags defined, then I try to output them to the console. What comes out looks like an array, but I'd like to remove the formatting and just have the actual words output.
Here's what I currently have:
[Tags]    ready    ver10
Log To Console    \n@{TEST TAGS}
And the result is
['ready', 'ver10']
So, how would I chuck the [', the ', ' and the '], thus only retaining the words ready and ver10?
Note: I was getting [u'ready', u'ver10'], but once I got advice to make sure I was running Robot Framework under Python 3 (after uninstalling robotframework via pip, so it is now installed only via pip3), the u has vanished. That's great!
There are several ways to do it. For example, you could use a loop, or you could convert the list to a string before calling log to console.
Using a loop
Since the data is a list, it's easy to iterate over the list:
FOR    ${tag}    IN    @{TEST TAGS}
    log to console    ${tag}
END
Converting to a string
You can use the evaluate keyword to convert the list to a string of values separated by a newline. Note: you have to use two backslashes in the call to evaluate since both robot and python use the backslash as an escape character. So, the first backslash escapes the second so that python will see \n and convert it to a newline.
${tags}=    evaluate    "\\n".join($test_tags)
log to console    \n${tags}
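Another option is the built-in Catenate keyword with a newline separator (a sketch, using the built-in ${\n} variable in place of the escaped backslashes):
${tags}=    Catenate    SEPARATOR=${\n}    @{TEST TAGS}
log to console    \n${tags}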

Using com.opencsv.CSVReader on windows stops reading lines prematurely

I have two files that are identical except for the line ending codes. The one that uses the newline (Linux/Unix) character works (reads all 550 rows of data), while the one that uses carriage return plus line feed (Windows) stops returning lines after reading 269. In both cases the data is read correctly up to the point where it stops.
If I run dos2unix on the file that fails, the resulting file works.
I would like to be able to read CSV files regardless of their origin. If I could at least detect that the file is in the wrong format before reading part of the data, that would be helpful.
Even if I could tell at any point in the middle of reading the file that it was not going to work, I could output an error.
My current state of reading half the file and terminating with no error is dangerous.
The problem is that under the covers openCSV uses a BufferedReader, which reads a line from the stream until it gets to the system's line.separator.
If you know beforehand what the line separator of the file is, then in your application just do a System.setProperty("line.separator", newLine), where newLine is either "\n" or "\r\n" based on the file you are about to parse. Or you can pass that in as a parameter.
If you want to detect the file's line ending automatically, create a method that takes the file, creates a BufferedReader, and reads a single line. If the last character is a '\r', then your system uses "\n" but you want to set it to "\r\n". Else, if line.contains("\n") returns true, you are on a system that uses "\r\n" and you want to set it to "\n". Otherwise the system and the file you are reading have compatible line feed characters.
Just note that if you do change the system line feed character, be sure to set it back after processing the file, in case your program is processing multiple files.
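A rough sketch of that detection, scanning the raw characters instead of a parsed line (untested; detectLineSeparator and the file name are my own inventions, not part of openCSV):
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: return the first line terminator found in the file.
static String detectLineSeparator(String path) throws IOException {
    try (Reader r = new FileReader(path)) {
        int c;
        while ((c = r.read()) != -1) {
            if (c == '\r') {
                return (r.read() == '\n') ? "\r\n" : "\r";
            }
            if (c == '\n') {
                return "\n";
            }
        }
    }
    return System.lineSeparator(); // no terminator found (single-line file)
}

// Usage, following the workaround above: set the property, parse, restore.
String previous = System.getProperty("line.separator");
System.setProperty("line.separator", detectLineSeparator("data.csv"));
try {
    // ... read the file with com.opencsv.CSVReader as usual ...
} finally {
    System.setProperty("line.separator", previous);
}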

Avoid blank line at end of file when using writeLines

In R: Is it possible to avoid having a blank line at the end of a text file generated by writeLines? If not, is there any other way of generating a text file from within R without having a blank line at the end?
There is no blank line.
R (correctly) ends each line with '\n' (or '\r\n' on Windows). In other words, the file consists of lines, and each line ends with a line break.
Unfortunately, there are many tools (especially on Windows) which treat such files incorrectly and display an extra line at the end. However, that’s a fault with these tools, not with R. Consequently, this shouldn’t be fixed in the R code.
As a hack to appease buggy tools, the only recourse is to set the sep argument of writeLines to the empty string, '', and insert the line breaks between lines manually (using paste).
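For example, a minimal sketch of that hack ("mytext.txt" is just a placeholder name):
lines <- c("line1", "line2", "line3")
# join the lines ourselves, then suppress the terminating newline with sep = ''
writeLines(paste(lines, collapse = "\n"), "mytext.txt", sep = "")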
I had exactly the same concern (different grid, though) and even the comment on the accepted answer (by Konrad) did not work for me.
I found the answer here, and here is the full code:
fileConn <- file("mytext.txt")
writeLines(c("line1", "line2", "line3"), fileConn, sep = "\n")
# now connect to the UNIX server and upload your file
library(ssh)
session <- ssh_connect("user@server.com")
scp_upload(session, files = "mytext.txt")
# here is the trick: convert all the Windows extra chars to unix
ssh_exec_wait(session, command = "dos2unix mytext.txt")
# then start your Grid job
ssh_exec_wait(session, command = "sbatch mytext.txt")
ssh_disconnect(session)

Exporting SAS DataSet on to UNIX as a text file with delimiter '~|~'

I'm trying to export a SAS data set to a UNIX folder as a text file with the delimiter '~|~'.
Here is the code I'm using:
PROC EXPORT DATA=Exp_TXT
OUTFILE="/fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt"
DBMS=DLM REPLACE;
DELIMITER="~|~";
PUTNAMES=YES;
RUN;
Here is the output I'm getting on UNIX. The data rows contain only part of the delimiter, while the variable names get the whole delimiter:
Num~|~Name~|~Age
1~A~10
2~B~11
3~C~12
Any idea why I'm getting only part of the delimiter in the data?
Thanks,
Sam.
My guess is that PROC EXPORT does not support multi-character delimiters. Normally, column delimiters are just a single character, so you will probably need to write your own code to do this.
PROC EXPORT for delimited files generates plain old SAS code that is then executed. You should see the code in the SAS log, from where you can grab it and alter it as needed.
Please see my answer to this other question for a SAS macro that might help you. You cannot use it exactly as written, but it should help you create a version that meets your needs.
The problem is referenced on the SAS manual page for the FILE statement
http://support.sas.com/documentation/cdl/en/lestmtsref/63323/HTML/default/viewer.htm#n15o12lpyoe4gfn1y1vcp6xs6966.htm
Restriction: Even though a character string or character variable is accepted, only the first character of the string or variable is used as the output delimiter. The FILE DLM= processing differs from INFILE DELIMITER= processing.
However, there is (as of some version, anyhow) a newer FILE statement option, DLMSTR=. Unfortunately you can't use DLMSTR in PROC EXPORT, but if you can't easily write the variables out yourself, you can take the code that PROC EXPORT writes to the log, paste it into your program, and change DELIMITER to DLMSTR. You could even do so dynamically: use PROC PRINTTO to capture the log to a file, read that file in, parse out the line numbers and the non-code, change DELIMITER to DLMSTR, and %include the result.
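If you go the hand-written route instead, a minimal DATA step sketch might look like this (untested; the variable names Num, Name, and Age are guessed from your sample output):
data _null_;
    set Exp_TXT;
    file "/fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt" dlmstr='~|~';
    if _n_ = 1 then put 'Num~|~Name~|~Age';   /* header row */
    put Num Name Age;                          /* list output inserts the delimiter */
run;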
Since you are using Unix, why not make use of Unix tools to fix this?
You can call a Unix command from your SAS program with the X statement:
http://support.sas.com/documentation/cdl/en/hostunx/61879/HTML/default/viewer.htm#xcomm.htm
After your export, use sed to fix the file:
PROC EXPORT DATA=Exp_TXT
OUTFILE="/fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt"
DBMS=DLM REPLACE;
DELIMITER="~";
PUTNAMES=YES;
RUN;
X sed 's/~/~|~/g' /fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt > /fbrms01/dev/projects/tadis003/Export_txt_OF_New_v2.txt ;
It might take tweaking depending on your Unix, but this works on AIX. Some versions of sed support the -i flag to edit the file in place, so you don't have to type out the filename twice.
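With GNU sed, for instance, that would look like this (a sketch; the stock AIX sed does not support -i):
X sed -i 's/~/~|~/g' /fbrms01/dev/projects/tadis003/Export_txt_OF_New.txt ;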
It is a much simpler and easier single-line solution than a big macro.

How can I insert a column in numeric comma separated input?

Hi, I have a text file as below.
Input:
326783,326784,402
326783,0326784,402
503534,503535,403
503534,0503535,403
429759,429758,404
429759,0429758,404
409626,409627,405
409626,0409627,405
369917,369916,402
369917,0369916,403
I want to convert it as below, under these conditions:
1) Column 1 and column 3 should be the same across a pair of input lines (326784 paired with 0326784, and so on); such a pair is merged into one output line.
2) If they differ, as in the last pair of the input file, the two lines should be printed unchanged at the end.
The output should be:
326783,326784,0326784,402
503534,503535,0503535,403
429759,429758,0429758,404
409626,409627,0409627,405
369917,369916,402
369917,0369916,403
I am using the Solaris platform.
Please help me.
I don't fully understand the logic of your computation, but some general advice: the Unix tool awk can do such computations. It understands comma-separated files, and you can get it to output other comma-separated files, manipulated by your logic (which you'll have to express in awk syntax).
This is, as I understand it, the Unix way to do it.
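If I've understood your pairing rule, an awk sketch along these lines might do it (untested; "input" is a placeholder filename):
awk -F, '
NR % 2 == 1 { c1 = $1; c2 = $2; c3 = $3; next }              # stash the first line of each pair
$1 == c1 && $3 == c3 { print c1 "," c2 "," $2 "," $3; next } # columns 1 and 3 match: merge the pair
{ print c1 "," c2 "," c3; print }                            # mismatch: print both lines unchanged
' input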
The way I'd do it (being a non-expert on awk and just mentioning it for completeness) would be to write a little Python script.
You want to:
open an input and an output file
get each line from the input file
parse the integers
perform your logic
write integers to your output file
unchecked python-like code:
f_in = open("input", "r")
f_out = open("output", "w")
for line in f_in.readlines():
    ints = [int(x) for x in line.split(",")]
    f_out.write("%d, %d, %d\n" % (ints[0], ints[1], ints[0] + ints[1]))
f_in.close()
f_out.close()
Here, the logic is in the f_out.write(...) line (this example would output the first integer, the second, and the sum of both).
You can check whether you have a Python interpreter at hand by simply typing python and seeing what happens. If you have one, save your code into something.py and start it with python something.py.
