Does anyone know an easy way to convert a .csv file, or a data frame read from that .csv into R, to a comma-delimited .txt file?
For my analysis / the package to work I need a format like this:
cell1, cell2, cell3, ...
So comma then space between every cell of every row in my dataset.
In R I've tried:
write.table(df,"df.txt",sep=", ", na = "",row.names=FALSE, col.names = FALSE, append = FALSE)
However, the output looks like this:
"cell1", "cell2", "cell3", "", "", "",
First of all, the "" are an issue that I need to get rid of.
Secondly, the rows have different lengths, so the empty cells in shorter rows are written as "", which also causes problems when running this in the package.
Other people using the package told me to use GNU Emacs to convert it to the .txt file I need, but I have no experience with this editor and it seems rather complicated to learn just for the conversion of this one (big) file.
Cheers!
Edit:
Figured out how to get rid of the "" but still got the issue of the empty cells being separated by commas
write.table(test,"test2.txt",sep=", ", na = "", eol = "\r\n", row.names=FALSE, col.names = FALSE, append = FALSE, quote = FALSE)
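One possible approach, sketched here under assumptions rather than taken from the package's documentation, is to build each output line yourself: drop the missing cells of a row first, then join the remaining ones with ", ". The data frame, the helper name row_to_line and the output path below are hypothetical, and the sketch assumes that shorter rows are padded with NA or "".

```r
# Hypothetical ragged data: the shorter second row is padded with NA
df <- data.frame(a = c("cell1", "cell4"),
                 b = c("cell2", "cell5"),
                 c = c("cell3", NA),
                 stringsAsFactors = FALSE)

# Join the non-missing, non-empty cells of one row with ", "
row_to_line <- function(row) {
  cells <- trimws(as.character(row))
  cells <- cells[!is.na(cells) & cells != ""]
  paste(cells, collapse = ", ")
}

# apply() hands each row to row_to_line() as a character vector
lines <- apply(df, 1, row_to_line)

out_file <- file.path(tempdir(), "df.txt")  # assumed output location
writeLines(lines, out_file)
```

Because each line is assembled separately, no trailing separators are written for the shorter rows. writeLines also has a sep argument in case the package needs a specific line terminator.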
Related
I am trying to import multiple CSV files in a for loop. By iteratively fixing the errors the code produced, I arrived at the code below.
for (E in EDCODES) {
  Filename <- paste("$. Data/2. Liabilities/", E, sep = "")
  Framename <- gsub("\\..*", "", E)
  assign(Framename,
         read.csv(Filename,
                  header = TRUE,
                  sep = ",",
                  stringsAsFactors = FALSE,
                  na.strings = c("\"ND", "ND,5", "5\""),
                  colClasses = c("BAA35" = "double"),
                  encoding = "UTF-8",
                  quote = ""))
}
First I realized that the code does not always recognize the most important column "BAA35" as numeric, so I added the colClasses argument. Then I realized that the data has multiple versions of "NA", so I added the na.strings argument. The most common NA value is "ND, 5", which contains the separator ",". So if I add the na.strings argument as defined above I get a lot of EOF within quoted string warnings. The others are also versions of "ND, [NUMBER]" or "ND, 4, [YYYY-MM]".
If I then try to fix that with the most common recommendation I could find, adding quote = "", I just end up with a "more columns than column names" error.
The data has 78 columns, so I don't believe posting it here will display in a usable way.
Can somebody recommend a solution for how I can reliably import this column as a numeric value and have R recognize the NAs in the data correctly?
I think the issue might be that the na.strings contain commas and in some cases the ND,5 is read as one column with ND and one with a 5 and in other cases it's seen as the na.string. Any way to tell R to not split "ND,5" into two columns?
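A minimal sketch of one way to approach this, assuming the "ND,5" values are enclosed in double quotes in the files: keep the default quote character so the quoted value stays in one field, read BAA35 as text, and convert it afterwards. The inline data and the ID column are hypothetical stand-ins for the real 78-column files.

```r
# Hypothetical two-column extract; the real files have 78 columns
csv_text <- paste('ID,BAA35',
                  '1,12.5',
                  '2,"ND,5"',
                  '3,"ND,4,2019-03"',
                  '4,7',
                  sep = "\n")

# Keep the default quote character so a quoted "ND,5" stays in one field,
# and read BAA35 as text first so no value is rejected during parsing
dat <- read.csv(text = csv_text,
                header = TRUE,
                stringsAsFactors = FALSE,
                colClasses = c(BAA35 = "character"))

# Treat every value that starts with "ND" as missing, then convert
dat$BAA35[grepl("^ND", dat$BAA35)] <- NA
dat$BAA35 <- as.numeric(dat$BAA35)
```

Setting quote = "" switches quoting off, which is why the comma inside "ND,5" is then treated as a column separator. If the quotes in the files are unbalanced, this sketch will not be enough and the raw lines would need cleaning first.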
I have a problem with one task where I have to load some data set, and I have to make sure that missing values are read in properly and that column names are unambiguous.
The format of .txt file:
At the end, data set should contain only country column and median age.
I tried using read.delim, precisely this chunk:
rawdata <- read.delim("rawdata_343.txt", sep = "", stringsAsFactors = FALSE, header = TRUE)
And when I run it, I get this:
It confuses me that if country has multiple words (Turks and Caicos Islands) it assigns every word to another column.
Since I am still a beginner in R, any suggestion would be very helpful for me. Thanks!
Three points to note about your input file: (1) the first two lines at the top are not tabular and should be skipped with skip = 2, (2) your column separators are tabs and this should be specified with sep = "\t", and (3) you have no headers, so header = FALSE. Your command should be:
rawdata <- read.delim("rawdata_343.txt", sep = "\t", stringsAsFactors = FALSE, header = FALSE, skip = 2)
UPDATE: A fourth point is that the first column includes row numbers, so row.names = 1. This also addresses the follow-up comment.
rawdata <- read.delim("rawdata_343.txt", sep = "\t", stringsAsFactors = FALSE, header = FALSE, skip = 2, row.names = 1)
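To illustrate why the tab separator keeps multi-word country names together, here is a small sketch on inline data. The file layout (two non-tabular lines, then a row number, a country and a median age per line), the values and the column names are all assumptions made up for the example, since the original file is not shown.

```r
# Hypothetical file content: two non-tabular lines, then tab-separated rows
txt <- paste("Median age by country",
             "(hypothetical header line)",
             "1\tTurks and Caicos Islands\t34.6",
             "2\tFrance\tNA",
             sep = "\n")

# sep = "\t" splits on tabs only, so spaces inside a name are preserved
rawdata <- read.delim(text = txt, sep = "\t", header = FALSE, skip = 2,
                      stringsAsFactors = FALSE,
                      col.names = c("row_id", "country", "median_age"))

# Keep only the two columns the task asks for
result <- rawdata[, c("country", "median_age")]
```

Supplying col.names gives unambiguous column names, and the literal NA in the data is read as a missing value by default.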
It looks like the sep = "" argument is telling R to treat any whitespace as the column delimiter, which is why multi-word country names are split. Looking at your data as a .txt file, there is no apparent delimiter (like the commas you would find in a typical .csv). If you can put the data in a tabular form in something like a .csv or .xlsx file, R is much better at reading that data as expected. As it is, you may struggle to get the .txt format to read in a tabular fashion, which is what I assume you want.
P.s. you can use read.csv() if you do end up putting the data in that format.
I am trying to write a summary to Excel via a text file. Here is an example of how it looks in the R output.
[1] "\n\nDuring the post-intervention period, the response variable had an average value of approx.196.08K. In the absence of an intervention, we would have expected an average response of 199.41K.
However, when I write this to a text file using write.table, this is how the text file looks:
*During the post-intervention period, the response variable had an average value of approx.196.08K.
In the absence of an intervention, we would have expected an average response of 199.41K.*
Here is the code I am using for this.
write.table(analysis_text, file = "analysis_text.txt", sep = "",
row.names = TRUE, col.names = NA)
Is there any way I can ignore "\n" while writing to a .txt file in R?
If you want to ignore "\n", maybe you can do some pre-processing on your analysis_text, i.e., gsub("\n","",analysis_text), which removes "\n" from your text.
In this sense, your code should be like
write.table(gsub("\n","",analysis_text),
file = "analysis_text.txt",
sep = "",
row.names = TRUE,
col.names = NA)
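Looking at the write.table help, the function can also take an eol argument, which defaults to "\n".
Note that eol only controls the terminator written after each row, so eol="" drops the trailing line break but does not remove "\n" characters embedded in the string itself.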
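As a sketch of the pre-processing idea, the embedded line breaks can be replaced with a single space (rather than nothing, so that sentences do not run together) and the result written with writeLines. The string below is a shortened stand-in for the real summary, and the output path is an assumption.

```r
# Shortened stand-in for the summary text, with embedded line breaks
analysis_text <- paste0("\n\nDuring the post-intervention period, the ",
                        "response variable had an average value of 196.08K.",
                        "\nIn the absence of an intervention, we would have ",
                        "expected an average response of 199.41K.")

# Replace each run of line breaks (plus surrounding whitespace) with one
# space, then trim the space left over at the start
clean_text <- trimws(gsub("\\s*\n+\\s*", " ", analysis_text))

out_file <- file.path(tempdir(), "analysis_text.txt")  # assumed location
writeLines(clean_text, out_file)
```

writeLines writes the string as it is, without the row names and quotes that write.table adds by default.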
One of the columns in my dataframe contains semicolon(;) and when I try to download the dataframe to a csv using fwrite, it is splitting that value into different columns.
Example: the input abcd;#6 becomes abcd in the first column and #6 in the second column after downloading.
I want both to be in the same column.
Could you please suggest how to write the value within a single column?
I am using below code to read the input file:
InpData <- read.table(File01, header=TRUE, sep="~", stringsAsFactors = FALSE,
fill=TRUE, quote="", dec=",", skipNul=TRUE, comment.char="")
while for writing:
fwrite(InpData, File01, col.names=T, row.names=F, quote = F, sep="~")
You didn't give us an example, but it is possible you need to use a different separator than ";"
fwrite(x, file = "", sep = ",")
sep: The separator between columns. Default is ",".
If this simple solution does not work, we need the data to reproduce your problem.
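One way to narrow the problem down is to check what fwrite actually puts in the file. The sketch below uses hypothetical data and an assumed output path. If the raw file keeps the value in one field, the split is probably happening in the program used to open the file (some spreadsheet setups treat ";" as a delimiter), but that is a guess without the real data.

```r
library(data.table)

# Hypothetical data: one value contains a semicolon
InpData <- data.table(id = 1:2, value = c("abcd;#6", "efgh"))

out_file <- file.path(tempdir(), "out.txt")  # assumed output path
fwrite(InpData, out_file, sep = "~", quote = FALSE)

# Inspect the raw lines: the semicolon sits inside one "~"-separated field
raw_lines <- readLines(out_file)

# Reading back with the same separator returns the value unsplit
check <- fread(out_file, sep = "~", header = TRUE)
```

If the file has to be opened in a tool that splits on ";", writing with quote = TRUE may help, since quoted fields are normally kept together.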
This is a very simple issue and I'm surprised that there are no examples online.
I have a vector:
vector <- c(1,1,1,1,1)
I would like to write this as a csv as a simple row:
write.csv(vector, file ="myfile.csv", row.names=FALSE)
When I open up the file I've just written, the csv is written as a column of values.
It's as if R decided to put in newlines after each number 1.
Forgive me for being ignorant, but I always assumed that the point of having comma-separated-values was to express a sequence from left to right, of values, separated by commas. Sort of like I just did; in a sense mimicking the syntax of written word. Why does R cling so desperately to the column format when a csv so clearly should be a row?
All linguistic philosophy aside, I have tried to use the transpose function. I've dug through the documentation. Please help! Thanks.
write.csv is designed for matrices, and R treats a single vector as a matrix with a single column. Try making it into a matrix with one row and multiple columns and it should work as you expect.
write.csv(matrix(vector, nrow=1), file ="myfile.csv", row.names=FALSE)
Not sure what you tried with the transpose function, but that should work too.
write.csv(t(vector), file ="myfile.csv", row.names=FALSE)
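Note that write.csv always writes a header line, so both calls above also produce a row of column names. If only the values are wanted, a minimal base-R sketch is to paste the vector into one comma-separated string and write that line directly. The output path is an assumption.

```r
vector <- c(1, 1, 1, 1, 1)

out_file <- file.path(tempdir(), "myfile.csv")  # assumed output path

# Collapse the vector into one comma-separated line and write it
writeLines(paste(vector, collapse = ","), out_file)

readLines(out_file)
```

This writes the values unquoted, so it suits numbers or simple strings that contain no commas or quotes.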
Here's what I did:
cat("myVar <- c(",file="myVars.r.txt", append=TRUE);
cat( myVar, file="myVars.r.txt", append=TRUE, sep=", ");
cat(")\n", file="myVars.r.txt", append=TRUE);
this generates a text file that can immediately be re-loaded into R another day using:
source("myVars.r.txt")
Following up on what @Matt said, if you want a csv, try eol=",".
I tried with this:
write.csv(rbind(vector), file ="myfile.csv", row.names=FALSE)
The output is written as a row (one value per column), but with column names.
This one seems to be better:
write.table(rbind(vector), file = "myfile.csv", row.names =FALSE, col.names = FALSE,sep = ",")
Now, the output is being printed as:
1 1 1 1 1
in the .csv file, without column names.
write.table(vector, "myfile.csv", eol=" ", row.names=FALSE, col.names=FALSE)
You can simply change the eol to whatever you want. Here I've made it a space.
You can use cat to append rows to a file. The following code would write a vector as a line to the file:
myVector <- c("a","b","c")
cat(myVector, file="myfile.csv", append = TRUE, sep = ",", eol = "\n")
This produces a comma-separated file, but with a trailing comma on each line: cat has no eol argument, so "\n" is printed as one more value after a separator. Hence it is not a valid CSV file.
If you want a real CSV file, use the solution given by @vamosrafa. The code is as follows:
write.table(rbind(myVector), file = "myfile.csv", row.names =FALSE, col.names = FALSE,sep = ",", append = TRUE)
The output will be like this:
"a","b","c"
If the function is called multiple times, it will add lines to the file.
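The trailing comma in the cat approach can also be avoided by collapsing the vector into one string first and passing the line break as a separate value with sep = "". This is a sketch, and the helper name append_row and the output path are hypothetical.

```r
myVector <- c("a", "b", "c")

out_file <- file.path(tempdir(), "rows.csv")        # assumed output path
if (file.exists(out_file)) file.remove(out_file)    # start clean for the demo

# Append one comma-separated line per call, without a trailing comma
append_row <- function(values, file) {
  cat(paste(values, collapse = ","), "\n", file = file, sep = "", append = TRUE)
}

append_row(myVector, out_file)
append_row(myVector, out_file)
```

Unlike write.table, this does not quote the values, so it is only safe when the values themselves contain no commas or quotes.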
One more:
write.table(as.list(vector), file ="myfile.csv", row.names=FALSE, col.names=FALSE, sep=",")
fwrite from the data.table package is another option:
library(data.table)
vector <- c(1,1,1,1,1)
fwrite(data.frame(t(vector)),file="myfile.csv",sep=",",row.names = FALSE)