I have the following code to read a file and save it as a CSV file. I skip the first 7 lines of the text file and then drop the third column as well, since I only need the first two columns.
current_file <- paste("Experiment 1 ",i,".cor",sep="")
curfile <- list.files(pattern = current_file)
curfile_data <- read.table(curfile, header=F,skip=7,sep=",")
curfile_data <- curfile_data[-grep('V3',colnames(curfile_data))]
write.csv(curfile_data,curfile)
new_file <- paste("Dev_C",i,".csv",sep="")
new_file
file.copy(curfile, new_file)
The file named in curfile thus holds two columns, V1 and V2, along with the observation-number column at the beginning.
Now, when I use file.copy to copy the contents of curfile into a .csv file and then open the new .csv file in Excel, all the data is concatenated and appears in a single column. Is there a way to show each of the individual columns separately? Thanks in advance for your suggestions.
The data in the .txt file looks like this,
"","V1","V2","V3"
"1",-0.02868862,5.442283e-11,76.3
"2",-0.03359281,7.669754e-12,76.35
"3",-0.03801883,-1.497323e-10,76.4
"4",-0.04320051,-6.557672e-11,76.45
"5",-0.04801207,-2.557059e-10,76.5
"6",-0.05325544,-9.986231e-11,76.55
You need to use the Text to Columns feature in Excel, selecting comma as the delimiter. Where this item sits in the menu depends on the version of Excel you are using.
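If you would rather handle this on the R side, here is a minimal sketch. It assumes (I can't confirm this from the question) that your Excel is set to a locale where the list separator is a semicolon, which is one common reason a comma-separated file lands in a single column. The data frame below is a made-up stand-in for what you read from the .cor file, and the output paths are hypothetical. It also writes the result straight to the new file name instead of overwriting the original and copying it.

```r
# Hypothetical stand-in for the data read from the .cor file
curfile_data <- data.frame(
  V1 = c(-0.02868862, -0.03359281),
  V2 = c(5.442283e-11, 7.669754e-12),
  V3 = c(76.3, 76.35)
)

# Keep only the first two columns
curfile_data <- curfile_data[, c("V1", "V2")]

# Write directly to the new file; row.names = FALSE drops the observation-number column
comma_file <- file.path(tempdir(), "Dev_C1.csv")
write.csv(curfile_data, comma_file, row.names = FALSE)

# Variant for locales where Excel expects ";" as separator and "," as decimal mark
semicolon_file <- file.path(tempdir(), "Dev_C1_semicolon.csv")
write.csv2(curfile_data, semicolon_file, row.names = FALSE)
```

If the semicolon version opens with separate columns and the comma version does not, the locale setting is the likely culprit.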
Related
I have a series of massive data files that range in size from 800k to 1.4M rows, and one variable in particular has a set length of 12 characters (numeric data, but with leading zeros where the number of non-zero digits is fewer than 12). The column should look like this:
col
000000000003
000000000102
000000246691
000000000042
102851000324
etc.
I need to export these files for a client to a CSV file, using R. The final data NEEDS to retain the 12-character structure, but when I open the CSV files in Excel, the zeros disappear. This happens even after converting the entire data frame to character. The code I am using to do this is as follows.
df1 %>%
mutate(across(everything(), as.character))
##### I did this for all data frames #####
export(df1, "df1.csv")
export(df2, "df2.csv")
....
export(df17, "df17.csv")
I've read a few other posts that say this is an Excel problem, and that makes sense, but given the number of data files and the amount of data, as well as the need for the client to be able to open it in Excel, I need a way to do it on the front end in R. Any ideas?
Yes, this is definitely an Excel problem!
To demonstrate: in Excel, enter your column values, save the file as CSV, and then re-open it in Excel. The leading zeros will disappear.
One option is to add a leading non-numeric character such as '
paste0("\' ", df$col)
Not great, but an option.
A slightly better option is to wrap the character string in Excel's TEXT function. Excel will then evaluate the function when the file is opened.
df$col <- paste0("=Text(", df$col, ", \"000000000000\")")
#or
df$col <- paste0("=\"", df$col, "\"")
write.csv(df, "df2.csv", row.names = FALSE)
Of course, if the CSV file is saved from Excel and reopened, the leading zeros will disappear again.
Another option is to investigate saving the file directly as an .xlsx file with the "writexl", "openxlsx", "xlsx" or a similar package.
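To make the formula-wrapping option concrete, here is a minimal base R sketch. The data frame, the column name col, and the output path are made up for illustration, and it assumes the values start out numeric (if yours are already 12-character strings, skip the sprintf step).

```r
# Hypothetical data: numeric IDs that should be shown as 12 characters
df <- data.frame(col = c(3, 102, 246691, 42, 102851000324))

# Pad with leading zeros to a fixed width of 12
df$col <- sprintf("%012.0f", df$col)

# Wrap each value as ="..." so Excel treats it as a text formula on opening
out <- df
out$col <- paste0("=\"", out$col, "\"")

# write.csv doubles the embedded quotes, which is what Excel expects in a CSV
path <- file.path(tempdir(), "df_padded.csv")
write.csv(out, path, row.names = FALSE)

# Inspect the raw text that Excel will see
csv_lines <- readLines(path)
print(csv_lines)
```

Note that anything other than Excel reading this file will see the literal ="..." wrapper, so it is only worth doing for files that are meant for Excel users.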
I have an Excel sheet with data like this (but with N = 1,000 rows):
p_evar7_CO.main.
p_evar7_CP.acquistion..sign_up.start
p_evar7_CP.main.
p_evar7_CP.main.facial_stylers00
I want to put it in a vector, but simple copy/pasting goes wrong. I want this as the result:
Excel <- c("p_evar7_CO.main.", "p_evar7_CP.acquistion..sign_up.start", "p_evar7_CP.main.", "p_evar7_CP.main.facial_stylers00")
So basically: how can I paste a big column of data into R and have each row automatically quoted and separated by commas?
EDIT: I don't want to load an Excel data file; I only want to paste the column names and have them as a vector.
Looks like you could do a simple scan().
scan(file, what = "")
where file is your file name as a character string. If you are working with copied text, then you can enter "clipboard" as the file name.
scan("clipboard", what = "")
For example, I copied the file text from your question for the following code.
scan("clipboard", what="")
# Read 4 items
# [1] "p_evar7_CO.main." "p_evar7_CP.acquistion..sign_up.start"
# [3] "p_evar7_CP.main." "p_evar7_CP.main.facial_stylers00"
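If you also want the literal c("...", "...") text to paste into a script, a small sketch: scan() accepts a text argument, so the example below is reproducible without a clipboard, and dput() prints the vector as R code. The variable name pasted is just a placeholder for whatever you copied; sep = "\n" is my addition so that rows containing spaces stay in one piece.

```r
# Placeholder for the copied Excel column (one value per line)
pasted <- "p_evar7_CO.main.
p_evar7_CP.acquistion..sign_up.start
p_evar7_CP.main.
p_evar7_CP.main.facial_stylers00"

# Read one item per line into a character vector
x <- scan(text = pasted, what = "", sep = "\n", quiet = TRUE)

# Print the vector as R source code, ready to paste into a script
dput(x)
```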
Apologies if this is a trivial question. I saw others like it such as: How can I turn a part of the filename into a variable when reading multiple text files into R? , but I still seem to be having some trouble...
I have been given 50000 .txt files. Each file contains a single observation (a single row of data) with exactly 12 variables (number of columns). The name of each .txt file is fairly regular. Specifically, each .txt file has a code at the end indicating the type of observation across three dimensions. An example of this code is 'VL-VL-NE' or 'VL-M-N' or 'H-H-L' (not including the apostrophes). Therefore, an example of a file name could be 'I-love-using-R-20_01_2016-VL-VL-NE.txt'.
My problem is that I want to include this code at the end of the .txt file in the actual vector itself when I import into R, i.e., I want to add three more variables (columns) at the end of the table corresponding to the three parts of code at the end of the file name.
Any help would be greatly appreciated.
Because you have exactly the same number of columns in each file, why don't you import them into R using a loop that looks for all .txt files in a particular directory?
df <- NULL
# pattern is a regular expression, so match a literal ".txt" at the end of the name
for (x in list.files(pattern = "\\.txt$")) {
  # header = FALSE because each file holds a single data row;
  # add skip = n if your files start with n lines you want to ignore
  u <- read.csv(x, header = FALSE)
  u$Label <- x  # a column that holds the file name
  df <- rbind(df, u)
}
You'll note that the file name itself becomes a column. Once everything is in R, it should be fairly easy to use a regex function to extract the exact elements you need from the file name column (df$Label).
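For the extraction step, here is a minimal sketch. It assumes the code is always the last three hyphen-separated pieces before ".txt", as in the example file name from the question. The temporary directory, the second file name, the three-column contents, and the column names code1 to code3 are all made up for illustration. It also reads the files with lapply() and binds once at the end, which tends to be much faster than rbind() inside a loop when there are tens of thousands of files.

```r
# Create two tiny example files in a temporary directory (hypothetical contents)
tmp <- file.path(tempdir(), "obs_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines("1,2,3", file.path(tmp, "I-love-using-R-20_01_2016-VL-VL-NE.txt"))
writeLines("4,5,6", file.path(tmp, "another-file-21_01_2016-H-H-L.txt"))

# Read every .txt file and remember which file each row came from
files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
rows <- lapply(files, function(f) {
  u <- read.csv(f, header = FALSE)
  u$Label <- basename(f)
  u
})
df <- do.call(rbind, rows)

# Drop the extension, split on "-", and keep the last three pieces
stem  <- sub("\\.txt$", "", df$Label)
parts <- strsplit(stem, "-", fixed = TRUE)
codes <- t(vapply(parts, function(p) utils::tail(p, 3), character(3)))

# Add the three parts of the code as new columns
df$code1 <- codes[, 1]
df$code2 <- codes[, 2]
df$code3 <- codes[, 3]
print(df)
```

Splitting on "-" only works because the date in the example uses underscores; if other file names contain extra hyphens inside the code itself, a stricter regex would be needed.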
I'm new to R, and I wonder how to read a csv file and assign values from it to variables. For example, I have a csv file and I want to assign the file name and file path it contains to R variables. I know how to read a csv into an R variable with
mydata <- read.csv("testing.csv")
But how do I assign the value of Filename, which is 'globaldata.txt', and of Filepath, which is 'E:\Test\Global', to R variables?
variable value
Filename globaldata.txt
Filepath E:\Test\Global
It's safest to use read.table and define the class of each variable in the arguments; see the help file ?read.table. For a comma-separated file with a header row, also set header and sep:
mydata <- read.table("testing.csv", header = TRUE, sep = ",", colClasses = c("character", "character"))
The return value mydata will be a data frame, and you can simply extract what you want using the $ sign
e.g.
value1 <- mydata$column1
etc.
You can do the following :
Filename <- "globaldata.csv" # if this is a csv and not a .txt file
Filepath <- "E:/Test/Global/" # on Windows, use "/" (or "\\") instead of a single "\"
which then allows you to do (if this is what you want)
mydata <- read.csv(paste0(Filepath, Filename))
EDIT
If I understand correctly you have a csv file named testing.csv with two columns: one with Filenames and one with Filepaths.
In that case, after mydata <- read.csv("testing.csv") you have a data frame with two columns. To access the first one you use mydata[, 1], and for the second (Filepath) mydata[, 2]. If you want the Filename of the third entry in the file, you use mydata[3, 1] (before the comma is the row, after it is the column).
I hope this is what you are looking for; otherwise I'm afraid I misunderstood you again. Having a look at the csv file would help to better understand the question.
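If the file is instead laid out as in the question, with one setting per row in a variable column and a value column, a lookup by name may be what you are after. A minimal sketch, assuming the file is comma-separated with that header; the temporary file below just recreates the example so the code runs on its own, and settings and full_path are names I made up.

```r
# Recreate the example file from the question (assumed comma-separated)
path <- file.path(tempdir(), "testing.csv")
writeLines(c("variable,value",
             "Filename,globaldata.txt",
             "Filepath,E:\\Test\\Global"), path)

# Read everything as character so nothing gets converted
mydata <- read.csv(path, colClasses = "character")

# Turn the two columns into a named vector: names are settings, elements are values
settings <- stats::setNames(mydata$value, mydata$variable)

Filename <- settings[["Filename"]]
# Replace backslashes with forward slashes so the path is easy to reuse in R
Filepath <- gsub("\\", "/", settings[["Filepath"]], fixed = TRUE)

# Combine into one path
full_path <- file.path(Filepath, Filename)
print(full_path)
```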
I have a folder in my working directory containing 15 delimited text files (each a 1686*2 matrix). I want to create a list of the files and then have R access them so I can import each one. I tried using the dir() function, but the result only holds the names of the files as character strings and can't access their contents. Please help me with this. Thanks
dir() just gives you a vector with the files. You need to use read.table() and loop through the directory e.g. as follows:
# this is the subdirectory where I put the test files
setwd("./csv")
# this gets a vector containing all the file names
myfiles <- dir()
# this loops over the files
for (i in seq_along(myfiles)) {
  # this reads the data (my test file has only 4 columns, no header and is csv (use "\t" for tab))
  fileData <- read.table(file = myfiles[i], header = FALSE, sep = ",", col.names = c("A", "B", "C", "D"))
  # if the target table doesn't exist create it, else append
  if (exists("targetTable")) {
    targetTable <- rbind(targetTable, fileData)
  } else {
    targetTable <- fileData
  }
}
head(targetTable)
Hope that helps!
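If you would rather keep each file as its own element instead of one combined table, a minimal sketch with lapply(). The temporary folder and the two small two-column files are made up so the example runs on its own; with full.names = TRUE there is no need for setwd().

```r
# Create a demo folder with two small comma-delimited files (hypothetical data)
tmp <- file.path(tempdir(), "csv_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(c("1,2", "3,4"), file.path(tmp, "file_a.txt"))
writeLines(c("5,6", "7,8"), file.path(tmp, "file_b.txt"))

# Full paths to every file in the folder
myfiles <- list.files(tmp, full.names = TRUE)

# Read each file into one element of a list, named after the file
file_list <- lapply(myfiles, read.table, header = FALSE, sep = ",")
names(file_list) <- basename(myfiles)

# Access a single file by name
print(file_list[["file_a.txt"]])

# Or stack everything into one table
targetTable <- do.call(rbind, file_list)
```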