I have a block of text that looks like this: "gatcctccatatacaacggtatctccacctcaggtttagatctca" and it goes on like that for 5000 characters.
I want to import it into R as a character vector. Should I put it in an .rtf file and import that file? And what code do I use to import it?
Your problem isn't really about reading the data into R; it's about splitting the string into individual characters.
Read it in using either of the answers posted:
v <- readLines("your_file.txt")
or
v <- "gatcc...."
Then split it up with strsplit():
v <- strsplit(v, "")[[1]]
If you need to copy the text anyway, simply copy it directly into R:
v <- "gatcctccatatacaacggtatctccacctcaggtttagatctca"
I would save it as a text file and then load it with the readLines() function:
character_data <- "your_file.txt"
v <- readLines(character_data)
This is a little more complicated than copying and pasting, but it has the advantage of being reproducible: another person can run the code, and it is easy to change the string later on.
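If the string isn't in a file yet, a minimal sketch for creating it once (the file name is just an example):
writeLines("gatcctccatatacaacggtatctccacctcaggtttagatctca",
           "your_file.txt")      # write the sequence to disk once
v <- readLines("your_file.txt")  # later runs read it from the file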
As I'm dealing with a huge dataset, I had to split my data into different buckets, and I want to save some interim results in a CSV to recall them later. However, my data file contains some columns with lists, which according to R cannot be exported (see snapshot). Do you guys know a simple way for an R newbie to make this work?
Thank you so much!
I guess the best way to solve your problem is switching to a more appropriate file format. I recommend using write_rds() from the readr package, which creates .rds files. A file created with readr::write_rds(your_data, 'your_file_path') can be read back with readr::read_rds('your_file_path').
The base R equivalents are saveRDS() and readRDS(); the readr functions mentioned above are just wrappers around them with some convenience features.
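A minimal sketch of the round trip with the base functions (the file name and columns are made up):
df <- data.frame(id = 1:2)
df$values <- list(c(1, 2, 3), c(4, 5))  # a list column that write.csv() rejects
saveRDS(df, "interim.rds")              # the list column survives serialization
df2 <- readRDS("interim.rds")           # restore it in a later session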
Alternatively, right-click in the folder where you want to save your work and create a new CSV there, setting the separator of the CSV to a comma.
Input all data in column form; you can later make it a matrix in your R program.
I have more than 400 image files in my local directory. I want to read these images into R to pass them through the XGBoost algorithm. My two attempts are given below:
library("EBImage")
img <- readImage("/home/vishnu/Documents/XG_boost_R/Data_folder/*.jpg")
and
library(jpeg)
library(biOps)
myjpg <- readJpeg("/home/vishnu/Documents/XG_boost_R/Data_folder/*.jpg")
It is a bit hard to guess what you want to do exactly, but one way to load a lot of files and process them is a for-loop like this:
files <- list.files()          # create a vector of file names
for (i in seq_along(files)) {  # loop over the file names
  load(files[i])               # load an .rda file
  # do some processing and save the results
}
This structure is generalizable to other cases. Depending on what kind of files you want to load, you will have to replace load(files[i]) with the appropriate command, for instance load.image() from the imager package.
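For the JPEGs in the question, a minimal sketch using the jpeg package (the directory path is copied from the question; readJPEG() does not accept wildcards, so list the files first):
library(jpeg)
files <- list.files("/home/vishnu/Documents/XG_boost_R/Data_folder",
                    pattern = "\\.jpg$", full.names = TRUE)
imgs <- lapply(files, readJPEG)  # each element is a numeric array (height x width x channels)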
I have the following link:
https://drive.google.com/open?id=0ByCmoyvCype7ODBMQjFTSlNtTzQ
It points to a PDF file. The author of a paper provided their list of mutations in this format.
I need to annotate the mutations in this file, so I need a txt, TSV, or VCF file that annovar can read.
Can you help me convert it using R or other software on Ubuntu?
In principle this is a job for tabulizer, but I couldn't get it to work in this instance; I suspect the single table spread over so many pages confused it.
You can read it into R as text with the pdftools package easily enough:
library(pdftools)
txt <- pdf_text("selection.pdf")
Now txt is a character vector, with one element per page of the original document. You might be able to do something fancy with regular expressions to convert this to more meaningful data.
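A minimal sketch of that kind of post-processing (the whitespace-splitting heuristic is an assumption about the table layout; adjust it to the real columns):
lines <- unlist(strsplit(txt, "\n"))  # one element per printed line, across all pages
lines <- trimws(lines)
lines <- lines[nzchar(lines)]         # drop blank lines
rows  <- strsplit(lines, "\\s{2,}")   # split columns on runs of two or more spaces
writeLines(vapply(rows, paste, "", collapse = "\t"), "mutations.tsv")  # crude TSV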
However, it makes more sense to ask the original author for their data in an appropriate format. Publishing a 561-page PDF of tabular data is just nuts.
There are a large number of data.frames (more than 50). How can I quickly save them all as .csv files?
Calling write.csv() 50 times would be 50 lines of code, which is awful...
Help me, guys!
If I understand correctly, the many data.frames are all available in your R session.
First create a vector with the names of the data.frames; use ls() or something similar. Then use get() to fetch the R object behind each name (the data.frames in this case):
myfiles <- ls()  # names of all objects in the session
Then
for (d in myfiles) {
  current <- get(d)                             # fetch the object by its name
  write.csv(current, file = paste0(d, ".csv"))  # write it to <name>.csv
}
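Note that ls() returns every object in the session, not just the data.frames; a hedged sketch for filtering first:
objs <- mget(ls())                   # fetch all objects by name
dfs  <- Filter(is.data.frame, objs)  # keep only the data.frames
for (d in names(dfs)) {
  write.csv(dfs[[d]], file = paste0(d, ".csv"))
}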
I have a very big .csv file, around a few GB.
I want to read the first few thousand lines of it.
Is there an efficient way to do this?
Use the nrows argument in read.csv(...)
df <- read.csv(file = "my.large.file.csv", nrows = 2000)
There is also a skip argument that tells read.csv(...) how many lines to skip before it starts reading.
If your file is that large, you might be better off using fread(...) from the data.table package. It takes the same arguments.
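A minimal sketch of the fread() version (same hypothetical file name as above):
library(data.table)
dt <- fread("my.large.file.csv", nrows = 2000)  # much faster than read.csv() on big files
# skip also works: fread("my.large.file.csv", nrows = 2000, skip = 1000)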
If you're on Unix or OS X, you can use the command line:
head -n 1000 myfile.csv > myfile.head.csv
Then just read that smaller file into R as normal.