Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 days ago.
This is a broad, general question, but I find reading .sav files in R to be a nightmare. I use the haven package, but I often run into errors caused by the way it represents .sav data (one example of many: it won't let me coerce dbl+lbl columns into a numeric data frame).
What I typically do to get around this is save the .sav file as a .csv and then read that into R, but surely there's a better way?
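One possible way to avoid the CSV round trip, sketched under assumptions: haven provides zap_labels() and as_factor() for labelled columns. The toy data frame below stands in for the result of read_sav(), and the column names are made up for illustration.

```r
library(haven)

# Toy stand-in for something like read_sav("survey.sav");
# the file name and the column names are hypothetical.
survey <- data.frame(id = 1:3)
survey$sex <- labelled(c(1, 2, 1), labels = c(Male = 1, Female = 2))

# zap_labels() drops the value labels and keeps the underlying numbers.
survey_num <- as.data.frame(zap_labels(survey))

# as_factor() keeps the labels instead, turning the column into a factor.
survey_fct <- as_factor(survey$sex)
```

After zap_labels() the columns are ordinary numeric vectors, so the usual numeric coercions should behave as expected.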
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
Unfortunately I have to deal with a data file with .int file format. This has the effect of littering any search results with unrelated information about integers.
I can't figure out how to open this file in R. I have an example with the Julia language, shown below:
filename = "mnist_train.int"
open(filename) do f
...
end
But when I search for a similar function in R, I find only results about opening Excel files, results for other languages, or results that deal with integers. Could someone please point me to some resources for dealing with this file type?
I'm not sure what the content type is, but my guess is that you are trying to open a binary file format.
You can have a look at ?readBin
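A minimal sketch of readBin, assuming the file holds 4-byte little-endian integers. The real layout of mnist_train.int is not known here, so the element size, byte order, and count are all assumptions, and the example writes its own temporary file rather than touching yours.

```r
# Write five integers to a temporary binary file so the example is self-contained.
path <- tempfile(fileext = ".int")
writeBin(1:5, path, size = 4L, endian = "little")

# Open the file in binary mode and read the integers back.
# what, n, size and endian must match how the file was actually written.
con <- file(path, "rb")
values <- readBin(con, what = "integer", n = 5L, size = 4L, endian = "little")
close(con)
```

For a real file, file.size(path) divided by the element size gives an upper bound for n when there is no header.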
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have the following link:
[1] https://drive.google.com/open?id=0ByCmoyvCype7ODBMQjFTSlNtTzQ
This is a PDF file. The author of a paper provided the list of mutations in this format.
I need to annotate the mutations in this file.
I need a TXT, TSV, or VCF file that can be read by ANNOVAR.
Can you help me convert it using R or other software on Ubuntu?
In principle this is a job for tabulizer but I couldn't get it to work in this instance; I suspect the single table over so many pages confused it.
You can read it in to R as text with the pdftools package easily enough
library(pdftools)
txt <- pdf_text("selection.pdf")
Now txt is a character vector, with one element per page of the original document, each holding that page's text as a single string. You might be able to do something fancy with regular expressions to convert this to more meaningful data.
However, it makes more sense to ask the original author for their data in an appropriate format. Publishing a 561 page PDF of tabular data is just nuts.
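If you do go the regular-expression route, a rough sketch might look like the following. It assumes the columns on each page are separated by two or more spaces. The sample page text, the column names, and the output file are invented for illustration and are not taken from the actual PDF.

```r
# Hypothetical page text, standing in for one element of the pdf_text() result.
page <- "chr1  12345  A  G\nchr2  67890  C  T\n"

# Split the page into lines and drop empty ones.
lines <- strsplit(page, "\n", fixed = TRUE)[[1]]
lines <- lines[nzchar(trimws(lines))]

# Split each line into fields on runs of two or more whitespace characters.
fields <- strsplit(trimws(lines), "[[:space:]]{2,}")
tab <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(tab) <- c("chrom", "pos", "ref", "alt")  # assumed column names

# Write a tab-separated file.
out_path <- tempfile(fileext = ".tsv")
write.table(tab, out_path, sep = "\t", quote = FALSE, row.names = FALSE)
```

Real pages will probably need extra handling for repeated headers, page numbers, and cells that contain spaces.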
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I'm currently using the R package data.table to process big datasets.
I'm wondering if there is a difference between the syntax
DT[,v]
and the syntax:
DT$v
if DT is my data.table object and v the variable I want to select.
I know that the dollar sign is usually used for data frames and that [,v] is always used in data.table examples. However, they both work and, in my experience with 5 million rows, seem to take similar times to execute.
Do you know whether they are processed differently, and whether one is more efficient on even larger datasets?
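As far as I understand it, DT$v is plain list-style extraction, while DT[, v] is dispatched through data.table's own `[` method. Any gap should therefore be per-call overhead rather than something that grows with the number of rows. A rough way to check this on your own machine, using a toy table with made-up columns:

```r
library(data.table)

# Toy table; the size and column names are arbitrary.
n <- 1e5
DT <- data.table(v = rnorm(n), g = sample(letters, n, replace = TRUE))

# Three ways to pull out one column as a vector.
a <- DT[, v]
b <- DT$v
d <- DT[["v"]]
same <- identical(a, b) && identical(b, d)

# Repeat each call many times so the per-call overhead becomes visible.
t_bracket <- system.time(for (i in 1:200) DT[, v])
t_dollar  <- system.time(for (i in 1:200) DT$v)
```

Comparing t_bracket and t_dollar gives a rough picture. A dedicated benchmarking package would give more reliable numbers.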
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have an R dataset (an .Rdata file) that I need to convert to either SAS (.sas7bdat or .xpt) or SPSS (.sav or .por). How can I import this dataset into SAS or SPSS?
If you want to use this in SPSS, consider using the STATS_GETR extension command. It can read R workspace or data files and map appropriate elements directly to an SPSS dataset. This extension command is available from the SPSS Community (www.ibm.com/developerworks/spssdevcentral) website or, for Statistics 22, it can be installed via the Utilities menu.
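A different route from the extension command above is to do the conversion on the R side with the haven package's write_sav() and write_xpt(). This is only a sketch: the object name my_data and the temporary paths are placeholders, and the example creates its own .Rdata file so that it is self-contained.

```r
library(haven)

# Stand-in for an existing .Rdata file; "my_data" is a made-up object name.
my_data <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
rdata_path <- tempfile(fileext = ".Rdata")
save(my_data, file = rdata_path)

# load() restores the saved objects into an environment and returns their names.
env <- new.env()
obj_names <- load(rdata_path, envir = env)
dat <- get(obj_names[1], envir = env)

# Write an SPSS .sav file and a SAS transport (.xpt) file.
sav_path <- tempfile(fileext = ".sav")
xpt_path <- tempfile(fileext = ".xpt")
write_sav(dat, sav_path)
write_xpt(dat, xpt_path)
```

An .Rdata file can hold several objects, so check obj_names and pick the data frame you actually want to export.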
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a very big .csv file, a few GB in size.
I want to read the first few thousand lines of it.
Is there any method to do this efficiently?
Use the nrows argument in read.csv(...)
df <- read.csv(file="my.large.file.csv",nrows=2000)
There is also a skip= parameter that tells read.csv(...) how many lines to skip before you start reading.
If your file is that large you might be better off using fread(...) in the data.table package. Same arguments.
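To illustrate how nrows and skip combine, here is a small self-contained sketch that builds its own temporary CSV (the file and column names are invented). Note that skip counts raw lines from the top of the file, header included, so later chunks need header = FALSE and explicit column names.

```r
# Build a small CSV to stand in for the large file.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10000, x = seq(0.5, by = 0.5, length.out = 10000)),
          csv_path, row.names = FALSE)

# First 2000 data rows (the header line is read separately and not counted).
first_rows <- read.csv(csv_path, nrows = 2000)

# Next 2000 rows: skip the header line plus the 2000 rows already read.
next_rows <- read.csv(csv_path, skip = 2001, nrows = 2000,
                      header = FALSE, col.names = names(first_rows))
```

The same pattern can be repeated with a growing skip value to walk through the file in chunks.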
If you're on UNIX or OS X, you can use the command line:
head -n 1000 myfile.csv > myfile.head.csv
Then just read it in R like normal.