Writing a CSV file that is too large in R

I saved some data as a CSV file on my computer. It has 581 rows, but when I open the saved file on my Mac, the data frame has been altered, and the Numbers app I am viewing the CSV in says some data was deleted. Is there a way to fix this? Or is there a different file type I can save my data as that would accommodate the number of rows?
This is how I am writing the CSV. I'm trying to manually add the file to a GitHub repo after it has been saved to my computer.
write.csv(coords, 'Top_50_Distances.csv', row.names = FALSE)
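One way to check whether the problem is the file or the spreadsheet app is to read the CSV back into R and compare row counts; spreadsheet previews can silently truncate. A minimal sketch with synthetic stand-in data for coords (an R-native alternative such as saveRDS preserves the data frame exactly, though GitHub will not render it as text):

```r
# Synthetic stand-in for the asker's 'coords' data frame
coords <- data.frame(x = runif(581), y = runif(581))
out <- tempfile(fileext = ".csv")
write.csv(coords, out, row.names = FALSE)
# Read the file back in R to confirm all rows survived the write
check <- read.csv(out)
# saveRDS() stores the object exactly as-is (binary, not text)
saveRDS(coords, tempfile(fileext = ".rds"))
```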

Related

How to read a file stored on Linux into R?

I have a large file stored on Linux. I don't want to transfer the file onto my laptop and then read it into R. I was hoping I could read the large file into R without storing it on my laptop (as my storage is nearly full). The file I want to read into RStudio is located at my university file path: /data/genes/h3/PROs_GWAS/output_PROs.bgen
The file is not a txt file but a genotype file, i.e. the extension is .bgen.
I have tried the command below:
d = read.table(pipe('ssh hkj7@spectre2.le.ac.uk "ls /data/genes/h3/PROs_GWAS/output_PROs.bgen"'), header = T)
However, this prompts me for a password and then throws an error, which I assume is because read.table is treating the file as a text file.
Error in read.table(pipe("ssh hkj7@spectre2.le.ac.uk \"ls /data/genes/h3/PROs_GWAS/output_PROs.bgen\""), :
no lines available in input
I am not sure how to get round this.
Any help will be greatly appreciated!
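Two things are worth separating here. First, ls prints only the file name, not its contents, so read.table gets a single line and nothing to parse; cat is needed to stream the file itself. Second, .bgen is a binary genotype format, so read.table cannot parse it even when streamed correctly; a dedicated bgen tool is needed for that file. The pipe mechanism itself can be sketched locally with a temp file; over SSH the shell command would become "ssh hkj7@spectre2.le.ac.uk 'cat /remote/path/file.txt'" (for a text file):

```r
# read.table() can consume a shell pipeline via pipe().
# Demonstrated locally with 'cat' on a temp file; over SSH, replace
# the command with: ssh user@host 'cat /remote/path/file.txt'
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:3, y = 4:6), tmp, row.names = FALSE)
d <- read.table(pipe(paste("cat", shQuote(tmp))), header = TRUE)
```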

Read dataset.train and dataset.test in R

I am doing a project on a high-dimensional data set. The data set is from http://archive.ics.uci.edu/ and can also be found on GitHub (https://github.com/minghust/MaliciousExeDetect/tree/master/TrainData).
The files are called "dataset.train" and "tst.test". I want to read them into R.
My question is whether there are file formats called .train and .test. They are not csv or txt files. How can I open and import them in R?
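Extensions like .train and .test carry no special meaning to R; if the contents are delimited text, read.csv() or read.table() can parse them. A sketch with a synthetic comma-separated file (inspect the real files first, e.g. with readLines("dataset.train", n = 3), to confirm the delimiter and whether there is a header row):

```r
# A ".train" file is read like any other delimited text file
tmp <- tempfile(fileext = ".train")
writeLines(c("f1,f2,label", "0.1,0.2,1", "0.3,0.4,0"), tmp)
train <- read.csv(tmp, header = TRUE)
```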

Exporting Chinese characters from Excel to R

I have an Excel file with a column of simplified Chinese characters. When I open the corresponding CSV file in R, I only get ?'s.
I'm afraid the problem occurs when exporting from Excel to CSV, because when I open the CSV file in a text editor I also get ?'s.
How can I get around this?
The safest way to preserve your Chinese/Unicode characters is to read the file from .xlsx directly:
library(readxl)
read_xlsx("yourfilepath.xlsx", col_types = "text")
If your file is too big to read from .xlsx, the best workaround is to open Excel and split it manually into multiple files.
(My experience on a laptop with 8 GB RAM is to split files into 250,000 rows x 106 columns.)
If you need to read from .csv, all your Windows locale settings need to match the file's encoding, and even then the integrity of all your Unicode characters (e.g. emojis) is not guaranteed.
(If you also need a .csv for something else, you can use write.csv after reading the data from .xlsx into R.)
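If the CSV is exported from Excel as "CSV UTF-8" (an option in recent Excel versions), the encoding can also be declared explicitly when reading and writing in R. A sketch with a synthetic UTF-8 file:

```r
# Write a small UTF-8 CSV and read it back with the encoding declared
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, encoding = "UTF-8")
writeLines(c("name", "\u4e2d\u6587"), con)  # "\u4e2d\u6587" = Chinese characters
close(con)
df <- read.csv(tmp, fileEncoding = "UTF-8")
# Declare UTF-8 explicitly on the way out as well
write.csv(df, tempfile(fileext = ".csv"), row.names = FALSE, fileEncoding = "UTF-8")
```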

Loading an existing .RData file into an R program

I have a .RData file and want to do some operations on the data frame it contains. Can I load this file in my R program and convert it into a data frame? The only option I currently know is to convert the .RData file to a csv and read that csv back into a data frame. I am looking for a neater solution. I got this file from a friend of mine and cannot reproduce the data frame from scratch.
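No CSV round-trip is needed: load() restores the objects saved in a .RData file into the current environment under their original names, and returns those names, so the data frame can be fetched even without knowing what it was called. A sketch with a synthetic file (the object name example_df is an assumption for illustration):

```r
# Save a data frame to a .RData file, then restore it with load()
tmp <- tempfile(fileext = ".RData")
example_df <- data.frame(a = 1:2, b = c("x", "y"))
save(example_df, file = tmp)
rm(example_df)
restored <- load(tmp)    # restores 'example_df'; returns its name(s)
df <- get(restored[1])   # fetch the object without hard-coding its name
```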

Taking manipulated data out of the R console and creating a csv

I have used R to remove duplicates from a csv file using the following (lda_data is my csv file name)
unique(lda_data[duplicated(lda_data),])
This works great; however, I need to get the results from the console into another csv file.
What are the methods for getting manipulated data out of R and into a new csv file?
Thanks in advance.
Use the write.csv command:
write.csv(dataframe, "/path/filename.csv", row.names = FALSE)
That should do the trick for you.
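As a side note, unique(lda_data[duplicated(lda_data),]) extracts one copy of each duplicated row rather than removing duplicates from the data. If the goal is a deduplicated data set written out to a new csv, a sketch with synthetic stand-in data:

```r
# Synthetic stand-in for the asker's lda_data (rows 1 and 2 are duplicates)
lda_data <- data.frame(id = c(1, 1, 2), v = c("a", "a", "b"))
deduped <- lda_data[!duplicated(lda_data), ]  # keep first occurrence of each row
write.csv(deduped, tempfile(fileext = ".csv"), row.names = FALSE)
```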
