Reading files into R - r

I have problems reading a file of performance data into R. Are there any example files so I know how to name the rows/columns? The data I have is: fund name (207), year, month and performance. I have saved the file as CSV, but R doesn't seem to understand the format. Thanks in advance! /Johanna

Use the following syntax:
setwd("D:/Your Directory")
# Load CSV data
fund <- read.csv(
  file = "YourFile.csv",
  quote = "\"")
# Peek at the data
head(fund)
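As for how to lay out the file, one common arrangement is a "long" table with a header row and one row per fund and month. The sketch below writes such a file and reads it back. The column names (fund, year, month, performance) and all values are made up for illustration and are not taken from the original data.

```r
# Hypothetical example data: one row per fund and month (values are made up)
example <- data.frame(
  fund        = c("Fund A", "Fund A", "Fund B", "Fund B"),
  year        = c(2016, 2016, 2016, 2016),
  month       = c(1, 2, 1, 2),
  performance = c(0.012, -0.004, 0.021, 0.007)
)

# Comma-separated file with a header row, read back with read.csv()
csv_path <- file.path(tempdir(), "example_funds.csv")
write.csv(example, csv_path, row.names = FALSE)
fund <- read.csv(csv_path, stringsAsFactors = FALSE)

# Semicolon-separated file with decimal commas, read back with read.csv2()
csv2_path <- file.path(tempdir(), "example_funds_semicolon.csv")
write.csv2(example, csv2_path, row.names = FALSE)
fund2 <- read.csv2(csv2_path, stringsAsFactors = FALSE)

str(fund)
```

If the CSV was exported from a spreadsheet under a locale that uses semicolons as separators and commas as decimal marks, read.csv2() may be the variant that parses it correctly.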

Related

Read .tar.gz file in R

I have a compressed file named cat.txt.tar.gz. I just need to load it into R and process it as follows:
zip <-("cat.txt.tar.gz")
data <- read.delim(file=(untar(zip,"cat.txt")),sep="\t")
but "data" is empty after running the code. Is there any way to read a file from a .tar.gz archive?
Are you sure your file is named correctly?
Usually compressed files are named cat.tar.gz, excluding the .txt.
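Second, untar() extracts files to disk and returns a status code, not the file's contents, so its result cannot be passed directly to read.delim(). Extract the archive first, and then read the extracted file:
tarfile <- "cat.txt.tar.gz" # Or "cat.tar.gz" if that is right
untar(tarfile, compressed = "gzip") # extracts into the working directory
data <- read.delim("cat.txt", sep = "\t") # assumes the archive contains cat.txt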
To read a particular csv or txt file inside a gz archive without having to extract it first, you can use the archive package:
library(archive)
library(readr)
read_tsv(archive_read("cat.txt.tar.gz", file = 1), col_types = cols())
This should work; read_tsv() is used because the file is tab-delimited.
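The extract-then-read approach can be checked end to end with base R alone. The sketch below builds a small archive from made-up data in a scratch directory, lists its members to find the actual file name inside, extracts it, and reads it. The directory names and data are assumptions for the demo only.

```r
# Work in a scratch directory so nothing in the current directory is touched
work <- tempfile("tar_demo_")
dir.create(work)
old_wd <- setwd(work)

# Build a small tab-delimited file (made-up data) and pack it as .tar.gz
write.table(data.frame(id = 1:3, name = c("a", "b", "c")),
            "cat.txt", sep = "\t", row.names = FALSE, quote = FALSE)
tar("cat.txt.tar.gz", files = "cat.txt", compression = "gzip", tar = "internal")

# List the archive members to find the real file name inside
members <- untar("cat.txt.tar.gz", list = TRUE)

# Extract into a separate directory, then read the extracted file
out_dir <- file.path(work, "extracted")
untar("cat.txt.tar.gz", files = members[1], exdir = out_dir)
data <- read.delim(file.path(out_dir, members[1]), sep = "\t")

setwd(old_wd)
```

Listing the members first helps when the name inside the archive differs from what the archive's own name suggests.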

Convert Bam File to CSV

I have a BAM file. Does anyone know how to convert a BAM file to a CSV file? I am trying to use R to open the BAM file, but I am not sure how to get the variables out of it. So far I have used the code below:
rm(list=ls())
#install bam packages
source("http://bioconductor.org/biocLite.R")
biocLite("Rsamtools",suppressUpdates=TRUE)
biocLite("RNAseqData.HNRNPC.bam.chr14",suppressUpdates=TRUE)
biocLite("GenomicAlignments",suppressUpdates=TRUE)
#load library
library(Rsamtools)
library(RNAseqData.HNRNPC.bam.chr14)
library(GenomicAlignments)
bamfile <- file.path("C:","Users","azzop","Desktop","I16-1144-01-esd_m1_CGCTCATT-AGGCGAAG_tophat2","accepted_hits.bam")
gal<-readGAlignments(bamfile)
gal
length(gal)
names(gal)
When I ran names(gal) it returned NULL; I am not sure whether that is correct.
I would like to convert the BAM file to CSV so that the data is easier to read.
I would suggest converting the BAM file to BED and then reading the BED file into R.
You can convert BAM to BED using bedtools.
This sketch should work, assuming bedtools is installed and on the PATH:
bamfile <- "C:/Users/azzop/Desktop/I16-1144-01-esd_m1_CGCTCATT-AGGCGAAG_tophat2/accepted_hits.bam"
# Run bedtools to convert BAM to BED and write its output to myBed.bed (might take some time)
system2("bedtools", c("bamtobed", "-i", shQuote(bamfile)), stdout = "myBed.bed")
library(data.table)
myData <- fread("myBed.bed")
Here I'm using the fread function from the data.table package for a fast data read.
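To finish the conversion to CSV, the BED table can be given column names and written out with fwrite(). The sketch below uses two made-up lines in the six-column layout that bedtools bamtobed writes by default (chromosome, start, end, read name, score, strand). The file paths and values are assumptions for the demo.

```r
library(data.table)

# Two made-up lines in the six-column layout that bamtobed writes by default
bed_path <- file.path(tempdir(), "myBed.bed")
writeLines(c("chr14\t100\t172\tread1\t50\t+",
             "chr14\t250\t322\tread2\t50\t-"), bed_path)

# BED files have no header row, so supply the column names explicitly
myData <- fread(bed_path, header = FALSE,
                col.names = c("chrom", "start", "end", "name", "score", "strand"))

# Write the table as a comma-separated file with a header row
csv_path <- file.path(tempdir(), "myBed.csv")
fwrite(myData, csv_path)
```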

Convert XLS to CSV - R (Tried Rio Package)

I have a list of files in a directory that I'm trying to convert to CSV. I tried the rio package and the solutions suggested here.
The output is a list of empty CSV files with no content. It could be because the first 8 rows of the xls files contain an image and a few empty lines, with a couple of cells filled with text.
Is there any way I could skip those first 8 lines in all of the xls files before converting?
I tried exploring options from the openxlsx and readxl packages; any suggestions or guidance will be helpful.
Please do not mark this as a duplicate, since my problem is different from the one that was already answered.
Maybe the following will work. At least it does for my own mock-up of an Excel file with a picture at the top:
library("readxl") # To read xls/xlsx
library("readr") # Fast csv write
indata <- read_excel("~/cowexcel.xlsx", skip = 8)
write_csv(indata, "cow.csv")
If you are running this for several files, wrap it in a function. Note that the function below does no checking and might overwrite existing csv files:
convert_excel_to_csv <- function(name) {
  indata <- read_excel(name, skip = 8)
  write_csv(indata, paste0(tools::file_path_sans_ext(name), ".csv"))
}
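A variant that refuses to overwrite an existing csv file, applied to every Excel file in the working directory, might look like the sketch below. The function name convert_excel_to_csv_safe and its skip and overwrite parameters are hypothetical additions, not part of the answer above.

```r
library(readxl)
library(readr)

# Hypothetical helper: like the function above, but skips files whose
# target csv already exists unless overwrite = TRUE
convert_excel_to_csv_safe <- function(name, skip = 8, overwrite = FALSE) {
  out <- paste0(tools::file_path_sans_ext(name), ".csv")
  if (file.exists(out) && !overwrite) {
    warning("Skipping ", name, ": ", out, " already exists")
    return(invisible(NA_character_))
  }
  write_csv(read_excel(name, skip = skip), out)
  invisible(out)
}

# Apply to every .xls/.xlsx file in the working directory
excel_files <- list.files(pattern = "\\.xlsx?$")
results <- lapply(excel_files, convert_excel_to_csv_safe)
```

The pattern is a regular expression, so the dot is escaped and the name is anchored at the end to avoid matching unrelated files.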
Although I was not able to do the conversion with rio, I read each file as xls and wrote it back as CSV using the code below. It worked fine in testing; I hope it works without a glitch in production.
library(xlsx) # provides read.xlsx() with sheetIndex and startRow
files <- list.files(pattern = "\\.xls$")
y <- NULL
for (i in files) {
  x <- read.xlsx(i, sheetIndex = 1, header = TRUE, startRow = 9)
  y <- rbind(y, x)
}
dt <- Sys.Date()
fn <- paste("path/", dt, ".csv", sep = "")
write.csv(y, fn, row.names = FALSE)

Save a lot of Excel files as rda using R

I have 1000 Excel files named "1.xlsx", "2.xlsx", ..., "1000.xlsx". How can I write a loop to save them as "1.rda", "2.rda", ..., "1000.rda" without repeating this code 1000 times?
j1 <- read.xlsx("1.xlsx",1)
save(j1, file = "j1.rda")
Thanks a lot
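Does this work?
library(tidyverse) # for walk2() from purrr
library(xlsx) # for read.xlsx(); openxlsx offers a function of the same name
xlsx_to_rda <- function(inputname, outputname) {
  data <- read.xlsx(inputname, 1)
  # save() takes the name of an object, not an expression
  save(data, file = outputname)
}
walk2(paste0(1:1000, ".xlsx"),
      paste0(1:1000, ".rda"),
      xlsx_to_rda)
By the way, rds would be a better file format, because it stores just one R object.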
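The rds route can be sketched with base R only. Here two made-up data frames stand in for the sheets that would be read from the Excel files; the output directory is an assumption for the demo.

```r
# Made-up data frames standing in for the sheets read from 1.xlsx, 2.xlsx, ...
sheets <- list(data.frame(x = 1:3), data.frame(x = 4:6))

out_dir <- file.path(tempdir(), "rds_demo")
dir.create(out_dir, showWarnings = FALSE)

# Write each object to its own .rds file: 1.rds, 2.rds, ...
rds_files <- file.path(out_dir, paste0(seq_along(sheets), ".rds"))
for (i in seq_along(sheets)) {
  saveRDS(sheets[[i]], rds_files[i])
}

# readRDS() returns the object, so the caller chooses the variable name
first <- readRDS(rds_files[1])
```

Unlike load(), which restores objects under the names they were saved with, readRDS() hands back a value that can be assigned to any name.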

How to read a zip file in R and iterate through each .txt file to convert it into .csv?

I have a zip file containing .txt documents. I want to unzip the file in R and convert the text documents into .csv so that I can use them for further analysis.
Can I set the header names while converting?
Further, I also want to iterate the process by writing a function that reads each converted .csv file and generates basic graphs from the data. Is it feasible to do this in R?
For instance, let's consider a zip file named 'data.zip' containing 5 text files (1.txt, 2.txt, 3.txt, 4.txt, 5.txt). Each text file has log information on a single row with IP, date and time:
111.999.88.80 - - [27/Mar/2017:00:03:16 -0600] "HEAD / HTTP/1.1"
Your answers will be of great help.
Thanks in advance!
I created a reproducible sample, and I think this may solve your problem.
You can download the sample zip file I created from here.
The full code is below.
## Clean memory
rm(list=ls())
## Set path for your working location
setwd("D:/blah")
## Unzip the file
unzip("D:/blah/text.zip")
## Check the extracted files
list.files()
## List the .txt files
temp <- list.files(pattern = "\\.txt$")
There are two options here. I think the one you want is the second, which reads the files in the sample and merges them into one data frame.
## Read the files as a list
myfiles <- lapply(temp, read.delim)
## Read the files all together
myfiles <- do.call("rbind", lapply(temp, function(x) read.table(x, stringsAsFactors = FALSE, header = TRUE)))
Make sure to adjust the header setting if needed.
Alrighty, good luck.
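On the question of header names and writing .csv files, one possible sketch is to parse each log line with a regular expression and write the result with named columns. The function names parse_log_lines and txt_to_csv and the column names (ip, date, time, tz, request) are hypothetical, and the pattern assumes every line follows the sample line in the question.

```r
# Column names below (ip, date, time, tz, request) are chosen for illustration
parse_log_lines <- function(lines) {
  pattern <- "^([^ ]+) [^ ]+ [^ ]+ \\[([^:]+):([0-9:]+) ([+-][0-9]{4})\\] \"([^\"]*)\""
  parsed <- regmatches(lines, regexec(pattern, lines))
  ok <- lengths(parsed) == 6  # full match plus five captured groups
  fields <- matrix(as.character(unlist(parsed[ok])), ncol = 6, byrow = TRUE)
  data.frame(ip = fields[, 2], date = fields[, 3], time = fields[, 4],
             tz = fields[, 5], request = fields[, 6], stringsAsFactors = FALSE)
}

# Convert one .txt log file to a .csv file with the same base name
txt_to_csv <- function(txt_file) {
  csv_file <- paste0(tools::file_path_sans_ext(txt_file), ".csv")
  write.csv(parse_log_lines(readLines(txt_file)), csv_file, row.names = FALSE)
  csv_file
}

# Demo on a scratch file holding the sample line from the question
demo_txt <- file.path(tempdir(), "1.txt")
writeLines('111.999.88.80 - - [27/Mar/2017:00:03:16 -0600] "HEAD / HTTP/1.1"', demo_txt)
demo_csv <- txt_to_csv(demo_txt)
log_data <- read.csv(demo_csv, colClasses = "character")
```

After unzipping, the same function could be applied to every text file with lapply(list.files(pattern = "\\.txt$"), txt_to_csv). Lines that do not match the pattern are dropped silently in this sketch.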
