Time series - plot.ts() and multiple graphs in R

I've seen several threads on the error I'm getting:
cannot plot more than 10 series as "multiple"
But none really explain (1) what's going on and (2) how to get around it if you have multiple graphs.
I have 12 different files.
Each file is one row of ~240-250 data points of time-series data. The value range changes from file to file.
I want to make a graph that puts them all on one plot, something like par(mfrow=c(4,3)).
However, when I run my code, it gives me the above error.
for (cand in cands) {
  par(mfrow = c(4, 3))
  for (type in types) {
    ## Construct the file name
    curFile = paste(folder, cand, base, type, close, sep = "")
    ## Read in the file
    ts = read.delim(curFile, sep = "\t", stringsAsFactors = FALSE,
                    header = FALSE, row.names = NULL, fill = TRUE,
                    quote = "", comment.char = "")
    plot.ts(ts)
  }
}

First, don't call your time series object "ts". It's like calling your dog "Dog": ts() is a base R function, and reusing the name invites confusion.
Have a look at the structure of what you get from reading the file. From your description, the file is a single row with 240+ columns. If so, that's the problem: read.delim() expects a column-oriented data file, not a row-oriented one, so you end up with a data frame of 240+ columns, and plot.ts() treats each column as a separate series. That is exactly what triggers the "cannot plot more than 10 series" error. You'll need to transpose it if this is the case. Something like:
my.ts = t(
  read.delim(curFile, sep = "\t", stringsAsFactors = FALSE,
             header = FALSE, row.names = NULL,
             fill = TRUE, quote = "", comment.char = "")
)
my.ts = ts(my.ts)
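Putting it together, a rough sketch of the fixed loop (the variable names folder, cand, base, type, close, cands, and types are carried over from the question and assumed to be defined):

for (cand in cands) {
  par(mfrow = c(4, 3))  # 12 panels on one device, as intended
  for (type in types) {
    curFile <- paste(folder, cand, base, type, close, sep = "")
    raw <- read.delim(curFile, sep = "\t", stringsAsFactors = FALSE,
                      header = FALSE, row.names = NULL,
                      fill = TRUE, quote = "", comment.char = "")
    my.ts <- ts(t(raw))  # transpose: the single row becomes a single column
    plot.ts(my.ts)       # one series per panel, so no "more than 10" error
  }
}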

Related

Loop in R to read sequentially numbered filenames and output accordingly numbered files

I'm sure this is very simple, but I'm new to doing my own programming in R and haven't quite gotten the hang of the syntax for looping.
I have code like this:
mydata1 <- read.table("ph001.txt", header=TRUE)
# ... series of formatting and merging steps
write.table(mydata4, "ph001_anno.txt", row.names=FALSE, quote=FALSE, sep="\t")
png("manhattan_ph001.png"); manhattan(mydata4); dev.off()
png("qq_ph001.png"); qq(mydata4$P); dev.off()
The input file ph001.txt is output from a linear regression algorithm, and from that file I need to output ph001_anno.txt, manhattan_ph001.png, and qq_ph001.png. The latter two use the qqman package.
I have a folder that contains ph001 through ph138, and I would like a loop that reads these files individually and creates the corresponding output files for each one. As I said, I'm sure there is an easy way to do this with a loop; the part that's tripping me up is modifying the output filenames.
You can use the stringr package to do the string manipulation needed to generate your file names, like so:
library(stringr)

f <- function(i) {
  num <- str_pad(i, 3, pad = "0")
  a <- str_c("ph", num, "_anno.txt")
  m <- str_c("manhattan_ph", num, ".png")
  q <- str_c("qq_ph", num, ".png")
  # Put code to do stuff with these file names here
}
sapply(1:138, f)
In the above block of code, for each number in 1:138 you create the names of three files. You can then use those file names in calls to read.table or ggsave or whatever you want.
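For instance, a hedged sketch of a complete worker function built on those names, borrowing the read/write/plot calls from the question (process_file is a made-up name, and the formatting/merging steps in the middle are whatever you already have; manhattan() and qq() come from qqman):

library(stringr)
library(qqman)

process_file <- function(i) {
  num <- str_pad(i, 3, pad = "0")
  mydata <- read.table(str_c("ph", num, ".txt"), header = TRUE)
  # ... your series of formatting and merging steps here ...
  write.table(mydata, str_c("ph", num, "_anno.txt"),
              row.names = FALSE, quote = FALSE, sep = "\t")
  png(str_c("manhattan_ph", num, ".png")); manhattan(mydata); dev.off()
  png(str_c("qq_ph", num, ".png")); qq(mydata$P); dev.off()
}
sapply(1:138, process_file)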

How to read every .csv file in R and export them into a single large file

Hi, so I have data in the following format
101,20130826T155649
------------------------------------------------------------------------
3,1,round-0,10552,180,yellow
12002,1,round-1,19502,150,yellow
22452,1,round-2,28957,130,yellow,30457,160,brake,31457,170,red
38657,1,round-3,46662,160,yellow,47912,185,red
and I have been reading them and cleaning/formatting them with this code
b <- read.table("sid-101-20130826T155649.csv", sep = ',', fill = TRUE,
                col.names = paste("V", 1:18, sep = ""))
b$id <- b[1, 1]
b <- b[-1, ]
b <- b[-1, ]
b$yellow <- b$V6
and so on
There are about 300 files like this, and ideally they would all be compiled without the first two lines, since the first line is just an id and I made a separate column to identify these data. Does anyone know how to read these tables quickly, clean and format them the way I want, then compile them into one large file and export it?
You can use lapply to read all the files, clean and format them, and store the resulting data frames in a list. Then use do.call to combine all of the data frames into a single large data frame.
# Get vector of file names to read
files.to.load = list.files(pattern = "csv$")
# Read the files
df.list = lapply(files.to.load, function(file) {
  df = read.table(file, sep = ',', fill = TRUE,
                  col.names = paste("V", 1:18, sep = ""))
  # ... cleaning and formatting code goes here ...
  df$file.name = file  # In case you need to know which file each row came from
  return(df)
})
# Combine into a single data frame
df.combined = do.call(rbind, df.list)
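For example, a rough sketch with the cleaning steps from the question dropped into the anonymous function (this assumes every file follows the same layout as the sample above):

files.to.load <- list.files(pattern = "csv$")
df.list <- lapply(files.to.load, function(file) {
  b <- read.table(file, sep = ',', fill = TRUE,
                  col.names = paste("V", 1:18, sep = ""))
  b$id <- b[1, 1]    # first line holds the id
  b <- b[-(1:2), ]   # drop the id line and the separator line
  b$yellow <- b$V6
  b$file.name <- file
  b
})
df.combined <- do.call(rbind, df.list)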

lapply in R to one column of a csv file

I have a folder with several hundred csv files. I want to use lapply to calculate the mean of one column within each csv file and save that value into a new csv file that would have two columns: column 1 would be the name of the original file, and column 2 would be the mean value for the chosen field from the original file. Here's what I have so far:
setwd("C:/~~~~")
list.files()
filenames <- list.files()
read_csv <- lapply(filenames, read.csv, header = TRUE)
dataset <- lapply(filenames[1], mean)
write.csv(dataset, file = "Expected_Value.csv")
Which gives the error message:
Warning message: In mean.default("2pt.csv"[[1L]], ...) : argument is not numeric or logical: returning NA
So I think I have 2 (at least) problems that I cannot figure out.
First, why doesn't R recognize that column 1 is numeric? I double- and triple-checked the csv files, and I'm sure this column is numeric.
Second, how do I get the output file to return two columns the way I described above? I haven't gotten far with the second part yet.
I wanted to get the first part working first. Any help is appreciated.
I didn't use lapply, but have done something similar. (Note that in your code, lapply(filenames[1], mean) calls mean() on the filename string itself rather than on the data you read in, which is why you get the "argument is not numeric or logical" warning.) Hope this helps!
## List the directory from which all files are to be read
directory <- "C:/mydir/"
## Read all csv file names from the directory
x <- list.files(directory, pattern = 'csv')
xpath <- paste(directory, x, sep = "")
## Create an empty data frame to collect the results
df <- NULL
## For loop to read each file and save the metric and file name
for (i in seq_along(xpath)) {  # use e.g. 1:2 here to test on a subset first
  file <- read.csv(xpath[i], header = TRUE, sep = ",")
  first_col <- file[, 1]
  d <- data.frame(mean = mean(first_col), filename = x[i])
  df <- rbind(df, d)
}
## Write all output to csv
write.csv(df, file = "C:/mydir/final.csv", row.names = FALSE)
The resulting CSV file looks like this:
mean         filename
1999.000661  hist_03082015.csv
1999.035121  hist_03092015.csv
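Since the question specifically asked about lapply, a minimal sketch of the same computation in that style (paths as above):

x <- list.files("C:/mydir/", pattern = "csv", full.names = TRUE)
rows <- lapply(x, function(f) {
  dat <- read.csv(f, header = TRUE)
  data.frame(mean = mean(dat[, 1]), filename = basename(f))
})
df <- do.call(rbind, rows)
write.csv(df, file = "C:/mydir/final.csv", row.names = FALSE)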
Thanks for the two answers. After much review, it turns out that there was a much easier way to accomplish my goal. The csv files that I had were originally in one file; I had split them into multiple files by location. At the time, I thought this was necessary to calculate the mean for each type. Clearly, that was a mistake. I went back to the original file and used aggregate. Code:
setwd("C:/~~")
allshots <- read.csv("All_Shots.csv", header=TRUE)
EV <- aggregate(allshots$points, list(Location = allshots$Loc), mean)
write.csv(EV, file= "EV_location.csv")
This was a simple solution. Thanks again for the answers. I'll need to get better at lapply for future projects, so they were not a waste of time.

How to not overwrite file in R

I am trying to copy and paste tables from R into Excel. Consider the following code from a previous question:
data <- list.files(path = getwd())
n <- length(data)  # number of files
for (i in 1:n) {
  data1 <- read.csv(data[i])
  outline <- data1[, 2]
  outline <- as.data.frame(table(outline))
  print(outline)  # this prints all n tables
  name <- paste0(i, "X.csv")
  write.csv(outline, name)
}
This code writes each table into a separate file ("1X.csv", "2X.csv", etc.). Is there any way of "shifting" each table down some rows in a single file instead of overwriting the previous table each time? I have also tried this code:
output <- as.data.frame(output)
wb <- loadWorkbook("X.xlsx", create = TRUE)
createSheet(wb, name = "output")
writeWorksheet(wb, output, sheet = "output", startRow = 1, startCol = 1)
writeNamedRegion(wb, output, name = "output")
saveWorkbook(wb)
But this does not copy the data frames exactly into Excel.
I think, as mentioned in the comments, the way to go is to first merge the data frames in R and then write them into one output file:
# Get vector of file names
filenames <- list.files(path = getwd())
# For each file name: load the file and create its outline
outlines <- lapply(filenames, function(filename) {
  data <- read.csv(filename)
  outline <- data[, 2]
  outline <- as.data.frame(table(outline))
  outline
})
# Merge all outlines into one data frame (by appending them row-wise)
outlines.merged <- do.call(rbind, outlines)
# Save the merged data frame
write.csv(outlines.merged, "all.csv")
Despite what Microsoft would like you to believe, .csv files are not Excel files; they are a common file type that can be read by Excel and many other programs.
The best approach depends on what you really want to do. Do you want all the tables in a single worksheet in Excel? If so, you could write to a single file using the append argument of write.table (note that write.csv ignores append), or use a connection that you keep open so each new table is appended. You may want to use cat to put a couple of newlines before each new table, as sketched below.
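For example, a minimal sketch of the single-file approach with an open connection, using cat for separator lines (the file name is made up, and the table-building step mirrors the code above):

con <- file("all_tables.csv", open = "w")  # one combined output file
for (f in list.files(pattern = "csv$")) {
  tab <- as.data.frame(table(read.csv(f)[, 2]))
  cat("\n# table from", f, "\n", file = con)  # blank line + label between tables
  write.table(tab, con, sep = ",", row.names = FALSE)
}
close(con)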
Your second attempt looks like it uses the XLConnect package (you don't say, so it could be something else). I would think this is the best approach; how is the result different from what you were expecting?
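If XLConnect is indeed what you're using, here is an untested sketch of the "shift each table down" idea, writing each table a couple of rows below the previous one (it assumes the outlines list built in the answer above):

library(XLConnect)

wb <- loadWorkbook("X.xlsx", create = TRUE)
createSheet(wb, name = "output")
row <- 1
for (outline in outlines) {
  writeWorksheet(wb, outline, sheet = "output", startRow = row, startCol = 1)
  row <- row + nrow(outline) + 2  # header + data rows, plus one blank row
}
saveWorkbook(wb)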

data.frame object to xts object conversion in R

I'd like to convert my csv files into xts objects as efficiently as possible. I seem to be stuck, though, with having to first apply the read.zoo method to create a zoo object before being able to convert it to an xts object.
gold <- read.zoo("GOLD.CSV", sep = ",", format = "%m/%d/%Y", header = TRUE)
Gold <- as.xts(gold, order.by = index(gold), frequency = NULL)
Is this the most efficient way of converting my initial GOLD.CSV file into an R xts object?
If it is a file, you need to read it.
So use read.zoo() as you do, but then convert right away:
gold <- as.xts(read.zoo("GOLD.CSV", sep=",", format="%m/%d/%Y", header=TRUE))
Ok?
You can write your own read.xts function. We would call it a wrapper function, and it would go something along these lines:
library(xts)  # provides as.xts() and loads zoo, which provides read.zoo()

read.xts <- function(x, format = "%m/%d/%Y", header = TRUE, sep = ",") {
  result <- as.xts(read.zoo(x, sep = sep, format = format, header = header))
  return(result)
}

read.xts(file.choose())  # select your file
Notice the arguments in function(). They are passed to the body of the function (the code between the curly braces). If the function() arguments have values, those values are their defaults. If you assign new values (e.g. function(x = "my.file.csv", sep = "\t")), they will overwrite the defaults. The last line shows how you can use your new function. Feel free to extend this function with the rest of the read.zoo arguments, for example as sketched below. Should you have any specific question on how to do it, don't be shy and just ask. :)
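One compact way to support the rest of the read.zoo arguments without naming each one is the ... ("dots") mechanism; a small sketch:

library(xts)

## Any extra arguments (sep, format, header, and so on) are passed
## straight through to read.zoo.
read.xts <- function(x, ...) {
  as.xts(read.zoo(x, ...))
}

gold <- read.xts("GOLD.CSV", sep = ",", format = "%m/%d/%Y", header = TRUE)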
I use a few little gems like that in my daily work. I've created a file called workhorse.R, and I load it (e.g. source("d:/workspace/workhorse.R")) whenever I need any of the little functions.
