I'm new to R, and new to this forum. I've searched but cannot easily find an answer to this question:
I have numbers of cases of a disease by week according to location, stored in a .csv file with variable names cases.wk24, cases.wk25, etc. I also have population for each location, and want to generate incidence rates (# cases/population) for each of the locations.
I would like to write a loop that generates incidence rates by location for each week, and stores these in new variables called "ir.wk24", "ir.wk25", etc.
I am stuck at 2 points:
1) Is it possible to tell R to run a loop over every variable whose name looks like "cases.wk"? In some programs, one would use a star: cases.wk*
2) How could I then generate the new variables with sequential naming and store these in the dataset?
I really appreciate any help on this - been stuck with internet searches all day!
thanks
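# Example data: case counts for two weeks plus population for two locations
# (columns are named case.wk* here; the question uses cases.wk*)
x <- data.frame(case.wk24=c(1,3),case.wk25=c(3,2), pop=c(7,8))
weeks <- 24:25
# Names of the case columns to process
varnames <- paste("case.wk", weeks, sep="")
# Incidence rate for each week: cases divided by population
ir <- sapply(varnames,FUN=function(.varname){
x[,.varname]/x[,"pop"]
})
ir <- as.data.frame(ir)
# Name the new columns ir.wk24, ir.wk25, ... and attach them to the data
names(ir) <- paste("ir.wk", weeks, sep="")
x <- cbind(x,ir)
x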
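On the wildcard part of the question: R has no cases.wk* syntax, but a regular expression on the column names does the same job. The following is a minimal sketch, assuming the columns are named cases.wk24, cases.wk25, etc. as in the question; the data values and the helper names case_cols and ir_cols are made up for illustration.

```r
# Toy data using the column naming from the question (values are made up)
x <- data.frame(cases.wk24 = c(1, 3), cases.wk25 = c(3, 2), pop = c(7, 8))

# Find every column whose name starts with "cases.wk" (a regex in place of cases.wk*)
case_cols <- grep("^cases\\.wk", names(x), value = TRUE)

# Build the matching output names: cases.wk24 -> ir.wk24, and so on
ir_cols <- sub("^cases", "ir", case_cols)

# Divide all case columns by population in one step and store the results
x[ir_cols] <- x[case_cols] / x$pop
x
```

Because the week numbers are taken from the column names themselves, nothing needs to change when new weeks are added to the .csv file.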
I'm learning R by using it on a project where I need to extract unique paths from logs.
The workaround in the lower part of the code works, but I had to split the log into two files and group each one separately. When I tried the same thing with in-memory variables, all three path counts came back with the same data.
Can someone point out what is wrong with the first approach? I doubt that physically writing files to disk is the intended way.
a = read.csv('download-report-06-10-2017.csv')
yesterdays_data <- a[grepl("2017-10-05", a$Download.Time), ]
todays_data <- a[grepl("2017-10-06", a$Download.Time), ]
write.csv(yesterdays_data, "yesterdays.csv")
write.csv(todays_data, "todays.csv")
path_count <- as.data.frame(table(a$Path))
path_count_today <- as.data.frame(table(todays_data$Path))
path_count_yday <- as.data.frame(table(yesterdays_data$Path))
#### path_count, path_count_today & path_count_yday contain the same values and I expect them to be different ???
yd = read.csv('yesterdays.csv')
td = read.csv('todays.csv')
path_count_td <- as.data.frame(table(td$Path))
path_count_yd <- as.data.frame(table(yd$Path))
#### path_count_td and path_count_yd are different, as I'd expect in upper three variables
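One likely cause, assuming Path was read in as a factor (the read.csv default before R 4.0): a subset of a data frame keeps every factor level, so table() still lists all paths, with zero counts for the ones that are absent. Writing to disk and reading back rebuilds the levels from scratch, which would explain why the workaround behaves differently. The sketch below uses a small made-up data frame in place of the real report; n_all and n_dropped are illustrative names.

```r
# Hypothetical stand-in for the download report (column names follow the question)
a <- data.frame(
  Download.Time = c("2017-10-05 10:00", "2017-10-05 11:00", "2017-10-06 09:00"),
  Path = factor(c("/a", "/b", "/c"))  # a factor, as older read.csv produced by default
)
todays_data <- a[grepl("2017-10-06", a$Download.Time), ]

# The subset keeps all three factor levels, so table() lists every path
n_all <- nrow(as.data.frame(table(todays_data$Path)))
n_all      # 3 rows, two of them with Freq 0

# Dropping unused levels leaves only the paths present in the subset
n_dropped <- nrow(as.data.frame(table(droplevels(todays_data)$Path)))
n_dropped  # 1 row
```

Reading the file with stringsAsFactors = FALSE should avoid the issue as well, since table() on a character column only counts values that are present.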
Hi and thanks in advance.
So I'm trying very hard to get a working forecasting function going in R, but I'm having no luck.
Here's what I need to accomplish:
1) Extract 2 sets of data from a single txt file (one being a value of quantity, and the other being time). The amount of the records must be able to vary (e.g. 4 quantity values in 4 different times, or 5, or 6 etc)
For this I have tried to use a data frame, but the later forecasting function won't accept it for some reason.
2) Put the data into a forecasting function in order to generate new 'forecasted' data (Btw, the method of forecasting that I need to use does not matter at all [I only can't use a mean forecast as it is too simple], but something like a naive or an rwf will work just fine).
3) I want to save the newly produced 'forecasted' data to another txt file for storage.
Here is my code so far:
I used these lines to create and save my sample data (I currently have 18 records):
library(forecast)
library(ggplot2)
library(reshape2)
Quantity <- c(5,3,8,4,0,5,2,7,4,2,6,8,4,7,8,9,4,6)
Time <- c("2010/01/01", "2010/07/02", "2010/08/03", "2011/02/04", "2011/11/05", "2011/12/06", "2012/06/07", "2012/08/30", "2013/04/16", "2013/03/18", "2014/02/22", "2014/01/27", "2015/12/15", "2015/09/08", "2016/05/04", "2017/11/07", "2017/09/22", "2017/04/04")
Frame <- data.frame(Time,Quantity)
write.table(Frame,file="....path..../Frame.txt",quote=F)
I then used these lines to read that data into (hopefully) a data frame, or any container that can hold the data from both of the above vectors:
Frame <- read.table("....path..../Frame.txt")
To be sure, I attempted to plot the data (I don't need the plot itself) just to see whether the program had read my data properly, to no avail.
There were 4 plot attempts (unfortunately none succeeded properly):
1-
plot.ts(Frame)
2-
Frame <- read.table("....path..../Frame.txt")
3-
Frame <- window(start = 2000, end = 2019)
autoplot(Frame) + autolayer(meanf(Frame,h=11),PI=FALSE, series="Means") + autolayer(naive(Frame, h=11),PI=FALSE,series="Naive") + ggtitle("Quantity vs Time") + xlab("Time") + ylab("Quantity") + guides(colour=guide_legend(title="Forecast"))
4-
plot(Frame, xlab="Time",ylab="Quantity",main="Stock Quantity vs Time",type='l')
I have yet to reach the part where I need to send the new 'forecasted' data to a new txt file, so I don't have any attempt code for that.
Any help is appreciated. Thank you.
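The three steps can be sketched in base R as follows. This is only an illustration under stated assumptions: tempfile() stands in for the elided path, the forecast horizon h is arbitrary, and the naive forecast is written by hand (it repeats the last observed value) rather than calling the forecast package.

```r
# Sample data from the question, written to a temporary file
# (tempfile() stands in for the elided path)
Quantity <- c(5,3,8,4,0,5,2,7,4,2,6,8,4,7,8,9,4,6)
Time <- c("2010/01/01", "2010/07/02", "2010/08/03", "2011/02/04", "2011/11/05",
          "2011/12/06", "2012/06/07", "2012/08/30", "2013/04/16", "2013/03/18",
          "2014/02/22", "2014/01/27", "2015/12/15", "2015/09/08", "2016/05/04",
          "2017/11/07", "2017/09/22", "2017/04/04")
in_file <- tempfile(fileext = ".txt")
write.table(data.frame(Time, Quantity), file = in_file, quote = FALSE)

# 1) Read the file back; any number of records is handled the same way
Frame <- read.table(in_file, header = TRUE, stringsAsFactors = FALSE)
Frame$Time <- as.Date(Frame$Time, format = "%Y/%m/%d")
Frame <- Frame[order(Frame$Time), ]  # the sample dates are not in chronological order

# 2) Naive forecast: repeat the most recent observation h times
h <- 5  # illustrative horizon
forecasted <- data.frame(Step = seq_len(h),
                         Quantity = rep(tail(Frame$Quantity, 1), h))

# 3) Save the forecast to another text file
out_file <- tempfile(fileext = ".txt")
write.table(forecasted, file = out_file, quote = FALSE, row.names = FALSE)
forecasted
```

With the forecast package loaded, naive(ts(Frame$Quantity), h = h) should give the same point forecasts. Passing the numeric Quantity column (or a ts built from it) instead of the whole data frame is probably what the forecasting functions need. Note that the sample dates are irregularly spaced, so treating them as a regular series is itself an assumption.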
I am trying to create minimum convex polygons (MCPs) for a set of GPS coordinates. Each day has 32 coordinates, and I want to create an MCP with 1 day, 2 days, 3 days, and so on, worth of data. For instance, in the first step I want to include rows 1-32, which I have managed:
mydata <- read.csv("file.csv", stringsAsFactors = FALSE)
mydata <- mydata[1:32, ]
Currently to select data for me to do 2 days at a time I have written:
mydata <- read.csv("file.csv", stringsAsFactors = FALSE)
mydata <- mydata[1:64, ]
Is there a way to automate adding 32 rows at each step (in a loop), rather than running the code by hand each time and changing the amount of data used?
I am very new to R, so I do not know whether this is possible. The way I thought would work was:
n <- 32
for (i in 1:100) {
mydata <- mydata[1:n, ]
## CREATE MCP AND STORE HOME RANGE OUTPUT
n <- n+32
}
However, it does not seem possible to have n represent a row number. Is there a way to do this?
Apologies if this is unclear but as I said I am quite new to using R and really would appreciate any help that can be given.
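A variable can be used as a row index, so the loop idea is sound. The catch in the attempt above is that mydata is overwritten with its own first 32 rows, so later iterations have nothing more to select. A minimal sketch follows, assuming coordinate columns named x and y and using simulated data in place of file.csv. The convex hull from base R's chull() and its area are only a stand-in for the real MCP step.

```r
set.seed(1)
# Simulated stand-in for file.csv: 5 days x 32 fixes (column names x, y are assumed)
mydata <- data.frame(x = rnorm(160), y = rnorm(160))

rows_per_day <- 32
n_days <- nrow(mydata) %/% rows_per_day
hull_area <- numeric(n_days)

for (d in seq_len(n_days)) {
  # Take a fresh subset each time; mydata itself is never overwritten
  dat_d <- mydata[seq_len(d * rows_per_day), ]

  # Stand-in for the MCP step: convex hull vertices and their enclosed area
  hull <- dat_d[chull(dat_d$x, dat_d$y), ]
  nxt <- c(2:nrow(hull), 1)  # index of the next vertex around the hull
  hull_area[d] <- abs(sum(hull$x * hull$y[nxt] - hull$x[nxt] * hull$y)) / 2
}
hull_area
```

Storing each result in a pre-allocated vector (or a list, for richer objects) keeps the home range output for every step available after the loop finishes.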
I am a student and have been given a project to study climate data from Giovanni (NASA). Our code is provided and we are left to 'find our way', so other answers don't seem to relate to the style of code I've been given. On top of this, I am a beginner in R, so changing the code is very difficult.
Basically, I'm trying to create a time-series plot from the following code:
library(lubridate)  # provides parse_date_time()
library(ggplot2)    # provides qplot()
## Function for loading Giovanni time series data
load_giovanni_time <- function(path){
file_data <- read.csv(path,
skip=6,
col.names = c("Date",
"Temperature",
"NA",
"Site",
"Bleached"))
file_data$Date <- parse_date_time(file_data$Date, orders="ymdHMS")
return(file_data)
}
## Create a list of files
file.list <- list.files("./Data/courseworktimeseries/")
file.list <- as.list(paste0("./Data/courseworktimeseries/", file.list))
# for(i in file.list){
# load_giovanni_time(i)
# }
#Load all the files
all_data <- lapply(X=file.list,
FUN=load_giovanni_time)
all_data <- as.data.frame(do.call(rbind, all_data))
## Inspect the data with a plot
p <- qplot(data=all_data,
x=Date,
y=Temperature,
colour=Site,
linetype=Bleached,
geom="line")
print(p)
Now the first problem is that when the data is merged into one dataset, all the dates change (the original date range is 2002-2015 and it becomes 2002-2030), which obviously ruins the plot. I found that I can stop the dates changing by deleting this code:
file_data$Date <- parse_date_time(file_data$Date, orders="ymdHMS")
However, when this is deleted, I get the following error:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
Could anyone help me get round this without editing the code too much? I suspect the problem is the line of code that formats the date, so I imagine it's only a small fix. I'm very much a beginner and have to implement the code within 1-2 days.
Thanks
For anyone who ever has this problem... I found the solution.
file_data$Date <- parse_date_time(file_data$Date, orders="ymdHMS")
This line tells R to read the date fields from my CSV in year-month-day order. In Excel my dates were the other way round, so a date such as 30th December 2015 (30/12/2015) was read in as 2030-12-20, which corrupted the data.
In Excel, select all the dates, press CTRL+1, and change the format to match the order R is parsing.
All done =)
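An alternative to reformatting the file in Excel is to tell R the layout the dates already have. The sketch below uses base R's as.Date() and assumes the dates look like 30/12/2015 (day/month/year), as in the example above; raw_dates is made-up sample input.

```r
# Dates as they might appear in the CSV (day/month/year layout is assumed)
raw_dates <- c("30/12/2015", "01/07/2002")

# State the actual field order instead of changing the file
parsed <- as.Date(raw_dates, format = "%d/%m/%Y")
format(parsed)
```

Inside the provided function, changing orders to a day-month-year pattern (for example "dmy", or "dmyHMS" if the column also carries times) in parse_date_time() should have the same effect, though that depends on the exact contents of the Date column.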
I want to use ChemoSpec with mass spectra of about 60'000 data points each.
I already have them in one txt file as a matrix (X + 90 samples = 91 columns; 60'000 rows).
How can I use this file as spectra data without exporting every sample again as a separate CSV file (which takes quite long in R given the size of my data)?
The typical (and only?) way to import data into ChemoSpec is by way of the getManyCsv() function, which as the question indicates requires one CSV file for each sample.
Creating 90 CSV files from the 91-column, 60,000-row file described may be somewhat slow and tedious in R, but could be done with a standalone application, whether an existing utility or some ad-hoc script.
An R-only solution would be to create a new method, say getOneBigCsv(), adapted from getManyCsv(). After all, the logic of getManyCsv() is relatively straightforward.
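For reference, the ad-hoc splitting script could be as short as the sketch below. It is an illustration only: the tiny simulated data frame big stands in for the real 91-column file, out_dir is a hypothetical output folder, and the two-column, header-less layout of each file is an assumption about what getManyCsv() expects.

```r
# Small simulated stand-in for the 91-column file (first column = X axis)
big <- data.frame(X = seq(100, 104), S1 = runif(5), S2 = runif(5), S3 = runif(5))

out_dir <- file.path(tempdir(), "per_sample_csv")  # hypothetical output folder
dir.create(out_dir, showWarnings = FALSE)

for (s in names(big)[-1]) {
  # One two-column file per sample: X value, intensity (no header is an assumption)
  write.table(big[, c("X", s)], file = file.path(out_dir, paste0(s, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}
list.files(out_dir)
```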
Don't expect such a solution to be sizzling fast, but it should, in any case, compare with the time it takes to run getManyCsv() and avoid having to create and manage the many files, hence overall be faster and certainly less messy.
Sorry I missed your question 2 days ago. I'm the author of ChemoSpec - always feel free to write directly to me in addition to posting somewhere.
The solution is straightforward. You already have your data in a matrix (after you read it in with >read.csv("file.txt")), so you can use it to manually create a Spectra object. In the R console type ?Spectra to see the structure of a Spectra object, which is a list with specific entries. You will need to put your X column (which I assume is mass) into the freq slot. Then the rest of the data matrix will go into the data slot. Then manually create the other needed entries (making sure the data types are correct). Finally, assign the Spectra class to your completed list by doing something like >class(my.spectra) <- "Spectra" and you should be good to go. I can give you more details on or off list if you describe your data a bit more fully. Perhaps you have already solved the problem?
By the way, ChemoSpec is totally untested with MS data, but I'd love to find out how it works for you. There may be some changes that would be helpful so I hope you'll send me feedback.
Good Luck, and let me know how else I can help.
Many years have passed and I am not sure if anybody is still interested in this topic, but I had the same problem and wrote a small workaround that converts my data to class 'Spectra' by extracting the information from the data itself:
#Assumption:
# Data is stored as a numeric data.frame with column names presenting samples
# and row names including domain axis
dataframe2Spectra <- function(Spectrum_df,
freq = as.numeric(rownames(Spectrum_df)),
data = as.matrix(t(Spectrum_df)),
names = paste("YourFileDescription", 1:dim(Spectrum_df)[2]),
groups = rep(factor("Factor"), dim(Spectrum_df)[2]),
colors = rainbow(dim(Spectrum_df)[2]),
sym = 1:dim(Spectrum_df)[2],
alt.sym = letters[1:dim(Spectrum_df)[2]],
unit = c("a.u.", "Domain"),
desc = "Some signal. Describe it with 'desc'"){
features <- c("freq", "data", "names", "groups", "colors", "sym", "alt.sym", "unit", "desc")
Spectrum_chem <- vector("list", length(features))
names(Spectrum_chem) <- features
Spectrum_chem$freq <- freq
Spectrum_chem$data <- data
Spectrum_chem$names <- names
Spectrum_chem$groups <- groups
Spectrum_chem$colors <- colors
Spectrum_chem$sym <- sym
Spectrum_chem$alt.sym <- alt.sym
Spectrum_chem$unit <- unit
Spectrum_chem$desc <- desc
# important step
class(Spectrum_chem) <- "Spectra"
# some sanity checks (messages are printed, execution continues)
if (length(freq)!=dim(data)[2]) print("Dimension of data is NOT #samples X length of freq")
if (length(names)>dim(data)[1]) print("Too many names")
if (length(names)<dim(data)[1]) print("Too few names")
if (length(groups)>dim(data)[1]) print("Too many groups")
if (length(groups)<dim(data)[1]) print("Too few groups")
if (length(colors)>dim(data)[1]) print("Too many colors")
if (length(colors)<dim(data)[1]) print("Too few colors")
if (!is.matrix(data)) print("'data' is not a matrix or it's not numeric")
return(Spectrum_chem)
}
Spectrum_chem <- dataframe2Spectra(Spectrum)
chkSpectra(Spectrum_chem)