I'm writing a program in R and I need to subset my data based on a particular value of one of the variables. The program is as follows:
a1961 <- base[base[,5]==1961,]
a1962 <- base[base[,5]==1962,]
a1963 <- base[base[,5]==1963,]
a1964 <- base[base[,5]==1964,]
a1965 <- base[base[,5]==1965,]
a1966 <- base[base[,5]==1966,]
a1967 <- base[base[,5]==1967,]
a1968 <- base[base[,5]==1968,]
a1969 <- base[base[,5]==1969,]
a1970 <- base[base[,5]==1970,]
a1971 <- base[base[,5]==1971,]
a1972 <- base[base[,5]==1972,]
a1973 <- base[base[,5]==1973,]
a1974 <- base[base[,5]==1974,]
a1975 <- base[base[,5]==1975,]
a1976 <- base[base[,5]==1976,]
a1977 <- base[base[,5]==1977,]
a1978 <- base[base[,5]==1978,]
a1979 <- base[base[,5]==1979,]
a1980 <- base[base[,5]==1980,]
a1981 <- base[base[,5]==1981,]
a1982 <- base[base[,5]==1982,]
a1983 <- base[base[,5]==1983,]
a1984 <- base[base[,5]==1984,]
a1985 <- base[base[,5]==1985,]
a1986 <- base[base[,5]==1986,]
a1987 <- base[base[,5]==1987,]
a1988 <- base[base[,5]==1988,]
a1989 <- base[base[,5]==1989,]
...
a2012 <- base[base[,5]==2012,]
Is there a way (like modules in SAS) in which I can avoid writing the same thing over and over again?
In general, coding/implementation questions really belong on StackOverflow. That said, my recommendation is instead of naming individual variables for each result, just throw them all into a list:
a <- lapply(1961:2012, function(x) base[base[,5]==x,])
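As a minimal sketch of the list approach, assuming a toy data frame whose fifth column holds the year (the column names and values below are made up for illustration), naming the list elements keeps each year's subset easy to look up:

```r
# Toy stand-in for the question's data; the fifth column holds the year.
base <- data.frame(
  id   = 1:5,
  x1   = c(10, 20, 30, 40, 50),
  x2   = c("p", "q", "r", "s", "t"),
  x3   = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  year = c(1961, 1961, 1962, 1963, 1963)
)

years <- 1961:1963

# One list element per year instead of one variable per year.
a <- lapply(years, function(x) base[base[, 5] == x, ])
names(a) <- paste0("a", years)

a[["a1961"]]     # rows for 1961
sapply(a, nrow)  # number of rows per year
```

With the subsets in one list, any later step can be applied to every year with a single lapply or sapply call.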
You can also use the assign command.
years <- 1961:2012
for(i in 1:length(years)) {
assign(x = paste0("a", years[i]), value = base[base[,5]==years[i],])
}
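A small sketch of how the assign approach plays out, again on made-up toy data (the column names and values are assumptions): the variables are created by name, and get or mget is then needed to look them up programmatically.

```r
# Toy stand-in for the question's data; the fifth column holds the year.
base <- data.frame(
  id   = 1:4,
  x1   = c(1.5, 2.5, 3.5, 4.5),
  x2   = c("p", "q", "r", "s"),
  x3   = c(TRUE, FALSE, TRUE, FALSE),
  year = c(1961, 1962, 1962, 1963)
)

years <- 1961:1963

# Create a1961, a1962, a1963 in the current environment.
for (yr in years) {
  assign(paste0("a", yr), base[base[, 5] == yr, ])
}

a1962                              # the variable now exists by name
get(paste0("a", 1963))             # look one up from a computed name
all_years <- mget(paste0("a", years))  # collect them back into a named list
```

Having to call mget to work with all the subsets at once is one reason the list approach is often easier to maintain.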
I have to automate this sequence of functions:
for (i in c(15,17,20,24,25,26,27,28,29,45,50,52,55,60,62)) {
WBES_sf_angola_i <- subset(WBES_sf_angola, isic == i)
WBES_angola_i <- as_Spatial(WBES_sf_angola_i)
FDI_angola_i <- FDI_angola[FDI_angola$isic==i,]
dist_ao_i <- distm(WBES_angola_i,FDI_angola_i, fun = distGeo)/1000
rm(WBES_sf_angola_i,WBES_angola_i,FDI_angola_i)
}
As a result, I want a "dist_ao" for each i. The indexed values are to be found in the isic columns of the WBES_sf_angola and the FDI_angola datasets.
How can I embed the index in the various items' names?
EDIT:
I tried with following modification:
for (i in c(15,17,20,24,25,26,27,28,29,45,50,52,55,60,62)) {
WBES_sf_angola_i <- subset(WBES_sf_angola, isic == i)
WBES_angola_i <- as_Spatial(WBES_sf_angola_i)
FDI_angola_i <- FDI_angola[FDI_angola$isic==i,]
result_list <- list()
result_list[[paste0("dist_ao_", i)]] <- distm(WBES_angola_i,FDI_angola_i, fun = distGeo)/1000
rm(WBES_sf_angola_i,WBES_angola_i,FDI_angola_i)
}
and the output is just a list of 1 that contains dist_ao_62. How do I avoid overwriting?
Untested (due to missing MRE) but should work:
result_list <- list()
for (i in c(15,17,20,24,25,26,27,28,29,45,50,52,55,60,62)) {
result_list[[paste0("dist_ao_", i)]] <- distm(as_Spatial(subset(WBES_sf_angola, isic == i)) , FDI_angola[FDI_angola$isic==i,], fun = distGeo)/1000
}
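To illustrate why the position of the list initialisation matters, here is a minimal sketch using made-up numbers in place of the distance matrices (the squared value is only a placeholder computation):

```r
values <- c(15, 17, 20)

# Initialising inside the loop resets the list on every iteration,
# so only the last result survives.
for (i in values) {
  lost <- list()
  lost[[paste0("dist_ao_", i)]] <- i^2
}

# Initialising once, before the loop, keeps every result.
kept <- list()
for (i in values) {
  kept[[paste0("dist_ao_", i)]] <- i^2
}

names(lost)
names(kept)
```

The same pattern applies when the placeholder computation is replaced by the distm call from the question.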
You could approach it this way. All resulting dataframes will be stored in lists, which you can convert to a single dataframe with the last line of the code here. NOTE: since your example is not reproducible, I have mostly taken the code inside the loop from your question.
WBES_sf_angola_result <- list() # renamed, since WBES_sf_angola is the name of your dataset
WBES_angola_result <- list()
FDI_angola_result <- list() # renamed as well, so the FDI_angola dataset is not overwritten
dist_ao <- list()
for (i in c(15,17,20,24,25,26,27,28,29,45,50,52,55,60,62)) {
key <- paste0("i_", i) # list name for this isic value, e.g. "i_15"
WBES_sf_angola_result[[key]] <- subset(WBES_sf_angola, isic == i)
WBES_angola_result[[key]] <- as_Spatial(WBES_sf_angola_result[[key]])
FDI_angola_result[[key]] <- FDI_angola[FDI_angola$isic==i,]
dist_ao[[key]] <- distm(WBES_angola_result[[key]], FDI_angola_result[[key]], fun = distGeo)/1000
}
WBES_sf_angola_result <- do.call(rbind, WBES_sf_angola_result) # to get a dataframe
Your subset data can also be accessed by name from the list (before it is combined with do.call), e.g.
WBES_sf_angola_result[["i_15"]] # for the first item.
I am trying to extend the "matrix" class and, in turn, override the default behaviour of "[". Code examples below:
annMatrix <- function(mat=NULL, rowAnn=NULL, colAnn=NULL) {
if(is.null(mat)) mat <- matrix(nrow=0, ncol=0)
mat <- as.matrix(mat)
if(is.null(rowAnn)) rowAnn <- data.frame(row.names=seq_len(nrow(mat)))
if(is.null(colAnn)) colAnn <- data.frame(row.names=seq_len(ncol(mat)))
rowAnn <- data.frame(rowAnn, stringsAsFactors=FALSE)
colAnn <- data.frame(colAnn, stringsAsFactors=FALSE)
stopifnot(nrow(mat)==nrow(rowAnn) & ncol(mat)==nrow(colAnn))
attr(mat, "colAnn") <- colAnn
attr(mat, "rowAnn") <- rowAnn
class(mat) <- append(class(mat), "annMatrix")
mat
}
`[.annMatrix` <- function(annMat, rowExpr=NULL, colExpr=NULL) {
stopifnot(is.valid.annMatrix(annMat))
rowExpr <- eval(substitute(list(rowExpr)), attr(annMat, "rowAnn"), parent.frame())
colExpr <- eval(substitute(list(colExpr)), attr(annMat, "colAnn"), parent.frame())
indsR <- unlist(rowExpr)
indsC <- unlist(colExpr)
if(is.null(indsR)) indsR <- seq_len(nrow(annMat))
if(is.null(indsC)) indsC <- seq_len(ncol(annMat))
attr(annMat, "rowAnn") <- attr(annMat, "rowAnn")[indsR,,drop=FALSE]
attr(annMat, "colAnn") <- attr(annMat, "colAnn")[indsC,,drop=FALSE]
annMat <- unclass(annMat)
annMat <- annMat[indsR,indsC,drop=FALSE]
class(annMat) <- append(class(annMat), "annMatrix")
annMat
}
The basic idea is to make the matrix preserve its specific attributes after subsetting.
However I am running into a problem:
How can I write the "[" function so that it behaves differently when called with and without a comma:
annMat[i]
annMat[i,]
as the default "[" for matrices seems to do.
I was thinking of setting the second argument to some default value, but that value will not change because of an added comma.
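One possible way to tell the two call forms apart, offered as a sketch rather than a definitive answer: nargs() counts the arguments supplied to a function, including positions left blank, so x[i] and x[i,] yield different counts inside a "[" method. The class name demoMatrix and the returned strings below are purely illustrative and are not part of the annMatrix code.

```r
# Toy object with a dim attribute and an illustrative S3 class.
demo <- structure(1:6, dim = c(2L, 3L), class = "demoMatrix")

`[.demoMatrix` <- function(x, i, j, ..., drop = FALSE) {
  # nargs() counts x, the index positions (even blank ones) and drop if given.
  n_index <- nargs() - 1L - !missing(drop)
  if (n_index == 1L) "vector-style" else "matrix-style"
}

demo[1]                   # one index position:  x[i]
demo[1, ]                 # two index positions: x[i, ]
demo[1, , drop = FALSE]   # still two index positions
```

A real method would branch on n_index to choose between vector-style and matrix-style subsetting instead of returning a label.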
I'm pretty sure this should be really straightforward, but I cannot find a solution and cannot see the answer in other questions on for loops in R. I have a dataset datDET that contains 21 data sets of different 'Gels', and I want to make a plot with a series from each data set plotted together. I have the following code; however, I just get the error that there is an unexpected symbol in my code, which is the ] after the i. Any help solving this would be greatly appreciated! Here is my current code:
G1.dat <- datDET[datDET$Gel==1,]
G2.dat <- datDET[datDET$Gel==2,]
G3.dat <- datDET[datDET$Gel==3,]
G4.dat <- datDET[datDET$Gel==4,]
G5.dat <- datDET[datDET$Gel==5,]
G6.dat <- datDET[datDET$Gel==6,]
G7.dat <- datDET[datDET$Gel==7,]
G8.dat <- datDET[datDET$Gel==8,]
G9.dat <- datDET[datDET$Gel==9,]
G10.dat <- datDET[datDET$Gel==10,]
G11.dat <- datDET[datDET$Gel==11,]
G12.dat <- datDET[datDET$Gel==12,]
G13.dat <- datDET[datDET$Gel==13,]
G14.dat <- datDET[datDET$Gel==14,]
G15.dat <- datDET[datDET$Gel==15,]
G16.dat <- datDET[datDET$Gel==16,]
G17.dat <- datDET[datDET$Gel==17,]
G18.dat <- datDET[datDET$Gel==18,]
G19.dat <- datDET[datDET$Gel==19,]
G20.dat <- datDET[datDET$Gel==20,]
G21.dat <- datDET[datDET$Gel==21,]
library(ggplot2)
p <- ggplot(datDET, aes(x = NO3, y = Depth))
for (i in c(1:21)){
p1 <- p + geom_point(data=Gi.dat)
}
data=Gi.dat is looking for an object named Gi.dat, which you don't have. If you want to replace the i with the looped value, you'll have to use get and paste0:
data=get(paste0("G",i,".dat"))
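A minimal base R sketch of the lookup, using a made-up three-gel data frame in place of datDET (all values are assumptions), together with a split-based alternative that avoids creating one object per gel:

```r
# Toy stand-in for datDET with three gels instead of 21.
datDET <- data.frame(
  Gel   = c(1, 1, 2, 3),
  NO3   = c(0.5, 0.7, 0.2, 0.9),
  Depth = c(1, 2, 1, 1)
)
G1.dat <- datDET[datDET$Gel == 1, ]
G2.dat <- datDET[datDET$Gel == 2, ]
G3.dat <- datDET[datDET$Gel == 3, ]

# Build each object name from the loop index and fetch it with get().
rows_per_gel <- integer(0)
for (i in 1:3) {
  gel_data <- get(paste0("G", i, ".dat"))
  rows_per_gel[paste0("G", i)] <- nrow(gel_data)
}
rows_per_gel

# Alternative: keep the subsets in one named list.
by_gel <- split(datDET, datDET$Gel)
names(by_gel)
```

Note that in the question's loop, p1 is overwritten on every iteration, so the layers would also need to be accumulated (for example p <- p + geom_point(...)) for all gels to appear in the final plot.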
First, my code works perfectly. I simply need to be able to call the year and seasonal components out of BestSolarData using $ with:
BestSolarData$year
BestSolarData$seasonal
I have these written at the end of my code. The year I know comes from BestYear and seasonal come from BestData in the ForLoopSine function.
Any help to be able to access the components using $?
SineFit <- function (ToBeFitted)
{
msvector <- as.vector(ToBeFitted)
y <- length(ToBeFitted)
x <- 1:y
MS.nls <- nls(msvector ~ a*sin(((2*pi)/12)*x+b)+c, start=list(a=300, b=0, c=600))
summary(MS.nls)
MScoef <- coef(MS.nls)
a <- MScoef[1]
b <- MScoef[2]
c <- MScoef[3]
x <- 1:12
FittedCurve <- a*sin(((2*pi)/12)*x+b)+c
#dev.new()
#layout(1:2)
#plot(ToBeFitted)
#plot(FittedCurve)
return (FittedCurve)
}
ForLoopSine <- function(PastData, ComparisonData)
{
w<-start(PastData)[1]
t<-end(PastData)[1]
BestDiff <- 9999
for(i in w:t)
{
DataWindow <- window(PastData, start=c(i,1), end=c(t,12))
Datapredict <- SineFit(DataWindow)
CurrDiff <- norm1diff(Datapredict, ComparisonData)
if (CurrDiff < BestDiff)
{
BestDiff <- CurrDiff
BestYear <- i
BestData <- Datapredict
}
}
print(BestDiff)
print(BestYear)
return(BestData)
}
RandomFunction <- function(PastData, SeasonalData)
{
w <- start(PastData)[1]
t <- end(PastData)[1]
Seasonal.ts <- ts(SeasonalData, st = c(w,1), end = c(t,12), fr = 12)
Random <- PastData-Seasonal.ts
layout(1:3)
plot(SeasonalData)
plot(Seasonal.ts)
plot(Random)
return(Random)
}
BestSolarData <- ForLoopSine(MonthlySolarPre2015, MonthlySolar2015)
RandomComp <- RandomFunction (MonthlySolarPre2015, BestSolarData)
acf(RandomComp)
BestSolarData$year
BestSolarData$seasonal
As far as I understand your problem, you would like to retrieve the year component of BestSolarData with BestSolarData$year. But BestSolarData is the value returned by ForLoopSine, namely BestData, which is in turn the Datapredict value returned by the SineFit function. It seems to be a vector and not a list or data.frame, so $ cannot work here.
Your example is not reproducible; making it reproducible may help you find a solution. See this post for more details.
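One way to make $ usable is to return a named list instead of a bare vector. The sketch below uses a simplified, hypothetical stand-in for ForLoopSine (the function find_best, its arguments and the data are all made up) to show the pattern:

```r
# Hypothetical stand-in for ForLoopSine: picks the candidate closest to a target.
find_best <- function(candidates, target) {
  best_diff <- Inf
  best_year <- NA_character_
  best_data <- NULL
  for (year in names(candidates)) {
    curr_diff <- sum(abs(candidates[[year]] - target))
    if (curr_diff < best_diff) {
      best_diff <- curr_diff
      best_year <- year
      best_data <- candidates[[year]]
    }
  }
  # Returning a named list is what makes result$year and result$seasonal work.
  list(year = as.integer(best_year), seasonal = best_data)
}

candidates <- list("2013" = c(1, 2, 3), "2014" = c(2, 3, 4))
best <- find_best(candidates, target = c(2, 3, 5))

best$year
best$seasonal
```

If ForLoopSine were changed along these lines, any code that currently uses its result directly, such as the RandomFunction call, would need to be passed BestSolarData$seasonal instead.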
I'm trying to make a loop to automate a lot of actions in R. The code I have looks like this:
datA <- droplevels(datSUM[datSUM$Conc=="a",])
datB <- droplevels(datSUM[datSUM$Conc=="b",])
datC <- droplevels(datSUM[datSUM$Conc=="c",])
datD <- droplevels(datSUM[datSUM$Conc=="d",])
datE <- droplevels(datSUM[datSUM$Conc=="e",])
datX <- droplevels(datSUM[datSUM$Conc=="x",])
datY <- droplevels(datSUM[datSUM$Conc=="y",])
datAf <- droplevels(datA[datA$Sex=="f",])
datAf1 <- droplevels(datAf[datAf$rep=="1",])
datAf2 <- droplevels(datAf[datAf$rep=="2",])
datAf3 <- droplevels(datAf[datAf$rep=="3",])
datAm <- droplevels(datA[datA$Sex=="m",])
datAm1 <- droplevels(datAm[datAm$rep=="1",])
datAm2 <- droplevels(datAm[datAm$rep=="2",])
datAm3 <- droplevels(datAm[datAm$rep=="3",])
So since I have to do this 7 times, it seems like making a loop for this operation is the best way to do it. Can someone help me make that? I'm new to R so please bear that in mind.
Well I will have a stab at this.
concs <- c(a='a',b='b',c='c',d='d',e='e',x='x',y='y')
sex <- c(m='m',f='f')
reps <- c(rep1='1',rep2='2',rep3='3')
# By using m='m' we can label the objects within the list, making it
# easier to navigate the final object, otherwise use:
# concs <- c('a','b','c','d','e','x','y')
# sex <- c('m','f')
# reps <- c('1','2','3')
dfs <- lapply(concs, function(x){
droplevels(datSUM[datSUM$Conc==x,])}
)
sdfs <- lapply(sex, function(x){
lapply(dfs, function(y){
droplevels(y[y$Sex==x,])}
)}
)
rsdfs <- lapply(reps, function(x){
lapply(sdfs, function(y){
lapply(y, function(z){
droplevels(z[z$rep==x,])}
)}
)}
)
There is probably a better way to do this, which may involve even more lapply calls, but I think this "should" do the trick.
The only downside to this method is that you will have to access individual objects with rsdfs[[1]][[1]][[1]] or rsdfs[['rep1']][['m']][['a']], etc.
Applying functions to these would in itself require a bunch of nested lapply calls.
Let me know if this helps.
This is one method to do so - I will work on a more elegant solution later.
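As a possible flatter alternative, sketched on made-up data with the same column names as the question (the values are assumptions), split can divide the data by all three variables at once and return a single named list:

```r
# Toy stand-in for datSUM.
datSUM <- data.frame(
  Conc  = c("a", "a", "b", "b"),
  Sex   = c("f", "m", "f", "f"),
  rep   = c("1", "1", "2", "2"),
  value = 1:4,
  stringsAsFactors = TRUE
)

# One list element per Conc/Sex/rep combination that actually occurs.
groups <- split(datSUM, list(datSUM$Conc, datSUM$Sex, datSUM$rep), drop = TRUE)

# Drop unused factor levels in each piece, as in the question.
groups <- lapply(groups, droplevels)

names(groups)      # combinations are named like "a.f.1"
groups[["b.f.2"]]  # rows with Conc b, Sex f, rep 2
```

Because the result is a single-level list, a function can be applied to every subset with one lapply call.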