I am trying to use a for loop to create multiple objects. Just an example (not exact):
l_gr <- list (1:10, 11:20, 21:30)
for (i in 1:length(l_gr)){
grp <- NULL
grp[[i]] <- mean(l_gr[[i]])
}
This is not what I expect. Rather, I need to output multiple objects (of different classes) whose names differ by the level of i, for example grp1, grp2, grp3 here.
Each of these objects holds the output of the function for the i-th list element. Sorry for the simple question.
Edit: in response to a request for a specific example:
install.packages("onemap")
require(onemap)
data(example.out)
twopts <- rf.2pts(example.out)
all.data <- make.seq(twopts,"all")
link_gr <- group(all.data)
link_gr$n.groups
starts the loop
# without loop:
# for 1
grp1 <- make.seq(link_gr, 1)
grp1.od <- order.seq(input.seq=grp1, n.init = 5, subset.search = "twopt",
twopt.alg = "rcd", THRES = 3, draw.try = TRUE, wait = 1, touchdown=TRUE)
# for 2
grp2 <- make.seq(link_gr, 2)
grp2.od <- order.seq(input.seq=grp2, n.init = 5, subset.search = "twopt",
twopt.alg = "rcd", THRES = 3, draw.try = TRUE, wait = 1, touchdown=TRUE)
The same process is repeated for each group in 1:link_gr$n.groups.
So I want to create a for loop that outputs these objects:
for (i in 1:link_gr$n.groups){
grp <- NULL
grp[i] <- make.seq(link_gr, i)
grp[i].od <- order.seq(input.seq=grp[i], n.init = 5, subset.search = "twopt",
twopt.alg = "rcd", THRES = 3, draw.try = TRUE, wait = 1, touchdown=TRUE)
}
Note that your for loops are wrong. If you set grp <- NULL within the loop, you'll just wipe your results variable with each iteration - probably not what you want. You need to put the variable initialisation outside the loop.
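A minimal sketch of this point, using the toy l_gr list from the question rather than the onemap data. The result container is created once before the loop, so each iteration adds to it instead of resetting it. The element names (grp1, grp2, ...) are added only for convenience.

```r
l_gr <- list(1:10, 11:20, 21:30)

# Create the container once, outside the loop
grp <- vector("list", length(l_gr))

for (i in seq_along(l_gr)) {
  grp[[i]] <- mean(l_gr[[i]])
}

# Optional: label the elements so they can be looked up by name
names(grp) <- paste0("grp", seq_along(grp))
grp$grp2
```

Using seq_along() instead of 1:length() also keeps the loop from running when the input list is empty.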
Note, too, that I'd suggest that you are still better off using a single variable instead of multiple ones. list objects are very flexible in R and can accommodate objects of different classes. You can do
require(onemap)
data(example.out)
twopts <- rf.2pts(example.out)
all.data <- make.seq(twopts,"all")
link_gr <- group(all.data)
link_gr$n.groups
# initialise list outputs
grp = list()
grp.od = list()
for (i in 1:2){
grp[[i]] <- make.seq(link_gr, i)
grp.od[[i]] <- order.seq(input.seq=grp[[i]], n.init = 5, subset.search = "twopt",
twopt.alg = "rcd", THRES = 3, draw.try = TRUE, wait = 1, touchdown=TRUE)
}
#check out output
str(grp)
str(grp.od)
grp[[1]]
grp[[2]]
If you must insist on using different variables, consider ?assign and ?get. Something like this will work:
i = 1
assign(paste("grp", i, sep = ""), grp[[i]])
exists("grp1")
str(get(paste("grp", i, sep = "")))
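As a rough sketch of how assign and get fit into a loop, here is the same idea on stand-in values (the numbers in results are made up for illustration and are not onemap output). mget() collects the separately named objects back into a single named list.

```r
# Stand-in results; in practice these would come from the loop body
results <- list(5.5, 15.5, 25.5)

# Create grp1, grp2, grp3 as separate objects
for (i in seq_along(results)) {
  assign(paste0("grp", i), results[[i]])
}

# Retrieve one object by its constructed name
get(paste0("grp", 2))

# Gather all of them back into a named list
back <- mget(paste0("grp", seq_along(results)))
names(back)
```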
In R I would like to loop over a set of three functions, saving the output of each function under a name related to the input. This works when applied to one file, but I would like to loop over 300+ objects, and the functions require specifying elements within each object.
I attempted to create lists of the objects and output names and to loop over them with a for loop for a single function (a.ppp), and received the error "Error in i[["X"]] : subscript out of bounds". I am very new to for loops, have limited coding background, and am unsure whether the loop structure I have created is correct. I have tried multiple options, including looping over a data frame or nesting loops, based on some other Stack Overflow questions.
Some toy data, representing my setup. I have data frames, e.g. a to g:
a <- data.frame(X = c(1, 2, 3),
Y = c(3,2,1),
Z = c(4,5,6),
M = c('A', 'B', 'C'))
I would like to loop over the following three functions.
library(spatstat)
a.ppp = ppp(a$X,a$Y,c(0,3),c(0,3),marks = a$M)
a.nnd = nndist(a.ppp,by=a.ppp$marks)
a.append = cbind(a,a.nnd)
My attempt so far:
listObj = c("a","b","c","d","e","f","g")
list.ppp = c("a.ppp","b.ppp","c.ppp","d.ppp","e.ppp","f.ppp","g.ppp")
for (i in listObj) {
for (j in list.ppp) {
j=ppp(i[["X"]],i[["Y"]],c(0,12),c(0,12),marks=i[["M"]])
}
}
I received the error:
#Error in i[["X"]] : subscript out of bounds
My expected result would be a .ppp and an .append output for each of a to g.
Just thought I'd follow up. Based on the extremely helpful comment from Joran, I figured the issue out through a modification of his provided code. The code I used was as follows:
library(spatstat)
a <- data.frame(X = c(1, 2, 3),
Y = c(3,2,1),
Z = c(4,5,6),
M = c('A', 'B', 'C'))
# Create a list of all the objects in the environment (not an ideal method, but suitable for this case)
dfs= mget(ls())
#Create empty lists to be populated during the loop
dfs_ppp = list()
dfs_nnd = list()
dfs_final= list()
for (i in seq_along(dfs)){
dfs_ppp[[i]] <- ppp(dfs[[i]]$X,dfs[[i]]$Y,c(-1,14),c(-1,14),marks = dfs[[i]]$M)
dfs_nnd[[i]] = nndist(dfs_ppp[[i]],by=dfs_ppp[[i]]$marks)
dfs_final[[i]] = cbind(dfs[[i]],dfs_nnd[[i]])
}
Try something more like this:
library(spatstat)
a <- data.frame(X = c(1, 2, 3),
Y = c(3,2,1),
Z = c(4,5,6),
M = c('A', 'B', 'C'))
# Put your data frames (a, b, c, etc.) in a list
dfs <- list(x = a,b = a,z = a)
for (i in seq_along(dfs)){
ppp_obj <- ppp(dfs[[i]]$X,dfs[[i]]$Y,c(0,3),c(0,3),marks = dfs[[i]]$M)
nnd_obj = nndist(ppp_obj,by=ppp_obj$marks)
dfs[[i]]$nnd <- nnd_obj
}
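For context, the original error comes from looping over a character vector of names: inside the loop i is the string "a", not the data frame a, so i[["X"]] is out of bounds. A minimal base-R sketch of one way around it is shown here. The distance calculation is a hypothetical stand-in for the spatstat calls, and b is simply a copy of a.

```r
a <- data.frame(X = c(1, 2, 3),
                Y = c(3, 2, 1),
                Z = c(4, 5, 6),
                M = c("A", "B", "C"))
b <- a

obj_names <- c("a", "b")

# Each element of obj_names is just a string, not a data frame
is.character(obj_names[1])

# mget() looks the names up and returns the actual objects as a named list
dfs <- mget(obj_names)

results <- list()
for (nm in names(dfs)) {
  d <- dfs[[nm]]
  # Stand-in computation (hypothetical): distance of each point from the origin
  d$dist0 <- sqrt(d$X^2 + d$Y^2)
  # Store the result under a name derived from the input name
  results[[paste0(nm, ".append")]] <- d
}
names(results)
```

Keeping the outputs in one named list gives the name-to-input link the question asks for without creating hundreds of separate objects.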
I am simulating dice throws, and would like to save the output in a single object, but cannot find a way to do so. I tried looking here, here, and here, but they do not seem to answer my question.
Here is my attempt to assign the result of a 20 x 3 trial to an object:
set.seed(1)
Twenty = for(i in 1:20){
trials = sample.int(6, 3, replace = TRUE)
print(trials)
i = i+1
}
print(Twenty)
What I do not understand is why I cannot recall the function after it is run?
I also tried using return instead of print in the function:
Twenty = for(i in 1:20){
trials = sample.int(6, 3, replace = TRUE)
return(trials)
i = i+1
}
print(Twenty)
or creating an empty matrix first:
mat = matrix(0, nrow = 20, ncol = 3)
mat
for(i in 1:20){
mat[i] = sample.int(6, 3, replace = TRUE)
print(mat)
i = i+1
}
but they seem to be worse (as I do not even get to see the trials).
Thanks for any hints.
There are several things wrong with your attempts:
1) A loop is not a function nor an object in R, so it doesn't make sense to assign a loop to a variable
2) When you have a loop for(i in 1:20), the loop will increment i so it doesn't make sense to add i = i + 1.
Your last attempt implemented correctly would look like this:
mat <- matrix(0, nrow = 20, ncol = 3)
for(i in 1:20){
mat[i, ] = sample.int(6, 3, replace = TRUE)
}
print(mat)
I personally would simply do
matrix(sample.int(6, 20 * 3, replace = TRUE), nrow = 20)
(since all draws are independent and with replacement, it doesn't matter if you make 3 draws 20 times or simply 60 draws)
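Both numbered points above can be checked directly in a few lines. This is only an illustrative sketch; the names out and seen are made up for the example.

```r
# 1) A for loop evaluates to NULL, so assigning it stores nothing useful
out <- for (i in 1:3) i
is.null(out)

# 2) Changing i inside the body does not affect the next iteration,
#    because for() resets i from the sequence each time round
seen <- integer(0)
for (i in 1:5) {
  seen <- c(seen, i)
  i <- i + 1
}
seen
```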
In most programming languages, one does not assign a for loop to an object, because a loop is not a function and returns no useful value. Loops are used to act iteratively on existing objects. R, however, provides the apply family, which returns the iterative outputs as an object of the same length as the input.
Consider lapply (list apply) for list output or sapply (simplified apply) for matrix output:
# LIST OUTPUT
Twenty <- lapply(1:20, function(x) sample.int(6, 3, replace = TRUE))
# MATRIX OUTPUT
Twenty <- sapply(1:20, function(x) sample.int(6, 3, replace = TRUE))
And to see your trials, simply print out the object
print(Twenty)
But since you never use the iterator variable, x, consider replicate (a wrapper around sapply that can return either a matrix or a list, depending on its simplify argument). It takes a count and an expression rather than a sequence and a function:
# MATRIX OUTPUT (DEFAULT)
Twenty <- replicate(20, sample.int(6, 3, replace = TRUE))
# LIST OUTPUT
Twenty <- replicate(20, sample.int(6, 3, replace = TRUE), simplify = FALSE)
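One detail worth checking when using the matrix output: sapply and replicate put each trial in a column, so the result is 3 x 20 rather than 20 x 3. A short sketch of the check, with t() to get one row per trial (by_col is a name made up for the example):

```r
set.seed(1)
by_col <- replicate(20, sample.int(6, 3, replace = TRUE))
dim(by_col)   # one column per trial

# Transpose to get one row per trial, as in the question
Twenty <- t(by_col)
dim(Twenty)
```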
You can use list:
Twenty=list()
for(i in 1:20){
Twenty[[i]] = sample.int(6, 3, replace = TRUE)
}
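If a single 20 x 3 object is wanted afterwards, the list can be stacked row by row. A minimal sketch, assuming every element has the same length:

```r
set.seed(1)
Twenty <- vector("list", 20)
for (i in seq_along(Twenty)) {
  Twenty[[i]] <- sample.int(6, 3, replace = TRUE)
}

# Bind the 20 length-3 vectors into a matrix, one trial per row
mat <- do.call(rbind, Twenty)
dim(mat)
```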
I need to execute this code many times in order to get 45 different matrices at the end: mat[j], j=1:45.
Not sure how to use "for-loop" to achieve that, will be grateful for any tips.
Data files are stored here, year-by-year https://intl-atlas-downloads.s3.amazonaws.com/index.html
library(readstata13)
library(diverse)
library(plyr)
for (j in 1:45) {
dat <- read.dta13(file.choose())
data = aggregate(dat$export_value, by = list(dat$exporter,dat$commoditycode), FUN = sum)
colnames(data) = c("land","product","value")
dt = split(data, f = data$product)
land = as.data.frame(sort(unique(data[, 1])))
nds = seq(1, nrow(land), by = 1)
texmat = cbind(nds, land)
colnames(texmat) = c("num", "land")
for (i in 1:length(unique(data[, 2]))) {
(join(texmat, dt[[i]], by = "land", type = "left")$value)
}
mt = sapply(1:length(unique(data[, 2])), function(i) join(texmat, dt[[i]], by = "land", type = "left")$value)
colnames(mt) = unique(data[, 2])
rownames(mt) = sort(unique(data[, 1]))
mt[is.na(mt)] = 0
rcamat=values(mt, category_row = FALSE, norm = "rca",filter = 1, binary = TRUE)
rcamat[is.na(rcamat)] = 0
tmat = rcamat[rowSums(rcamat) != 0, , drop = TRUE]
mat = t(tmat)
}
It looks like you're almost there with the for loop. You just need to add 2 concepts:
1) Creating a vector of the file names to read at the start. A construction like:
filenames <- paste0('H0_',1995:2016,'.dta')
filenames <- c(filenames,paste0('S2_final_',1962:2016,'.dta'))
that creates a vector of the files you want to read will allow you to replace file.choose with something like the following (inside the loop):
dat <- read.dta13(paste0('/path/to/directory/with/files/',filenames[j]))
This way you can grab a new file with each loop iteration.
2) Storing the output matrices at the end of the loop. You can do this either by putting them all in a list, or by using assign to create a collection of objects. I prefer the list approach:
#before the for loop initialize an empty list:
mats <- list()
#at the end of the loop, (after mat = t(tmat) but before the close bracket) add this line to add it to the list
mats[[j]] <- mat
This will create a list mats with mats[[1]] holding the first matrix, mats[[2]] holding the second, and so on.
You could alternatively create a bunch of objects like so:
#at the end of the for loop add
assign(paste0('mat_',j),mat)
Which will create mat_1, mat_2, and so on as separate objects. A full implementation would look something like this:
library(readstata13)
library(diverse)
library(plyr)
setwd('/path/to/files/')
filenames <- paste0('H0_',1995:2016,'.dta')
filenames <- c(filenames,paste0('S2_final_',1962:2016,'.dta'))
#you'll have to prune this to the files you actually want, as this list is more than 45
finished_matrices <- NULL
for (j in 1:45) {
dat <- read.dta13(filenames[j]) # pick up the j-th file (the inner loop reuses i, so index by j)
data = aggregate(dat$export_value, by = list(dat$exporter,dat$commoditycode), FUN = sum)
colnames(data) = c("land","product","value")
dt = split(data, f = data$product)
land = as.data.frame(sort(unique(data[, 1])))
nds = seq(1, nrow(land), by = 1)
texmat = cbind(nds, land)
colnames(texmat) = c("num", "land")
for (i in 1:length(unique(data[, 2]))) {
(join(texmat, dt[[i]], by = "land", type = "left")$value)
}
mt = sapply(1:length(unique(data[, 2])), function(i) join(texmat, dt[[i]], by = "land", type = "left")$value)
colnames(mt) = unique(data[, 2])
rownames(mt) = sort(unique(data[, 1]))
mt[is.na(mt)] = 0
rcamat=values(mt, category_row = FALSE, norm = "rca",filter = 1, binary = TRUE)
rcamat[is.na(rcamat)] = 0
tmat = rcamat[rowSums(rcamat) != 0, , drop = TRUE]
mat = t(tmat)
finished_matrices[[j]] <- mat
}
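The read-then-store pattern can be tried without the Stata files. The sketch below is a stand-in under stated assumptions: it writes three small CSV files (hypothetical names demo_2001.csv and so on) to a temporary directory, then reads them back in a loop indexed by j and keeps one matrix per file.

```r
# Hypothetical setup: three small CSV files in a temporary directory
tmp_dir <- file.path(tempdir(), "loop_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (yr in 2001:2003) {
  write.csv(data.frame(land = c("A", "B"), value = c(yr, yr + 1)),
            file.path(tmp_dir, paste0("demo_", yr, ".csv")),
            row.names = FALSE)
}

# Collect the file names once, before the loop
filenames <- list.files(tmp_dir, pattern = "^demo_.*\\.csv$", full.names = TRUE)

# One list slot per file, filled by the outer index j
mats <- vector("list", length(filenames))
for (j in seq_along(filenames)) {
  dat <- read.csv(filenames[j])
  mats[[j]] <- as.matrix(dat["value"])
}
names(mats) <- basename(filenames)
names(mats)
```

Naming the list elements after the files makes it easier to tell later which matrix came from which year.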
My problem is similar to the following question: the problem of R-input Format.
I tried the code from that link and revised parts of it to suit my data. My data looks like this:
I want my data to be turned into a data frame with 4 variable vectors. The code as I revised it is:
formatMhsmm <- function(data){
nb.sequences = nrow(data)
nb.variables = ncol(data)
data_df <- data.frame(matrix(unlist(data), ncol = 4, byrow = TRUE))
# iterate over these in loops
rows <- 1: nb.sequences
# build vector with id value
id = numeric(length = nb.sequences)
for( i in rows)
{
id[i] = data_df[i,2]
}
# build vector with time value
time = numeric (length = nb.sequences)
for( i in rows)
{
time[i] = data_df[i,3]
}
# build vector with observation values
sequences = numeric(length = nb.sequences)
for(i in rows)
{
sequences[i] = data_df[i, 4]
}
data.df = data.frame(id,time,sequences)
# creation of hsmm data object need for training
N <- as.numeric(table(data.df$id))
train <- list(x = data.df$sequences, N = N)
class(train) <- "hsmm.data"
return(train)
}
library(mhsmm)
dataset <- read.csv("location.csv", header = TRUE)
train <- formatMhsmm(dataset)
print(train)
The output observations are not the data from the 4th column; instead I get a list like (4, 8, 12, ..., 396, 1, 1, ..., 56, 192, ..., 6550, 68, NA, NA, ...). It has picked up a quarter of the data from each column. Why is that?
Thank you very much!!!!
Why don't you simply count your observations by id, and create the hsmm.data object directly? Supposing your data frame is called "data", we have:
N <- as.numeric(table(data$id))
train <- list(x=data$location, N = N)
class(train) <- "hsmm.data"
Extracted from http://www.jstatsoft.org/v39/i04/paper
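As for why the original function returned every fourth value: unlist() on a data frame concatenates its columns one after another, and matrix(..., byrow = TRUE) then deals that long vector out across rows, so the fourth column ends up with elements 4, 8, 12, ... of the flattened data. A small sketch on made-up data (the toy data frame and its values are assumptions, not the real location.csv):

```r
# Toy stand-in for the location data (values are made up for illustration)
toy <- data.frame(
  n        = 1:8,
  id       = rep(c(1, 2), each = 4),
  time     = rep(1:4, times = 2),
  location = c(5, 6, 7, 8, 9, 10, 11, 12)
)

# unlist() concatenates the columns one after another ...
flat <- unlist(toy, use.names = FALSE)
# ... and byrow = TRUE deals that vector out across the rows,
# so column 4 receives elements 4, 8, 12, ... of the flattened data
scrambled <- matrix(flat, ncol = 4, byrow = TRUE)
scrambled[, 4]

# Building the object straight from the columns avoids the reshaping
N <- as.numeric(table(toy$id))
train <- list(x = toy$location, N = N)
class(train) <- "hsmm.data"
```

Note that table() reports counts in sorted id order, so this assumes the rows are already grouped by id in that order.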
I just discovered the power of plyr (frequency table with several variables in R),
and I am still struggling to understand how it works. I hope someone here can help me.
I would like to create a table (data frame) in which I can combine frequencies and summary stats but without hard-coding the values.
Here is an example dataset:
require(datasets)
d1 <- sleep
# I classify the variable extra to calculate the frequencies
extraClassified <- cut(d1$extra, breaks = 3, labels = c('low', 'medium', 'high') )
d1 <- data.frame(d1, extraClassified)
The results I am looking for should look like that :
require(plyr)
ddply(d1, "group", summarise,
All = length(ID),
nLow = sum(extraClassified == "low"),
nMedium = sum(extraClassified == "medium"),
nHigh = sum(extraClassified == "high"),
PctLow = round(sum(extraClassified == "low")/ length(ID), digits = 1),
PctMedium = round(sum(extraClassified == "medium")/ length(ID), digits = 1),
PctHigh = round(sum(extraClassified == "high")/ length(ID), digits = 1),
xmean = round(mean(extra), digits = 1),
xsd = round(sd(extra), digits = 1))
My question: how can I do this without hard-coding the values?
For the record:
I tried this code, but it does not work
ddply (d1, "group",
function(i) c(table(i$extraClassified),
prop.table(as.character(i$extraClassified))),
)
Thanks in advance
Here's an example to get you started:
foo <- function(x,colfac,colval){
tbl <- table(x[,colfac])
res <- cbind(n = nrow(x),t(tbl),t(prop.table(tbl)))
colnames(res)[5:7] <- paste(colnames(res)[5:7],"Pct",sep = "")
res <- as.data.frame(res)
res$mn <- mean(x[,colval])
res$sd <- sd(x[,colval])
res
}
ddply(d1,.(group),foo,colfac = "extraClassified",colval = "extra")
Don't take anything in that function foo as gospel. I just wrote that off the top of my head. Surely improvements/modifications are possible, but at least it's something to start with.
Thanks to Joran.
I slightly modified your function to make it more generic (without reference to the position of the variables).
require(plyr)
foo <- function(x,colfac,colval)
{
# table with frequencies
tbl <- table(x[,colfac])
# table with percentages
tblpct <- t(prop.table(tbl))
colnames( tblpct) <- paste(colnames(t(tbl)), 'Pct', sep = '')
# put the first part together
res <- cbind(n = nrow(x), t(tbl), tblpct)
res <- as.data.frame(res)
# add summary statistics
res$mn <- mean(x[,colval])
res$sd <- sd(x[,colval])
res
}
ddply(d1,.(group),foo,colfac = "extraClassified",colval = "extra")
and it works !!!
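For comparison, a base-R sketch of the same split-and-summarise idea, without plyr. This swaps ddply for split() plus lapply() and is not the answer's method; the helper name summarise_group is made up for the example. The level names are taken from the factor itself, so nothing is hard-coded.

```r
d1 <- sleep
d1$extraClassified <- cut(d1$extra, breaks = 3,
                          labels = c("low", "medium", "high"))

# Hypothetical helper: one row of counts, proportions and summary stats
summarise_group <- function(x, colfac, colval) {
  tbl    <- table(x[[colfac]])
  counts <- setNames(as.vector(tbl), names(tbl))
  pcts   <- setNames(round(as.vector(prop.table(tbl)), 1),
                     paste0(names(tbl), "Pct"))
  vals <- c(n = nrow(x), counts, pcts,
            mn = round(mean(x[[colval]]), 1),
            sd = round(sd(x[[colval]]), 1))
  as.data.frame(as.list(vals))
}

# Split by group, summarise each piece, and stack the rows
parts <- split(d1, d1$group)
res <- do.call(rbind, lapply(parts, summarise_group,
                             colfac = "extraClassified", colval = "extra"))
res
```

The row names of res are the group levels, which plays the role of the group column in the ddply output.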
P.S : I still do not understand what (group) stands for but