I want to use iteration to turn the entries in a list into a 2x2 matrix, and then assign the same column and row names to these tables, as well as integer values for the matrix cells.
For example's sake, let's pretend this is the list with the entries whose names I want to turn into matrices:
cnames <- c("Honda", "Toyota", "Nissan")
Creating the tables themselves seems to work fine with the assign function:
for (i in 1:length(cnames)){
assign(paste(cnames[i],"table",sep="_"), matrix(,nrow=2,ncol=2))
}
Which when I type, for instance:
> Honda_table
...returns:
     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
But if in the original iterative function I try to assign column names, like such:
for (i in 1:length(cnames)){
assign(paste(cnames[i],"table",sep="_"), matrix(,nrow=2,ncol=2))
colnames(paste(cnames[i],"table",sep="_")) <- c("A","B")
}
...I get this error instead:
Error : attempt to set 'colnames' on an object with less than two dimensions
I don't understand why this is coming up, since after using the original assign function, if I look up the dimensions of any of the tables, such as:
> dim(Honda_table)
...I get:
[1] 2 2
Which indicates it is a 2x2 dimensional object.
Moreover, I cannot assign pre-set values to the matrix cells, like so:
for (i in 1:length(cnames)){
assign(paste(cnames[i],"table",sep="_"), matrix(,nrow=2,ncol=2))
paste(cnames[i],"table",sep="_")[1,1] = 1
}
...without getting this error:
Error : incorrect number of subscripts on matrix
What is going on here?
Thanks.
I am not sure this is the best or most elegant way, but it seems to work:
for (i in 1:length(cnames)){
tab<- matrix(,nrow=2,ncol=2)
colnames(tab)<- c("A","B")
assign(paste(cnames[i],"table",sep="_"), tab)
}
rm(tab)
After many suggestions I ended up scrapping the assign function and simply created a vector of tables instead.
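For anyone landing here, a minimal sketch of that list-based approach follows. It assumes the "A"/"B" names from the question, and make_table is a hypothetical helper rather than anything from the original code. The earlier attempts failed because paste() only builds a character string, so colnames() and [1,1] were being applied to that string and not to the matrix it names.

```r
cnames <- c("Honda", "Toyota", "Nissan")

# Hypothetical helper: one 2x2 matrix with row/column names and a preset cell
make_table <- function() {
  m <- matrix(NA_integer_, nrow = 2, ncol = 2,
              dimnames = list(c("A", "B"), c("A", "B")))
  m[1, 1] <- 1L
  m
}

# One matrix per name, stored in a named list instead of separate variables
tables <- setNames(lapply(cnames, function(nm) make_table()), cnames)

tables[["Honda"]]
```

Individual tables are then reached with tables[["Honda"]] or tables$Honda, and the whole collection can be processed with lapply().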
Just used "str_match_all" as follows:
a <- str_match_all(dd, '\\d+(\\w+)')
and obtained the following:
#[[1]]
#     [,1]      [,2]
#[1,] "12hours" "hours"
#[2,] "23days"  "days"
How can I access each string?
I have tried a[1][,1] to access the first column for example but I get an error saying the number of dimensions is not correct.
If I understand your problem correctly, you are having trouble accessing each individual element.
I think you have to remember that your output is a list and the element in that list is a matrix. Therefore to access each individual element you first have to invoke which element of the list you are interested in and then the row and then the column.
a[[1]][1,2]
So in your case, this will access the first element in your list (looks like you only have 1), and then the 1st row and then the 2nd column so it will give you, "hours".
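A small sketch of the difference, using a hand-built list as a hypothetical stand-in for the str_match_all() result (the real object would come from stringr):

```r
# Hypothetical stand-in for the str_match_all() output shown in the question
a <- list(matrix(c("12hours", "23days", "hours", "days"), nrow = 2))

sub_list <- a[1]     # single brackets: still a list, so [, 1] fails on it
mat      <- a[[1]]   # double brackets: the matrix itself

first_col <- a[[1]][, 1]   # whole first column (the full matches)
unit      <- a[[1]][1, 2]  # row 1, column 2 (the captured group)
```

This is why a[1][,1] complains about dimensions: a[1] is a one-element list, not a matrix.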
If however, you're more used to working with dataframes as I assume that is your end goal, I would approach this programmatically as follows:
Taking an example from the str_match_all() documentation
# Creating reproducible example
library(stringr)
strings <- c("Home: 219 733 8965. Work: 229-293-8753 ",
             "banana pear apple", "595 794 7569 / 387 287 6718")
phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})"
a <- str_match_all(strings, phone)
Your goal is to convert the matrix into a data frame, which you do as follows:
as.data.frame(a[[1]])
For future reference, let's say your output has more than one element, as is the case in this example. You should then approach the solution like so:
# Load the packages that provide bind_rows() and map_df()
library(dplyr)
library(purrr)
# Make a function that accepts one element of your list.
# Copy and paste the step before and then add an extra step using dplyr::bind_rows()
output_to_df <- function(x){
  a <- as.data.frame(x)
  bind_rows(a)
}
# Using this function we will then use map_df()
# so that we can apply our premade function on all elements
# of our list no matter how many elements it contains
str_output <- map_df(a, output_to_df)
You can now reuse your output_to_df() function as many times as you need.
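If you would rather not depend on dplyr and purrr, the same idea can be sketched in base R. The list below is a hypothetical stand-in for a str_match_all() result with two elements, and the column names V1 and V2 are simply what as.data.frame() assigns to an unnamed matrix.

```r
# Hypothetical stand-in for a str_match_all() result with two list elements
a <- list(
  matrix(c("12hours", "23days", "hours", "days"), nrow = 2),
  matrix(c("5weeks", "weeks"), nrow = 1)
)

# Convert each matrix to a data frame, then stack them row-wise
dfs <- lapply(a, as.data.frame)
str_output <- do.call(rbind, dfs)
```

This assumes every matrix has the same number of columns, which holds when they all come from the same pattern.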
I have multiple dataframes and would like to remove the first row in all of them.
I have tried using a for loop but cannot understand what I am doing wrong
for (i in cities){
i <- i[-1, ]
}
I get the following error code:
Error in i[-1, ] : incorrect number of dimensions
If we assume that the only objects in your workspace are dataframes then this might succeed:
cities <- objects()
for (i in cities) { assign(i, get(i)[-1,])}
Explanation:
Two things were wrong with the original code:
One was already mentioned in the comments: "df" is not the same as df. You need to use get to convert a character value to a "true" R name that is used to retrieve the object having that name. The result of objects() is only a character vector. In R the term "name" means a "language object"; see the help page ?mode. (There is potential confusion with row names and column names, which are always of class "character".) This is unlike SAS, which is a macro language that has no such distinction.
The second error was trying to get substitution for the i on the left-hand side of <-. That would have failed even if you were working with actual R names. The assign function is designed to handle character values that are then converted to R names.
Say you get a list of all the tables in your environment, and you call that list cities. You can't just iterate over each value of cities and change things, because in the list they are just character strings.
Here is what you need:
for (i in cities){
tmp <- get(i) # load the actual table
tmp <- tmp[-1, ] # remove first row
assign(i, tmp) # re-assign table to original table name
}
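An alternative worth considering is to keep the data frames in a named list and avoid assign altogether. This is only a sketch: boston and denver are hypothetical data frames standing in for the real city tables.

```r
# Hypothetical data frames standing in for the city tables
boston <- data.frame(x = 1:3, y = c("a", "b", "c"))
denver <- data.frame(x = 4:6, y = c("d", "e", "f"))

cities <- c("boston", "denver")   # the names, as character strings

# mget() fetches the objects by name and returns them as a named list
city_list <- mget(cities)

# Drop the first row of each; drop = FALSE keeps the result a data frame
city_list <- lapply(city_list, function(df) df[-1, , drop = FALSE])
```

The trimmed tables are then available as city_list$boston and city_list$denver, and the originals are left untouched.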
I am currently in a statistics class working on multivariate clustering and classification. For our homework we are trying to use 10-fold cross-validation to test how accurate different classification methods are on a 6-variable data set with three classes. I was hoping I could get some help creating a for loop (or something better that I don't know about) to create and run 10 classifications and validations, so I don't have to repeat myself 10 times on everything. Here is what I have. It will run, but the first two matrices only show the first variable. Because of this, I have not been able to troubleshoot the other parts.
index<-sample(1:10,90,rep=TRUE)
table(index)
training=NULL
leave=NULL
Trfootball=NULL
football.pred=NULL
for(i in 1:10){
training[i]<-football[index!=i,]
leave[i]<-football[index==i,]
Trfootball[i]<-rpart(V1~., data=training[i], method="class")
football.pred[i]<- predict(Trfootball[i], leave[i], type="class")
table(Actual=leave[i]$"V1", classfied=football.pred[i])}
Removing the "[i]" and replacing them with 1:10 individually works right now....
Your problem lies in the assignment of a data.frame or matrix to a vector that you initially set as NULL (training and leave). A way to think about it is that you are trying to squeeze a whole matrix into an element that can only take a single number. That's why R has a problem with your code. You need to initialise training and leave to something that can handle your iterative accumulation of values (the R object list, as @akrun points out).
The following example should give you a feel for what is happening and what you can do to fix your problem:
a<-NULL # your set up at the moment
print(a) # NULL as expected
# your football data is either data.frame or matrix
# try assigning those objects to the first element of a:
a[1]<-data.frame(1:10,11:20) # no good
a[1]<-matrix(1:10,nrow=2) # no good either
print(a)
## create "a" upfront, instead of an empty object
# what you need:
a<-vector(mode="list",length=10)
print(a) # empty list with 10 locations
## to assign and extract elements out of a list, use the "[[" double brackets
a[[1]]<-data.frame(1:10,11:20)
#access data.frame in "a"
a[1] ## no good
a[[1]] ## what you need to extract the first element of the list
## how does it look when you add an extra element?
a[[2]]<-matrix(1:10,nrow=2)
print(a)
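Applied to the cross-validation loop, the fix might look like the sketch below. The football data frame here is a made-up stand-in with the same shape (90 rows, a class column V1), and the seed is only there to make the sketch repeatable. The rpart fit and prediction are left out, but they would be stored the same way, with [[i]].

```r
set.seed(1)  # assumption: fixed seed so the sketch is repeatable

# Hypothetical stand-in for the football data: 90 rows, class column V1
football <- data.frame(V1 = factor(rep(c("a", "b", "c"), each = 30)),
                       V2 = rnorm(90),
                       V3 = rnorm(90))

index <- sample(1:10, 90, replace = TRUE)

# Pre-allocate lists so each slot can hold a whole data frame
training <- vector(mode = "list", length = 10)
leave    <- vector(mode = "list", length = 10)

for (i in 1:10) {
  training[[i]] <- football[index != i, ]   # rows outside fold i
  leave[[i]]    <- football[index == i, ]   # rows held out for fold i
  # the model fit and prediction for fold i would go here,
  # each stored in its own list with [[i]]
}
```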
I am trying to use apply() to fill in an additional column in a dataframe by calling a function I created on each row of the data frame.
The dataframe is called Hit.Data and has 2 columns, Zip.Code and Hits. Here are a few rows:
Zip.Code , Hits
97222 , 20
10100 , 35
87700 , 23
The apply code is the following:
Hit.Data$Zone = apply(Hit.Data, 1, function(x) lookupZone("89000", x["Zip.Code"]))
The lookupZone() function is the following:
lookupZone <- function(sourceZip, destZip){
sourceKey = substr(sourceZip, 1, 3)
destKey = substr(destZip, 1, 3)
return(zipToZipZoneMap[[sourceKey]][[destKey]])
}
All the lookupZone() function does is take the 2 strings, truncates to the required characters and looks up the values. What happens when I run this code though is that R assigns a list to Hit.Data$Zone instead of filling in data row by row.
> typeof(Hit.Data$Zone)
[1] "list"
What baffles me is that when I use apply and just tell it to put a number in it works correctly:
> Hit.Data$Zone = apply(Hit.Data, 1, function(x) 2)
> typeof(Hit.Data$Zone)
[1] "double"
I know R has a lot of strange behavior around dropping dimensions of matrices and doing odd things with lists but this looks like it should be pretty straightforward. What am I missing? I feel like there is something fundamental about R I am fighting, and so far it is winning.
Your problem is that you are occasionally looking up non-existing entries in your hashmap, which causes hash to silently return NULL. Consider:
> hash("890", hash("972"=3, "101"=3, "877"=3))[["890"]][["101"]]
[1] 3
> hash("890", hash("972"=3, "101"=3, "877"=3))[["890"]][["100"]]
NULL
If apply encounters any NULL values, then it can't coerce the result to a vector, so it will return a list. Same will happen with sapply.
You have to ensure that all possible combinations of the first three zip code digits in your data are present in your hash, or you need logic in your code to return NA instead of NULL for missing entries.
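A minimal sketch of the second option, returning NA for missing entries. To keep it self-contained it uses plain nested lists as a hypothetical stand-in for the hash-based zipToZipZoneMap. That is an assumption about the data, but nested lists also return NULL for a missing name, so the check is the same.

```r
# Hypothetical stand-in for zipToZipZoneMap, built from plain nested lists
zipToZipZoneMap <- list("890" = list("972" = 3, "101" = 3, "877" = 3))

lookupZone <- function(sourceZip, destZip) {
  sourceKey <- substr(sourceZip, 1, 3)
  destKey   <- substr(destZip, 1, 3)
  zone <- zipToZipZoneMap[[sourceKey]][[destKey]]
  # A missing entry comes back as NULL; turn it into NA so the
  # result can still be simplified to a plain vector
  if (is.null(zone)) NA_real_ else zone
}

zips  <- c("97222", "10000", "87700")   # prefix "100" is not in the map
zones <- sapply(zips, function(z) lookupZone("89000", z), USE.NAMES = FALSE)
```

Because every call now returns a length-one numeric value, sapply can simplify the result to a numeric vector instead of falling back to a list.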
As others have said, it's hard to diagnose without knowing what zipToZipZoneMap contains, but you could try this:
Hit.Data$Zone <- sapply(Hit.Data$Zip.Code, function(x) lookupZone("89000", x))
I've created a list of matrices in R. In all matrices in the list, I'd like to "pull out" the collection of matrix elements of a particular index. I was thinking that the colon operator might allow me to implement this in one line. For example, here's an attempt to access the [1,1] elements of all matrices in a list:
myList = list() #list of matrices
myList[[1]] = matrix(1:9, nrow=3, ncol=3, byrow=TRUE) #arbitrary data
myList[[2]] = matrix(2:10, nrow=3, ncol=3, byrow=TRUE)
#I expected the following line to output myList[[1]][1,1], myList[[2]][1,1]
slice = myList[[1:2]][1,1] #prints error: "incorrect number of dimensions"
The final line of the above code throws the error "incorrect number of dimensions."
For reference, here's a working (but less elegant) implementation of what I'm trying to do:
#assume myList has already been created (see the code snippet above)
slice = c()
for(x in 1:2) {
slice = c(slice, myList[[x]][1,1])
}
#this works. slice = [1 2]
Does anyone know how to do the above operation in one line?
Note that my "list of matrices" could be replaced with something else. If someone can suggest an alternative "collection of matrices" data structure that allows me to perform the above operation, then this will be solved.
Perhaps this question is silly...I really would like to have a clean one-line implementation though.
Two things. First, the difference between [ and [[. The relevant sentence from ?'[':
The most important distinction between [, [[ and $ is that the [ can
select more than one element whereas the other two select a single
element.
So you probably want to do myList[1:2]. Second, you can't combine subsetting operations in the way you describe. Once you do myList[1:2] you will get a list of two matrices. A list typically has only one dimension, so doing myList[1:2][1,1] is nonsensical in your case. (See comments for exceptions.)
You might try lapply instead: lapply(myList,'[',1,1).
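To make the distinction concrete, here is a short sketch using the myList from the question. One detail worth knowing: with a vector index, [[ indexes recursively, so myList[[1:2]] means myList[[1]][[2]], a single number. That is why the following [1,1] reports an incorrect number of dimensions.

```r
myList <- list(matrix(1:9,  nrow = 3, ncol = 3, byrow = TRUE),
               matrix(2:10, nrow = 3, ncol = 3, byrow = TRUE))

sub_list <- myList[1:2]    # single brackets: still a list of two matrices
one_mat  <- myList[[1]]    # double brackets: a single matrix

# Vector index inside [[ ]] is recursive: same as myList[[1]][[2]]
recursive <- myList[[1:2]]

# Apply the `[` function to every matrix, passing 1, 1 as the indices
corner <- lapply(myList, `[`, 1, 1)
```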
If your matrices will all have the same dimensions, you could store them in a 3-dimensional array. That would certainly make indexing and extracting elements easier ...
## One way to get your data into an array
a <- array(c(myList[[1]], myList[[2]]), dim=c(3,3,2))
## Extract the slice containing the upper left element of each matrix
a[1,1,]
# [1] 1 2
This works:
> sapply(myList,"[",1,1)
[1] 1 2
Edit: oh, sorry, I see almost the same idea toward the end of an earlier answer. But sapply probably comes closer to what you want, anyway.
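A possible variant, if you want the one-liner to fail loudly when an element is not a single integer, is vapply, which takes a template for the expected result. This is a sketch on the question's data, not something the original answers proposed.

```r
myList <- list(matrix(1:9,  nrow = 3, ncol = 3, byrow = TRUE),
               matrix(2:10, nrow = 3, ncol = 3, byrow = TRUE))

# integer(1) declares that each call must return exactly one integer
slice <- vapply(myList, function(m) m[1, 1], integer(1))
```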