Order of column in R - r

I want to get the position (index) of a column in a data frame.
df <- data.frame(item = rep(c('a','b','c'), 3),
year = rep(c('2010','2011','2012'), each=3),
count = c(1,4,6,3,8,3,5,7,9))
Let's say the function I am looking for is columnorder. I want to have this result:
x <- columnorder(df$count)
x
> 3
x <- columnorder(df$item)
x
> 1
It seems like a basic task but I couldn't find the answer until now. I will appreciate your help. Thank you

You said,
It seems like a basic task but I couldn't find the answer until now.
In the general sense, what you are trying to do -- translate a column name into a column index -- is basic, and a pretty common question. However, the particular scenario you describe above, where your input is of the form object_name$column_name, is atypical with respect to what you are trying to achieve, which is most likely why you haven't found an existing solution.
In short, the problem is that when you pass an argument as df$count, you may as well just have used c(1,4,6,3,8,3,5,7,9) instead, because df$count will be evaluated as c(1,4,6,3,8,3,5,7,9). Of course, R does allow for a fair bit of metaprogramming, so with a little extra work, this could be implemented as, for example
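column_order <- function(expr) {
  # Recover the unevaluated argument as text, e.g. "df$item", and split it at the "$"
  x <- strsplit(deparse(substitute(expr)), "$", TRUE)[[1]]
  # Look the data frame up by name in the caller's environment, then match the column name
  match(x[2], names(get(x[1], envir = parent.frame())))
}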
column_order(df$item)
#[1] 1
column_order(df$year)
#[1] 2
column_order(df$count)
#[1] 3
But as I said above, this is an atypical interface for what you are ultimately trying to do. A much more standard approach would be for this function to accept the column name (typically as a string) and the target object as arguments, in which case the solution is much simpler:
column_order2 <- function(col, obj) match(col, names(obj))
column_order2("item", df)
#[1] 1
column_order2("year", df)
#[1] 2
column_order2("count", df)
#[1] 3

As proposed in the comments by @mtoto, here is one solution:
x <- which(colnames(df) == "count")
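The match() and which() approaches agree for a column that exists, but they behave differently when the name is absent, which may matter if the result is used for indexing. A minimal sketch using the df from the question ("price" is a hypothetical column name that is deliberately missing):

```r
df <- data.frame(item = rep(c("a", "b", "c"), 3),
                 year = rep(c("2010", "2011", "2012"), each = 3),
                 count = c(1, 4, 6, 3, 8, 3, 5, 7, 9))

# Both return the 1-based position of an existing column
idx_match <- match("count", names(df))
idx_which <- which(colnames(df) == "count")

# For a column that does not exist, match() gives NA and which() gives integer(0)
missing_match <- match("price", names(df))
missing_which <- which(colnames(df) == "price")

# match() is vectorised over the names being looked up
several <- match(c("count", "item"), names(df))
```

Which behaviour is preferable depends on how the caller wants to handle a missing column.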

Related

how to emulate parameters passed by reference [closed]

Is there a way?
NB: The question is not whether it is right, good, or sensible to do such a thing.
The question is whether there is a way. So if your answer would be
"Why would you want to do that?", "R uses functions; what you want was once called a procedure, and good R usage/style does not ...", or "Could you explain better... provide some code", please do NOT answer.
I did a quick try that eventually worked, using environments, more or less:
function(mydf) {
varName <- deparse(substitute(mydf))
...
assign(varName,mydf,envir=parent.frame(n = 1))
}
1) Wrap the function body in eval.parent(substitute({...})) like this:
f <- function(x) eval.parent(substitute({
x <- x + 1
}))
mydf <- data.frame(z = 1)
f(mydf)
mydf
## z
## 1 2
Also see the defmacro function in gtools and the wrapr package.
2) An alternative might be to use a replacement function:
"incr<-" <- function(x, value) {
x + value
}
mydf <- data.frame(z = 1)
incr(mydf) <- 1
mydf
## z
## 1 2
3) or just overwrite the input:
f2 <- function(x) x + 1
mydf <- data.frame(z = 1)
mydf <- f2(mydf)
mydf
## z
## 1 2
If the problem is that there are multiple outputs then use list in the gsubfn package. This is used on the left hand side of an assignment with square brackets as shown. See help(list, gsubfn)
library(gsubfn)
f3 <- function(x, y) list(x + 1, y + 2)
mydf <- mydf2 <- data.frame(z = 1)
list[mydf, mydf2] <- f3(mydf, mydf2)
mydf
## z
## 1 2
mydf2
## z
## 1 3
At least for my specific/limited needs, I found a solution:
myVar = 11
myF <- function(x) {
varName <- deparse(substitute(x))
# print(paste("var name is", varName))
x = 99
assign(varName,x,envir=parent.frame(n = 1))
NA # sorry this is not a function
# in real life sometimes you also need procedures
}
myF(myVar)
print(myVar)
# [1] 99
I think there is no way to emulate call-by-reference. However, several tricks can be used from case to case:
globals: It is, of course, possible to have a global variable instead of the parameter. This can be written from within a function using <<- instead of = or <-. In this way, many cases of needing call-by-reference vanish.
However, this is not compatible with parallelization and also not compatible with recursion.
When you need recursion, you can do very much the same and have a global stack. Before the recursive call, you have to append to this stack and as the first line of your function, you can get the index (similar to a stack pointer in CPUs) in order to write to the global stack.
Neither approach is encouraged; both should be used as a last resort or for education. If you really can't avoid call-by-reference, go to C++ with Rcpp and write a C++ function that does your heavy lifting. If needed, it can actually call R functions. Look at some Rcpp tutorials; most of them cover this case.
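As a rough sketch of the two tricks described above, here is what a global written with <<- and a global stack for recursion might look like. All names (counter, bump, result_stack, record_depth) are made up for illustration, and this is not meant as a recommended pattern:

```r
# Trick 1: a global variable written from inside a function with <<-
counter <- 0
bump <- function() {
  counter <<- counter + 1   # modifies the variable defined outside the function
  invisible(NULL)
}
bump()
bump()

# Trick 2: a global stack for recursive functions
result_stack <- list()
record_depth <- function(depth, max_depth) {
  idx <- length(result_stack) + 1    # next free slot, acting like a stack pointer
  result_stack[[idx]] <<- depth      # write into the global stack
  if (depth < max_depth) record_depth(depth + 1, max_depth)
  invisible(NULL)
}
record_depth(1, 3)
```

After these calls, counter has been incremented twice and result_stack holds one entry per recursion level.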

R- Please help. Having trouble writing for loop to lag date

I am attempting to write a for loop which will take subsets of a dataframe by person id and then lag the EXAMDATE variable by one for comparison. So a given row will have the original EXAMDATE and also a variable EXAMDATE_LAG which will contain the value of the EXAMDATE one row before it.
for (i in length(uniquerid))
{
temp <- subset(part2test, RID==uniquerid[i])
temp$EXAMDATE_LAG <- temp$EXAMDATE
temp2 <- data.frame(lag(temp, -1, na.pad=TRUE))
temp3 <- data.frame(cbind(temp,temp2))
}
It seems that I am creating the new variable just fine, but I know that the lag won't work properly because I am missing steps. Perhaps I have also misunderstood other people's examples of how to use the lag function?
So that this can be fully answered: there are a handful of things wrong with your code. Lucaino has pointed one out. Each time through your loop you create (or overwrite) temp, temp2, and temp3, so you are left with only the output of the last pass. Note also that for (i in length(uniquerid)) visits only the single value length(uniquerid); visiting every id would need seq_along(uniquerid).
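However, this isn't something that needs a loop. Instead you can make use of the vectorized nature of R. To get the previous value at each position, drop the last element and put an NA in front:
> x <- 1:10
> c(NA, x[-length(x)])
 [1] NA  1  2  3  4  5  6  7  8  9
So if you combine that notion with a library like plyr that splits data nicely you should have a workable solution. If I've missed something or this doesn't solve your problem, please provide a reproducible example.
library(plyr)
# Shift values down by one position; indexing with NA keeps the class (e.g. Date) intact
myLag <- function(x) {
  x[c(NA, seq_along(x)[-length(x)])]
}
# Group by the person id column RID and add the lagged date within each group
ddply(part2test, .(RID), transform, EXAMDATE_LAG = myLag(EXAMDATE))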
You could also do this in base R using split or the data.table package using its by= argument.
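A minimal base R sketch of the split approach, assuming the data has the columns RID and EXAMDATE from the question. The sample values and the helper name lag1 are invented for illustration, and the lag here takes the previous row's value within each RID:

```r
# Hypothetical sample data with the column names used in the question
part2test <- data.frame(
  RID = c(1, 1, 1, 2, 2),
  EXAMDATE = as.Date(c("2020-01-01", "2020-06-01", "2021-01-01",
                       "2019-03-01", "2019-09-01"))
)

# Previous value at each position; indexing with NA preserves the Date class
lag1 <- function(x) x[c(NA, seq_along(x)[-length(x)])]

# Split by person id, add the lagged column to each piece, then recombine
pieces <- lapply(split(part2test, part2test$RID), function(d) {
  d$EXAMDATE_LAG <- lag1(d$EXAMDATE)
  d
})
result <- do.call(rbind, pieces)
```

The first row of each RID gets NA because there is no earlier exam for that person.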

Assigning output of a function to two variables in R [duplicate]

Possible Duplicate:
function with multiple outputs
This seems like an easy question, but I can't figure it out and I haven't had luck in the R manuals I've looked at. I want to find dim(x), but I want to assign dim(x)[1] to a and dim(x)[2] to b in a single line.
I've tried [a b] <- dim(x) and c(a, b) <- dim(x), but neither has worked. Is there a one-line way to do this? It seems like a very basic thing that should be easy to handle.
This may not be as simple a solution as you had wanted, but it gets the job done. It's also a handy tool for the future, should you need to assign multiple variables at once (and you don't know how many values you have).
Output <- SomeFunction(x)
VariablesList <- letters[seq_along(Output)]
# seq_along() is safe even when Output is empty
for (i in seq_along(Output)) {
  assign(VariablesList[i], Output[i])
}
Loops aren't the most efficient things in R, but I've used this multiple times. I personally find it especially useful when gathering information from a folder with an unknown number of entries.
EDIT: And in this case, Output could be any length (as long as VariablesList is longer).
EDIT #2: Changed up the VariablesList vector to allow for more values, as Liz suggested.
You can also write your own function that will always make a global a and b. But this isn't advisable:
mydim <- function(x) {
out <- dim(x)
a <<- out[1]
b <<- out[2]
}
The "R" way to do this is to output the results as a list or vector just like the built in function does and access them as needed:
out <- dim(x)
out[1]
out[2]
R has excellent list and vector comprehension that many other languages lack and thus doesn't have this multiple assignment feature. Instead it has a rich set of functions to reach into complex data structures without looping constructs.
Doesn't look like there is a way to do this. Really the only way to deal with it is to add a couple of extra lines:
temp <- dim(x)
a <- temp[1]
b <- temp[2]
It depends what is in a and b. If they are just numbers try to return a vector like this:
dim <- function(x,y)
return(c(x,y))
dim(1,2)[1]
# [1] 1
dim(1,2)[2]
# [1] 2
If a and b are something else, you might want to return a list
dim <- function(x,y)
return(list(item1=x:y,item2=(2*x):(2*y)))
dim(1,2)[[1]]
# [1] 1 2
dim(1,2)[[2]]
# [1] 2 3 4
EDIT:
try this: x <- c(1,2); names(x) <- c("a","b")
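Building on the idea of naming the elements, one possible sketch is to name the result of dim() and then either read the values by name or, if separate variables are really wanted, create them in one line with list2env(). The matrix x and the name d are made up for illustration:

```r
x <- matrix(0, nrow = 3, ncol = 4)

# Name the two dimensions, then access them by name
d <- setNames(dim(x), c("a", "b"))
rows <- d[["a"]]
cols <- d[["b"]]

# One-line variant: create variables a and b in the current environment
list2env(as.list(d), envir = environment())
```

The named vector is usually enough. The list2env() call is closer to the one-line assignment asked for, at the cost of creating variables as a side effect.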

Access variable value where the name of variable is stored in a string

Similar questions have been raised for other languages: C, sql, java, etc.
But I'm trying to do this in R.
I have:
ret_series <- c(1, 2, 3)
x <- "ret_series"
How do I get (1, 2, 3) by calling some function / manipulation on x, without direct mentioning of ret_series?
You provided the answer in your question. Try get.
> get(x)
[1] 1 2 3
For a one-off use, the get function works (as has been mentioned), but it does not scale well to larger projects. It is better to store your data in lists or environments, then use [[ to access the individual elements:
mydata <- list( ret_series=c(1,2,3) )
x <- 'ret_series'
mydata[[x]]
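The same idea works with an environment instead of a list. A small sketch, where store and vol_series are hypothetical names added for illustration:

```r
# An environment used as a container for related series
store <- new.env()
assign("ret_series", c(1, 2, 3), envir = store)
assign("vol_series", c(0.1, 0.2, 0.3), envir = store)

x <- "ret_series"
value <- store[[x]]               # same [[ syntax as for a list
also  <- get(x, envir = store)    # get() restricted to this environment

# Loop over everything stored, without naming any variable directly
totals <- vapply(sort(ls(store)), function(nm) sum(store[[nm]]), numeric(1))
```

Keeping the objects in one container makes it easy to iterate over them by name, which is where bare get() calls tend to become awkward.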
What's wrong with either of the following?
eval(as.name(x))
eval(as.symbol(x))
Note that some of the examples above wouldn't work for a data.frame.
For instance, given
x <- data.frame(a=seq(1,5))
get("x$a") would not give you x$a.
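One way around this, sketched here with the hypothetical variables df_name and col_name, is to look up the data frame by name first and then index the column with [[:

```r
x <- data.frame(a = seq(1, 5))

df_name  <- "x"
col_name <- "a"

# get() looks up a single object name; "x$a" is not the name of any object
found <- exists("x$a")

# Fetch the data frame by name, then extract the column by name
values <- get(df_name)[[col_name]]
```

Here found is FALSE, while values holds the contents of x$a.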
