How can I translate the following R code to Julia? I am new to Julia.
I know Julia has different ways to replace a for loop.
max_trial <- max(dataframe[,1])
max_c1 <- NA
for(c in 1:max_trial){
c1 <- which(dataframe[,1]==c)
max_measure <- max(dataframe[c1,2])
max_c1[c] <- max_measure
}
As suggested, I tried the following translation:
max_c1= []
for c in 1:max_trial
c1 = findall(dataframe[:,1] .== c)
max_c1[c] = maximum(dataframe[c1,2])
end
But I received the following error
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]
Also, the values returned by maximum(dataframe[c1,2]) in this translation still differ from those of the R code. It seems the syntax for this part needs some adjustment.
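I think the corresponding Julia code would look like
max_trial = maximum(dataframe[:, 1])
# Preallocate the result; assumes column 1 holds integer trial numbers and column 2 is numeric
max_c1 = Vector{Float64}(undef, max_trial)
for c in 1:max_trial
    c1 = findall(dataframe[:, 1] .== c)    # rows belonging to trial c
    max_c1[c] = maximum(dataframe[c1, 2])  # largest measure in that trial
end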
although I think you did not give enough information to completely answer your question, so I'm not really sure. Maybe adding the data you used and the output you are looking for in your question would help?
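For reference, here is a small sketch of what the R loop computes, so that a Julia translation has something to be compared against. The data frame below is made up for illustration (the column names trial and measure are assumptions, not the asker's data), and the tapply line is just one grouped alternative to the loop.

```r
# Hypothetical sample data: column 1 is the trial number, column 2 the measure
dataframe <- data.frame(trial   = c(1, 1, 2, 3, 3),
                        measure = c(0.5, 2.0, 1.5, 4.0, 3.0))

max_trial <- max(dataframe[, 1])
max_c1 <- numeric(max_trial)           # preallocated instead of starting from NA
for (c in 1:max_trial) {
  c1 <- which(dataframe[, 1] == c)     # rows belonging to trial c
  max_c1[c] <- max(dataframe[c1, 2])   # largest measure within that trial
}

# The same per-trial maxima without an explicit loop
max_by_trial <- tapply(dataframe[, 2], dataframe[, 1], max)

print(max_c1)
print(max_by_trial)
```

Both results hold one maximum per trial, in trial order; the tapply result additionally carries the trial numbers as names.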
It's amazing that the internet is totally void of this simple question (or similar). Or I'm just very bad at searching. Anyway, I simply want to store values generated by a for-loop in an array and print the array. Simple as that.
In every other language (Matlab, R, Python, Java, etc.) this is very simple. But in Julia I seem to be missing something.
using JuMP
# t = int64[] has also been tested
t = 0
for i in 1:5
vector[i]
println[vector]
end
I get the error
ERROR: LoadError: BoundsError
What am I missing?
You didn't initialize vector, and println is a function, so it is called with parentheses, not square brackets. In Julia 1.0:
vector = Array{Int,1}(undef, 5)
for i in 1:5
vector[i] = i
println(vector[i])
end
Or, more concisely, with a comprehension:
vector = [i for i in 1:5]
for i in 1:5
println(vector[i])
end
Another possibility is to use push!:
vector = Int[]  # typed empty vector; a bare [] would create a Vector{Any}
for i in 1:5
push!(vector, i)
println(vector[i])
end
I have just started learning R from the O'Reilly book "Learning R". It says that the colon is used to create a sequence of numbers, while the c function is used to concatenate values into a vector. I tried the following in RStudio:
a = 1:5
and the following:
a = c(1:5)
And they both seemed to have the same effect. Even statements like the ones below had the same effect:
a = 1:5 + 1:5
a = c(1:5 + 1:5)
I tried checking their data types using class() and typeof(), and they are the same too. Maybe I do not understand the importance as I have just started, but can someone please explain why the c function needs to be used instead of just creating a sequence?
Thank you.
I'm sure this is kind of basic, but I'd just like to really understand the logic of R data structures here.
If I subset a matrix by index out of bounds, I get exactly that error:
m <- matrix(data = c("foo", "bar"), nrow = 1)
m[2,]
# Error in m[2, ] : subscript out of bounds
If I do the same to a data frame, however, I get a row of all NAs:
df <- data.frame(foo = "foo", bar = "bar")
df[2,]
#     foo  bar
# NA <NA> <NA>
If I subset into a non-existent data frame column I get the familiar
df[, 3]
# Error in `[.data.frame`(df, , 3) : undefined columns selected
I know (roughly) that data frame rows are weird and to be treated carefully, but I don't quite see the connection to the above behavior.
Can someone explain why R behaves in this way for non-existent df rows?
Update
To be sure, giving NA on out-of-bounds subsets is normal R behavior for 1D vectors:
vec <- c("foo", "bar")
vec[3]
# [1] NA
So in a way, the odd one out here is matrix subsetting, not data frame subsetting, depending on where you're starting out.
Still, the different 2D subsetting behavior (m[2, ] vs df[2, ]) might strike a dense user (as I am right now) as inconsistent.
Can someone explain why R behaves in this way[?]
Short answer: No, probably not.
Longer answer:
Once upon a time I was thinking about something similar and read this thread on R-devel: Definition of [[. Basically it boils down to:
The semantics of [ and [[ don't seem to be fully specified in the Reference manual. [...] I assume that these are features, not bugs, but I can't find documentation for them
Duncan Murdoch, a former member of the R core team, gives a very nice reply:
There is more documentation in the man page for Extract, but I think it is incomplete. The most complete documentation is of course the source code*, but it may not answer the question of what's intentional and what's accidental
As mentioned in the R-devel thread, the only description in the manual is 3.4.1 Indexing by vectors:
If i is positive and exceeds length(x) then the corresponding selection is NA
But, this applies to "indexing of simple vectors". Similar out of bounds indexing for "non-simple" vectors does not seem to be described. Duncan Murdoch again:
So what is a simple vector? That is not explicitly defined, and it probably should be.
Thus, it may seem like no one knows the answer to your why question.
See also "8.2.13 nonexistent value in subscript" in the excellent R Inferno by Patrick Burns, and the section "Missing/out of bounds indices" in Hadley's book.
*Source code for the [ subset operator. A search for R_MSG_subs_o_b (which corresponds to error message "subscript out of bounds") provides no obvious clue why OOB [ indexing of matrices and when using [[ give an error, whereas OOB [ indexing of "simple vectors" results in NA.
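The cases contrasted above can be put side by side in a short sketch. The helper try_index is a hypothetical name introduced only for this illustration; it returns either the value or the error message, so all three out-of-bounds lookups can be compared in one place.

```r
vec <- c("foo", "bar")
m <- matrix(data = vec, nrow = 1)

# Hypothetical helper: evaluate an indexing expression and return
# its value, or the error message if the expression fails
try_index <- function(expr) {
  tryCatch(expr, error = function(e) conditionMessage(e))
}

vec_single <- try_index(vec[3])     # `[` on a simple vector
vec_double <- try_index(vec[[3]])   # `[[` on the same vector
mat_single <- try_index(m[2, ])     # `[` on a matrix row

print(vec_single)
print(vec_double)
print(mat_single)
```

Only the single-bracket lookup on the plain vector yields NA; the other two stop with a "subscript out of bounds" error, which is the asymmetry the thread could not find documented.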
I am trying to write an R function that takes a data set and outputs the plot() function with the data set read in its environment. This means you don't have to use attach() anymore, which is good practice. Here's my example:
mydata <- data.frame(a = rnorm(100), b = rnorm(100,0,.2))
plot(mydata$a, mydata$b) # works just fine
scatter_plot <- function(ds) { # function I'm trying to create
ifelse(exists(deparse(quote(ds))),
function(x,y) plot(ds$x, ds$y),
sprintf("The dataset %s does not exist.", ds))
}
scatter_plot(mydata)(a, b) # not working
Here's the error I'm getting:
Error in rep(yes, length.out = length(ans)) :
attempt to replicate an object of type 'closure'
I tried several other versions, but they all give me the same error. What am I doing wrong?
EDIT: I realize the code is not too practical. My goal is to understand functional programming better. I wrote a similar macro in SAS, and I was just trying to write its counterpart in R, but I'm failing. I just picked this as an example. I think it's a pretty simple example and yet it's not working.
There are a few small issues. ifelse is a vectorized function, but you just need a simple if. In fact, you don't really need an else -- you could just throw an error immediately if the data set does not exist. Note that your error message is not using the name of the object, so it will create its own error.
You are passing a and b instead of "a" and "b". Instead of the ds$x syntax, you should use the ds[[x]] syntax when you are programming (fortunes::fortune(312)). If that's the way you want to call the function, then you'll have to deparse those arguments as well. Finally, I think you want deparse(substitute()) instead of deparse(quote()):
scatter_plot <- function(ds) {
ds.name <- deparse(substitute(ds))
if (!exists(ds.name))
stop(sprintf("The dataset %s does not exist.", ds.name))
function(x, y) {
x <- deparse(substitute(x))
y <- deparse(substitute(y))
plot(ds[[x]], ds[[y]])
}
}
scatter_plot(mydata)(a, b)
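Two of the points in this answer can be checked in isolation. The sketch below uses hypothetical helper names (name_via_quote, name_via_substitute) and a small made-up data frame; it only illustrates the difference between quote and substitute and why ifelse fails when handed a function.

```r
# quote() captures the literal symbol written in the function body,
# substitute() captures the expression the caller supplied
name_via_quote <- function(ds) deparse(quote(ds))
name_via_substitute <- function(ds) deparse(substitute(ds))

mydata <- data.frame(a = rnorm(5), b = rnorm(5))

quoted <- name_via_quote(mydata)            # name of the formal argument
substituted <- name_via_substitute(mydata)  # name the caller used

# ifelse() calls rep() on its `yes` argument, which fails for a function;
# the error is caught here and replaced by a marker string
ifelse_result <- tryCatch(
  ifelse(TRUE, function(x) x, "no"),
  error = function(e) "error"
)

print(c(quoted, substituted, ifelse_result))
```

With quote, the function always sees "ds", so exists("ds") says nothing about the caller's data set; substitute recovers "mydata".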
I have a data frame that looks like so:
pid tid pname
2 NA proc/boot/procnto-smp-instr
Now if I do this, I expect nothing to happen:
y[c(FALSE), "pid"] <- 10
And nothing happens (y did not change). However, if I do this:
y[c(FALSE), ]$pid <- 10
I get:
Error in `$<-.data.frame`(`*tmp*`, "pid", value = 10) :
  replacement has 1 rows, data has 0
So my question is, what's the difference between [, "col"]<- and $col<-? Why does one throw an exception? And bonus: where in the docs can I read more about this?
The error comes from the code of $<-.data.frame, which checks whether the original data.frame has at least as many rows as the length of the replacement vector:
nrows <- .row_names_info(x, 2L)
if (!is.null(value)) {
N <- NROW(value)
if (N > nrows)
stop(sprintf(ngettext(N, "replacement has %d row, data has %d",
"replacement has %d rows, data has %d"), N, nrows),
domain = NA)
[<- is a different function, which does not perform this check. It is a primitive function, which you can read more about in the R Internals manual
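A minimal sketch of the two assignments, assuming a one-row data frame shaped like the one in the question. The error from the $ form is caught and stored so that both outcomes can be inspected; the exact wording of the message may differ slightly between R versions.

```r
y <- data.frame(pid = 2, tid = NA, pname = "proc/boot/procnto-smp-instr")

# `[<-.data.frame` with an all-FALSE row index: no row is selected,
# so nothing is assigned and no error is raised
y[FALSE, "pid"] <- 10
pid_after <- y$pid

# `$<-.data.frame` is applied to the 0-row subset y[FALSE, ],
# where the row-count check rejects the length-1 replacement
dollar_result <- tryCatch({
  y[FALSE, ]$pid <- 10
  "no error"
}, error = function(e) conditionMessage(e))

print(pid_after)
print(dollar_result)
```

The first assignment leaves pid untouched, while the second stops inside $<-.data.frame before y is modified.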
For one, these operations are performed by two very different functions:
y[FALSE, 'pid'] <- 10 is the call to the [<-.data.frame function, while
y[FALSE, ]$pid <- 10 is the call to the $<-.data.frame function; the error message gives you this clue. You can see just how different they are by typing their names (with back quotes, just like above). In this particular case, though, they are intended to behave the same way, and they normally do. Try y[1, 'pid'] <- 1:3 vs y[1, ]$pid <- 1:3. Your case is "special" because y[FALSE, ] returns a "strange" object: a data.frame with 0 rows and three columns. IMHO, throwing an exception is the correct behavior, and this is a minor bug in the [<-.data.frame function, but the language developers' opinion on this subject is more important. If you want to see for yourself where the difference is, type debug([<-.data.frame) and run your example.
The answer to your "bonus" question is to type ?[<-.data.frame and read, though it is very, very dry :(. Best.
PS. Formatting strips backticks, so, for instance, [<-.data.frame is meant to be `[<-.data.frame`. Sorry.