How to call last unit in data.frame without using length()? - r

Usually, when I want to refer to the last element of a series without knowing its position in advance, I would use:
z <- length(data)
mean(data[3:z])
However isn't there a much simpler way to define the last unit in the same statement without having to call and define length as a separate variable? Like a special symbol to imply the last unit.

mean(data[3:length(data)]) should work if you don't want an extra variable.

There isn't really a shortcut for that but instead of selecting from 3 to length, you can also remove first 2 elements which can be done using indexing :
data[-(1:2)]
Or using tail
tail(data, -2)

We can use
data[tail(names(data), -2)]
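A minimal sketch comparing the approaches above, using made-up data (x and df are hypothetical, not from the question). On a plain vector all three give the same result. On a data.frame, length() counts columns, so data[3:length(data)] and data[-(1:2)] select columns, while tail(data, -2) drops the first two rows instead; the names-based form keeps the column meaning.

```r
# Hypothetical example data, for illustration only.
x <- c(5, 8, 1, 9, 4)

from_length <- x[3:length(x)]   # explicit end index
from_negidx <- x[-(1:2)]        # drop the first two elements
from_tail   <- tail(x, -2)      # same, via tail()

# On a data.frame, [ ] with a single index selects columns.
df <- data.frame(p = 1:3, q = 4:6, r = 7:9, s = 10:12)
cols_length <- df[3:length(df)]
cols_negidx <- df[-(1:2)]
cols_names  <- df[tail(names(df), -2)]

# tail() on a data.frame works on rows, not columns.
rows_tail <- tail(df, -2)
```

If the vector can have fewer than three elements, 3:length(x) counts downwards (for example 3:2), so the negative-index or tail() forms are the safer choice in that case.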


create a vector increased by 0.5 in julia

Suppose I want to create a vector in Julia. The vector should be [0,0.5,1,1.5,2,....20].
What's the quick command to create a vector like this, from 0 to 20, increasing by 0.5?
This will create a range, which is a kind of vector:
0:0.5:20
This does the same thing
range(0, 20; step=0.5)
If you absolutely need to make a Vector, you can use collect on either of these, but in most cases you should not, and just use 0:0.5:20 directly.
There are many ways to do this. While I prefer the colon array syntax already mentioned, this could be used for more general construction:
[0.5*i for i in 0:40]

Assign a Value based on the numbers in a separate columns in R

So I kind of already know the possible solution but I don't know how to exactly go about it so please give me a bit of grace here.
I have a dataset of YouTube trends. I want to read the values from two columns (likes and dislikes) and, based on their contents, make an entry in a new column. If the likes are higher than the dislikes, the video should be labelled 'positive', and if it has more dislikes it should be 'negative'.
I'm mainly unsure how to go about this because most of the previous questions are based on one column rather than two. I know some mentioned using cut, but would it still work the same way?
All help is appreciated, thanks.
You can use a simple ifelse :
df$new_col <- ifelse(df$likes > df$dislikes, 'positive', 'negative')
This can also be written without ifelse as :
df$new_col <- c('negative', 'positive')[as.integer(df$likes > df$dislikes) + 1]
You can use Vectorize to create a vectorized version of a function. vfunc <- Vectorize(func) will allow you to call df$newcol <- vfunc(df$likes, df$dislikes) if your function takes two arguments and then return the result for each row in a vector that's assigned to a new column.
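A small sketch of both suggestions on made-up data. The data frame contents and the helper label_video are hypothetical; only the column names likes and dislikes come from the question. Note that with a strict > comparison, a tie is labelled 'negative', which the question does not specify.

```r
# Hypothetical data; only the column names come from the question.
df <- data.frame(likes = c(10, 2, 5), dislikes = c(3, 7, 5))

# Vectorized comparison of the two columns.
df$new_col <- ifelse(df$likes > df$dislikes, "positive", "negative")

# A scalar function (hypothetical name) that handles one row at a time...
label_video <- function(likes, dislikes) {
  if (likes > dislikes) "positive" else "negative"
}

# ...wrapped with Vectorize so it can take whole columns.
vlabel <- Vectorize(label_video)
df$new_col2 <- vlabel(df$likes, df$dislikes)
```

For a simple comparison like this one, ifelse is usually enough; Vectorize is more useful when the per-row logic is hard to express with vectorized operators.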

Julia JuMP - array index which is an index of another array

I have to solve a problem with permutations. The function takes a vector a with n elements as a parameter. I declare b as a @variable; after the problem is solved, it should hold the permutation of 1:n that gives the best result.
The error appears when I want to create a @constraint. I have to use a[b[1]], so the index into the vector a is itself a variable. This gives me an error saying that I can't use type VariableRef as an index of an array. How can I get around this when I have to use it?
It sounds as if you have two optimisation problems, one of which is an integer programming problem. You might think about separating the two.
(Sorry for not writing a comment, my reputation is still too low ;-) )

find indexes in R by not using `which`

Is there a faster way to search for indices in R than using which with %in%?
I have a statement that I need to execute, but it's taking a lot of time.
statement:
total_authors<-paper_author$author_id[which(paper_author$paper_id%in%paper_author$paper_id[which(paper_author$author_id%in%data_authors[i])])]
How can this be done in a faster manner?
Don't call which. R accepts logical vectors as indices, so the call is superfluous.
In light of sgibb's comment, you can keep which if you are sure that you will also get at least one match. (If there are no matches, then which returns an empty vector and you get everything instead of nothing. See Unexpected behavior using -which() in R when the search term is not found.)
Secondly, the code looks a little cleaner if you use with.
Thirdly, I think you want a single index with & rather than a double index.
total_authors <- with(
  paper_author,
  author_id[paper_id %in% paper_id & author_id %in% data_authors[i]]
)
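One caveat: paper_id %in% paper_id is always TRUE, so the combined condition reduces to author_id %in% data_authors[i]. If the goal of the original statement is to collect every author on the papers written by data_authors[i], a two-step version without which may be closer. The sketch below uses made-up data (the contents of paper_author and data_authors are hypothetical) and checks that it matches the original statement.

```r
# Hypothetical data; the column names follow the question.
paper_author <- data.frame(
  paper_id  = c(1, 1, 2, 3, 3),
  author_id = c(10, 11, 10, 12, 13)
)
data_authors <- c(10)
i <- 1

# Step 1: papers written by the author of interest (logical index, no which).
papers <- paper_author$paper_id[paper_author$author_id %in% data_authors[i]]

# Step 2: every author listed on those papers.
total_authors <- paper_author$author_id[paper_author$paper_id %in% papers]

# The statement from the question, kept for comparison.
with_which <- paper_author$author_id[which(paper_author$paper_id %in%
  paper_author$paper_id[which(paper_author$author_id %in% data_authors[i])])]
```

Dropping which is unlikely to change the running time much on its own; if the statement runs inside a loop over i, the repeated %in% lookups are probably the larger cost.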

Searching an ordered "list" matching condition when nothing matches the condition, list length = 1

I have a sorted list with 3 columns, and I'm searching to see if the second column matches 2 or 4, then returning the first column's element if so, and putting that into a function.
noOutliers((L1LeanList[order(L1LeanList[,1]),])[(L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4),1])
When nothing matches the condition, I get:
Error in ((L1LeanList[order(L1LeanList[, 1]), ])[1, ])[(L1LeanList[order(L1LeanList[, :
incorrect number of dimensions
due to the fact that we effectively have List[List[all false]]
I can't just sub out something like L1LLSorted<-(L1LeanList[order(L1LeanList[,1]),]
and use L1LLSorted[,2] since this returns an error when the list is of length exactly 1
so now my code would need to look like
noOutliers(ifelse(any((L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4)),0,
(L1LeanList[order(L1LeanList[,1]),])[(L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4),1])))
which seems a bit ridiculous for the simple thing I'm requesting.
While writing this I realized that I can put all this error checking into the noOutliers function itself, so that the call looks like
noOutliers(L1LeanList,2,2,4), which will look much better. That is a necessity, since slightly varying versions of this appear in my code dozens of times. Still, I can't help but wonder if there's a more elegant way to write the actual function.
For the curious, noOutliers finds the mean of the 30th-70th percentile in the sorted data set, like so:
noOutliers<-function(oList)
{
  if (length(oList)<=20) return ("insufficient data")
  cumSum<-0
  iterCount<-0
  # adjustments deal with .5->even number rounding r mishandling
  # and 1-based indexing (ex. for a list 1-10, taking 3-7 cuts off 1,2,8,9,10, imbalanced.)
  for(i in round(length(oList)*3/10-.000001):round(length(oList)*7/10+.000001)+1)
  {
    cumSum<-cumSum+oList[i]
    iterCount<-iterCount+1
  }
  return(cumSum/iterCount)
}
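On the question of a more elegant form: the loop can be sketched as a vectorized mean over the same index range. The name midMean is hypothetical, the input is assumed to be an already sorted numeric vector (as in the original), and returning NA_real_ instead of the string "insufficient data" is an assumption made here so that the result is always numeric.

```r
# Hypothetical rewrite; assumes oList is a sorted numeric vector.
midMean <- function(oList) {
  n <- length(oList)
  # Assumption: return NA rather than the original's character string.
  if (n <= 20) return(NA_real_)
  # Same index range as the loop; ':' binds tighter than '+',
  # so the original a:b+1 already means (a:b) + 1.
  idx <- (round(n * 3/10 - 1e-6):round(n * 7/10 + 1e-6)) + 1
  mean(oList[idx])
}

mid_of_30 <- midMean(1:30)   # averages elements 10 through 22
too_short <- midMean(1:10)   # fewer than 21 elements
```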
Let's see...
foo <- bar[(bar[,2]==2 | bar[,2]==4),1]
should extract all the first-column values you want. Then run whatever function you want on foo perhaps with the caveat "if (length(foo) < 1) then {exit, or skip, or something} "
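A sketch of that idea, assuming bar is a numeric matrix (the question does not say whether it is a matrix or a data frame, and the values here are made up). Sorting with drop = FALSE keeps the two-dimensional shape even when there is only one row, which is likely what causes the "incorrect number of dimensions" error, and a length check guards the case where nothing matches.

```r
# Hypothetical 3-column matrix standing in for L1LeanList.
bar <- matrix(c(3, 2, 0,
                1, 4, 0,
                2, 5, 0), ncol = 3, byrow = TRUE)

# Sort once by the first column; drop = FALSE keeps it a matrix.
sorted <- bar[order(bar[, 1]), , drop = FALSE]

# First-column values where the second column is 2 or 4.
foo <- sorted[sorted[, 2] %in% c(2, 4), 1]

# Single-row input with no match: still works, gives a length-0 vector.
one <- matrix(c(7, 9, 0), ncol = 3)
sorted_one <- one[order(one[, 1]), , drop = FALSE]
foo_one <- sorted_one[sorted_one[, 2] %in% c(2, 4), 1]

# Guard before calling the downstream function.
has_match <- length(foo_one) > 0
```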
