Find the first occurrence of a given column - multidimensional-array

In the documentation I have found a function, findfirst, which returns the index of the first element equal to a given one.
In my case, I have a vector (a one-dimensional array) and I want to find the first column of a matrix that is equal to the vector.
I know how to do it the "hard" way: iterate over the first row with findnext, then check the whole column at each hit. But is there a smarter way that isn't obvious to me?

Suppose m is your matrix, and v is the vector.
Then:
findfirst(c->view(m,:,c)==v,1:size(m,2))
This should return the column number if the vector is found and 0 if it is not (on Julia 0.7 and later, findfirst returns nothing instead of 0 when there is no match). Going down to basic element accesses might be faster, but this should also do the trick.

This is the way I do it now. It does not feel very smart. I am still fairly confident that there is a better way, but I don't see it right now.
Coming from C/C++ and Python, my code may look somewhat weird. I have no idea about good taste in Julia. Suggestions are welcome.
function findfirstcolumn(A, v)
    index = findfirst(A[1,:],v[1])
    found = false
    while index != 0 && found == false
        found = true
        for i = 2:size(v)[1]
            if A[i,index] != v[i]
                found = false
                break
            end
        end
        if found == true
            return index
        end
        index = findnext(A[1,:], v[1], index+1)
    end
    return 0
end

Related

closed/fixed: Interpretation of basic R code

I have a basic question in regards to the R programming language.
I'm at a beginner's level and I would like to understand the meaning of two lines of code I found online. Here is the code:
as.data.frame(y[1:(n-k)])
as.data.frame(y[(k+1):n])
... where y and n are given. I do understand that the results are transformed into a data frame by the function as.data.frame(), but what about the rest? I'm still at a beginner's level, so pardon me if this question is off-topic or irrelevant in this forum. Thank you in advance, I appreciate every answer :)
Looks like you understand the as.data.frame() function so let's look at what is happening inside of it. We're looking at y[1:(n-k)]. Here, y is a vector which is a collection of data points of the same type. For example:
> y <- c(1,2,3,4,5,6)
Try running that and then calling back y. What you get are those numbers listed out. Now, consider the case you want to just call out the number 1 in that vector. How would you do that? Well, this is where the brackets come into play. If you wanted to just call the number 1 in y:
> y[1]
[1] 1
Therefore, the brackets are a way of calling out or indexing specific items in the vector. Note that the indexing starts at the value 1 and goes up to the number of items in the vector, or length. One last thing before we go back to the example you gave. What if we want to index the numbers 1, 2, and 3 from the vector but not the rest?
> y[1:3]
[1] 1 2 3
This is where the colon comes into play. It allows us to reference a subset of the numbers. However, it will reference all the numbers between the index left of the colon and right of it. Try this out for yourself in R! Play around and see what happens.
Finally going back to your example:
y[1:(n-k)]
How would this work based on what we discussed? Well, the colon means that we are indexing all values in the vector y from two index values. What are those values? Well, they are the numbers to the left and right of the colon. Therefore, we are asking R to give us the values from the first position (index of 1) to the (n-k) position. Therefore, it's important to know what n and k are. If n is 4 and k is 1 then the command becomes:
y[1:3]
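The same logic applies to the second as.data.frame() command in your question: y[(k+1):n] picks out the values from position k+1 up to position n. Essentially, R is picking out two different subsets of the vector y and converting each one to a data frame. Nothing is multiplied.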
Hope this helps. The best way to learn R is to play around with a command, throw different numbers at it, guess what will happen, and then see what happens!
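As a small sketch of both commands from the question, here they are run on a made-up vector. The values of y and k are assumptions chosen for illustration, since the question does not give them:

```r
# Made-up data: the question does not say what y, n and k are
y <- c(10, 20, 30, 40, 50, 60)
n <- length(y)  # 6
k <- 2          # assumed value

first_part  <- y[1:(n - k)]   # positions 1..4
second_part <- y[(k + 1):n]   # positions 3..6

print(as.data.frame(first_part))
print(as.data.frame(second_part))

# The parentheses matter: ":" binds tighter than "-",
# so 1:n - k means (1:n) - k, not 1:(n - k)
shifted <- 1:n - k
print(shifted)
```

The first command drops the last k values and the second drops the first k, so the two results have the same length.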

R function with variable args depending on presence/absence of other args

I've stumbled upon the varargs issue in R two or three times, but the problem I have seems a little trickier than I expected. Here it is:
I have a function which does something with its variables, but I would like to introduce another variable, a kind of flag, that selects the way the function works and which parameters it needs: namely, the number and type of inputs depend on a (flag) input.
OK, an example is better:
example = function(x,flag=1,y){
  if (flag) return(x)
  else return(y)
}
and this works fine.
The point is that in this example you need to specify both x and y every time. Instead I would like a function taking only x if flag=1 and only y if flag=0. (In this stupid example they would basically be two distinct functions, but in my actual case I have other (common) arguments on which I do some calculations that both 'parts' of the function need.)
I know that one may specify any value for the unused argument and the result wouldn't change, but I want a function which is immediately readable by the user, and it is cumbersome to have to specify an argument which won't be used by the function.
Thank you for any help.
What about the following.
example = function(x,flag=1,y){
  if (flag && !missing(x)) return(x)
  else if(!flag && !missing(y)) return(y)
}
This checks whether the flag is zero or non-zero, and also whether the corresponding argument is missing. You may want to handle the case where neither branch applies, because the function returns NULL in that case.
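One possible way to handle that remaining case is to fail loudly instead of returning NULL. This is only a sketch: the name example_checked and the error messages are made up for illustration.

```r
# Hypothetical variant: stop with a clear message when the argument
# required by the chosen flag value was not supplied
example_checked <- function(x, flag = 1, y) {
  if (flag) {
    if (missing(x)) stop("'x' must be supplied when flag is non-zero")
    return(x)
  }
  if (missing(y)) stop("'y' must be supplied when flag is 0")
  y
}

res_x <- example_checked(x = 5)            # only x needed
res_y <- example_checked(flag = 0, y = 7)  # only y needed
# Calling it without the required argument raises an error
res_err <- try(example_checked(flag = 0), silent = TRUE)
```

Because R evaluates arguments lazily, the unused argument can simply be left out of the call.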

R to ignore NULL values

I have two vectors in R, but some of the values in both are marked as "NULL".
I want R to ignore the "NULL" values but still "acknowledge" their presence because of the indexes (I'm using the intersect and which functions).
I have tried this:
for i in 1:length(vector)
if vector=="NULL"
i=i+1
else
'rest of the code'
Is this a good approach? The algorithm runs, but the vectors are very large.
You should replace "NULL" with NA, which is R's native representation for missing values. Many functions then have ways of dealing with NA values, such as the na.action option. You also shouldn't call your vector 'vector', since that is the name of a base R function.
yourvector[yourvector == "NULL"] <- NA
Also you shouldn't add 1 to i in your if, just do nothing:
for (i in 1:length(yourvector)) {
  if (!is.na(yourvector[i])) {
    # rest of the code
  }
}
Also tell us what you want to do. You probably don't need a for loop.
This code contains several errors:
First off, a vector cannot normally contain NULL values at all. Are you maybe using a list?
if vector=="NULL"
you probably mean if (vector[i] == "NULL"). Even so, that’s wrong. You cannot filter for NULL by comparing to the character string "NULL" – those two are fundamentally different things. You need to use the function is.null instead. Or, if you’re working with an actual vector which contains NA values (not NULL, like I said, that’s not possible), something like is.na.
i=i+1
This code makes no sense – leaving it out won’t change the result because the loop is in charge of incrementing i.
Finally, don’t iterate over indices – for (i in 1 : length(x)) is bad style in R. Instead, iterate over the elements directly:
for (x in vector) {
  if (! is.na(x)) {
    # Perform action
  }
}
But even this isn't very R-like. Instead, you would do two things:
Use subsetting to get rid of NA values:
vector[! is.na(vector)]
Use one of the *apply functions (for instance, sapply) instead of a loop, and put the loop body into a function:
sapply(vector[! is.na(vector)], function (x) {
  # Do something with x
})
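To tie this back to the question's use of intersect and which, here is a minimal sketch with two made-up character vectors (a and b are assumptions, not the asker's data). It marks the "NULL" strings as NA, ignores them in the comparison, and still reports positions in the original vector:

```r
# Made-up data standing in for the two vectors in the question
a <- c("x", "NULL", "y", "z")
b <- c("y", "NULL", "q", "x")

# Turn the "NULL" markers into real missing values; lengths are unchanged
a[a == "NULL"] <- NA
b[b == "NULL"] <- NA

# Compare only the non-missing values
common <- intersect(a[!is.na(a)], b[!is.na(b)])

# Positions are still relative to the original vector a,
# because the NA entries were kept in place
idx <- which(a %in% common)
print(idx)
```

Since nothing is removed from a, the indices returned by which line up with the original data, and no explicit loop is needed.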

Searching an ordered "list" matching condition when nothing matches the condition, list length = 1

I have a sorted list with 3 columns, and I'm searching to see if the second column matches 2 or 4, then returning the first column's element if so, and putting that into a function.
noOutliers((L1LeanList[order(L1LeanList[,1]),])[(L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4),1])
When nothing matches the condition, I get a
Error in ((L1LeanList[order(L1LeanList[, 1]), ])[1, ])[(L1LeanList[order(L1LeanList[, :
incorrect number of dimensions
because we effectively have List[List[all false]].
I can't just substitute something like L1LLSorted <- L1LeanList[order(L1LeanList[,1]),]
and use L1LLSorted[,2], since this returns an error when the list is of length exactly 1,
so now my code would need to look like
noOutliers(ifelse(any((L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4)),0,
(L1LeanList[order(L1LeanList[,1]),])[(L1LeanList[order(L1LeanList[,1]),2]==2)|
(L1LeanList[order(L1LeanList[,1]),2]==4),1])))
which seems a bit ridiculous for the simple thing I'm requesting.
While writing this I realized that I can put all this error checking into the noOutliers function itself, so the call looks like
noOutliers(L1LeanList,2,2,4), which is much better and also a necessity, since slightly varying versions of this appear in my code dozens of times. Still, I can't help but wonder if there's a more elegant way to write the actual function.
For the curious, noOutliers finds the mean of the 30th-70th percentile in the sorted data set like so:
noOutliers <- function(oList)
{
  if (length(oList) <= 20) return("insufficient data")
  cumSum <- 0
  iterCount <- 0
  # The adjustments deal with R rounding .5 to the even number and with 1-based indexing
  # (ex. for a list 1-10, taking 3-7 cuts off 1,2,8,9,10, imbalanced.)
  for (i in round(length(oList)*3/10-.000001):round(length(oList)*7/10+.000001)+1)
  {
    cumSum <- cumSum + oList[i]
    iterCount <- iterCount + 1
  }
  return(cumSum/iterCount)
}
Let's see...
foo <- bar[(bar[,2]==2 | bar[,2]==4),1]
should extract all the first-column values you want. Then run whatever function you want on foo, perhaps with the caveat "if (length(foo) < 1) then {exit, or skip, or something}".
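The "incorrect number of dimensions" error with a single row most likely comes from R dropping a one-row matrix to a plain vector when it is subset. A sketch of one way around it, assuming the data is a numeric matrix (bar and one_row below are made-up stand-ins for L1LeanList), is to sort once with drop = FALSE and then filter:

```r
# Made-up 3-column matrix standing in for L1LeanList
bar <- matrix(c(5, 2, 0,
                3, 4, 0,
                9, 1, 0), ncol = 3, byrow = TRUE)

# drop = FALSE keeps the result a matrix even when only one row is left
sorted <- bar[order(bar[, 1]), , drop = FALSE]
foo <- sorted[sorted[, 2] %in% c(2, 4), 1]
print(foo)

# Single-row case with no match: the same code still works
one_row <- matrix(c(7, 3, 0), ncol = 3)
sorted1 <- one_row[order(one_row[, 1]), , drop = FALSE]
foo1 <- sorted1[sorted1[, 2] %in% c(2, 4), 1]

if (length(foo1) < 1) {
  message("nothing matched, skipping")
}
```

Sorting once into a variable also avoids repeating the order() call inside every condition.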

R: Search a vector from end forward

A newbie question here. I have a vector v. I would like to search the vector from the end forward to find the last instance a condition is true. In matlab I would call find(condition, 1, 'last') and the search would start from the end of the vector and move forward. Is there an equivalent call in R?
For instance, I might like to know the last time v < v[length(v)]. I know that max(which(v<v[length(v)])) gives the correct answer. However speed is important, and it seems as if this first returns all the indices meeting the condition v
Generally in R it is preferred that you run a function "vectorized" on the entire vector, rather than in a loop that lets you stop as soon as a condition is true. However, the function rev will reverse a vector and might be handy for what you want to do.
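As a rough sketch of both styles on a made-up vector v: the rev approach is vectorized but still evaluates the condition on every element, while base R's Position with right = TRUE walks from the end and stops at the first hit. Which one is faster will depend on the data, so it is worth timing both.

```r
# Made-up data for illustration
v <- c(1, 5, 2, 8, 3, 9, 4)
last_value <- v[length(v)]

# Vectorized baseline from the question
from_which <- max(which(v < last_value))

# Reverse the condition, find the first TRUE, map back to the original index
# (match returns NA if the condition is never TRUE)
from_rev <- length(v) + 1 - match(TRUE, rev(v < last_value))

# Search from the end and stop at the first match
from_pos <- Position(function(x) x < last_value, v, right = TRUE)

print(c(from_which, from_rev, from_pos))
```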
