Return peak heights in find_peaks - multidimensional-array

I am using scipy.signal.find_peaks on an 8x150 array "signal" to find relative minima.
For example, for row 1, I use
peaks = find_peaks(signal[1,:],distance=8,height=-1.6)
This reliably gives me a result containing the indices as an ndarray and the peak_heights as a property array.
Now I would like to return the peak heights in a list or similar, so that I can save them for later use.
I have tried
signal[1,peaks]
but that gives me an index error.
How do I convert peaks to a proper index? Or is there a way to directly access the peak_heights from find_peaks?

I just found that peaks[0] seems to do the job!
Thanks!
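A minimal sketch of reading both the indices and the heights. The synthetic 8x150 array and the variable names are assumptions standing in for the real data. find_peaks returns a tuple of (indices, properties), and because height= is passed, the properties dict carries a "peak_heights" entry.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for the 8x150 array in the question (assumed data).
x = np.linspace(0, 6 * np.pi, 150)
signal = np.tile(np.sin(x), (8, 1))

# find_peaks returns a tuple: (index array, properties dict).
indices, properties = find_peaks(signal[1, :], distance=8, height=-1.6)

# The heights are available directly because `height` was passed ...
heights = properties["peak_heights"]

# ... and they match indexing the row with the index array.
same_heights = signal[1, indices]

# Plain Python list, e.g. for saving later.
heights_list = heights.tolist()
```

Indexing with the whole tuple (signal[1, peaks]) fails because the second element is a dict; unpacking the tuple first avoids the index error.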

Related

Calculating just a single row of dissimilarity/distance matrix

I have a data frame with 30k rows and 10 features. I would like to calculate a distance matrix like this:
gower_dist <- daisy(data_frame, metric = "gower")
This function returns the whole dissimilarity matrix. I want to get just the first row
(just the distances from the first element in the data frame). How can I do it? Do you have an idea?
You probably need to get the source and extend it.
I suggest you extend the API by adding a second parameter y that defaults to x. Then the method should return the pairwise distances of each element in x to each element in y.
Fortunately, R is GPL open source, so this is easy.
This would likely be a welcome extension; you should submit it to the package authors for inclusion.
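To illustrate the suggested two-argument interface, here is a rough sketch in Python rather than R. It is not the cluster package's implementation: gower_distances is a hypothetical helper, it assumes numeric columns only, and it takes the column ranges from x.

```python
def gower_distances(x, y=None):
    """Gower distance from each row of x to each row of y (numeric columns only).

    Hypothetical helper: y defaults to x, and column ranges come from x.
    """
    if y is None:
        y = x
    n_cols = len(x[0])

    # Range of each column, used to scale absolute differences.
    ranges = []
    for k in range(n_cols):
        column = [row[k] for row in x]
        ranges.append(max(column) - min(column))

    result = []
    for a in x:
        row_dists = []
        for b in y:
            total = 0.0
            for k in range(n_cols):
                if ranges[k] > 0:  # constant columns contribute zero
                    total += abs(a[k] - b[k]) / ranges[k]
            row_dists.append(total / n_cols)
        result.append(row_dists)
    return result


data = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]

# Distances of every element to the first element only: an n x 1 result
# instead of the full n x n matrix.
to_first = gower_distances(data, data[:1])
```

Passing a one-row y keeps the work linear in the number of rows, which is the point of the proposed extension.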

Unrecognized index variable [i] in R for-loop

I scripted a simple for-loop to iterate over each row of a data set to calculate the distance between two coordinates. The code uses the 'geosphere' package and the 'distm' function which takes two sets of coordinates and returns the distance in meters (which I convert to miles by multiplying by 0.00062137).
Here is my loop:
##For loop to find distance in miles for each coordinate pair
miles <- 0
for (i in i:3303) {
miles[i] <- distm(x = c(clean.zips[i,4], clean.zips[i,3]), y = c(clean.zips[i,7], clean.zips[i,6]))[,1] * 0.00062137
}
However, when I run it I receive an error:
Error: object 'i' not found
The thing is, I've run this code before and it worked. Other times, I get this error. I'm not changing any code; it just seems to work only some of the time. I feel the loop must be constructed correctly if it does what I want on occasion, but why would it only work sometimes?
OK, I'm not certain what justifies the down votes on this, but I guess I apologize to whoever thought that necessary.
The issue seems to have just been starting the indexing with an actual numeric value like Zheyuan suggested (i.e. using '1:3303' rather than 'i:3303'). I feel like I've created loops before using 'i in i:xxx' without first defining 'i', but maybe not. That would also explain the intermittent behavior: 'i:3303' only runs when an 'i' is already left over in the workspace from an earlier run. Anyway, it's solved and thank you!

How to use apply with a function that required 2 parameters

I looked at the existing posts but could not get a clear answer... I have a data frame and I would like to modify each value by a calculation that takes into account the min and max of its row.
I would like to use apply together with a function:
sc=function(x,seg) {(x-seg[2])*100/(seg[1]-seg[2])}
or
sc=function(x,a,b) {(x-b)*100/(a-b)}
where x is a row of the data frame and seg=c(a,b) is calculated as follows:
d=dim(data) ## data is my dataframe
for (i in (1:d[1])) ## the calculation has to be done for each line, according
## the min and max of the specific line
{
seg=c(max(data[i,]),min(data[i,]))
data[i,]=apply(data[i,],1,sc)
return(data)
}
This does not work, obviously, because I do not know how to tell apply that it needs to take into account more than one parameter...
There is probably an R function that does this specific calculation, but since I am an R beginner, I would really like to understand how to write such code.
Thanks for the help!
Stéphane
Update:
Here is what I found for a solution, but it does not sound completely logical to me...
for (i in (1:d[1])) {
t=apply(data,2,sc,seg=range(data[i,]))
data[i,]=t[i,] }
The third parameter you pass to apply should be a function. Also, there's no reason to loop when you use apply.
apply(data,1,function(x) c(min(x), max(x)))
will return a 2-row matrix with the min and max values for each row. There is also a built-in function called range that returns the min and max:
apply(data,1,range)
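The per-row rescaling the question asks for can be sketched without an explicit second parameter, by computing the min and max inside the function that is applied to each row. This is a Python illustration of the idea, not R code; scale_row is a hypothetical name, and mapping a constant row to zeros is an assumption.

```python
def scale_row(row):
    """Rescale one row so its minimum maps to 0 and its maximum to 100."""
    lo, hi = min(row), max(row)
    if hi == lo:
        # Assumption: a constant row has no spread, so map it to all zeros.
        return [0.0 for _ in row]
    return [(value - lo) * 100 / (hi - lo) for value in row]


data = [[1, 2, 3], [10, 30, 20]]

# Apply the function to every row; no outer loop over seg values is needed.
scaled = [scale_row(row) for row in data]
```

This is the same formula as sc in the question, (x-b)*100/(a-b), with a and b taken from the row itself.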

Bucketing data in R

I'm trying to make a function that determines which bucket a certain value goes into based on a given vector. My function has two inputs: a vector determining the break points for the buckets
(ex: if the vector is (1,4,5,10), the buckets would be <=1, 1<x<=4, 4<x<=5, 5<x<=10, >10)
and a certain number. I want the function to output a value identifying the bucket.
For example, for an input of .9 the output could be 1; for 1.6, 4; for 5.8, 10; and for 13, "10+".
The way I'm doing it right now is I first check if the input number is bigger than the vector's largest element or smaller than the vector's smallest element. If not, I then run a for loop (can't figure out how to use apply) to check if the number is in each specific interval. The problem is this is way too inefficient because I'm dealing with a large data set. Does anyone know an efficient way to do this?
The cut() function is convenient for bucketing: cut(splitme, breaks=vectorwithsplits).
However, it looks like you're actually trying to figure out an insertion point. You need something like binary search.
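The insertion-point idea can be sketched with a binary search over the sorted break points. This is a Python illustration rather than R (in R, findInterval is the usual vectorized counterpart); the bucket function and its "N+" label are assumptions modeled on the example in the question.

```python
from bisect import bisect_left


def bucket(value, breaks):
    """Return the upper break of the bucket holding value, or 'N+' above the last break."""
    # bisect_left finds the first break >= value in O(log n); breaks must be sorted.
    i = bisect_left(breaks, value)
    if i == len(breaks):
        return f"{breaks[-1]}+"
    return breaks[i]


breaks = [1, 4, 5, 10]
labels = [bucket(v, breaks) for v in (0.9, 1.6, 5.8, 13)]
```

Each lookup costs a logarithmic number of comparisons instead of a pass over every interval, which matters for a large data set.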

Function return value changes if use local variable

I have two snippets of code which I would have expected to behave the same, but they don't:
position <- function(t) {
coordinates <- c(cosh(t), sinh(t))
return(coordinates[1])
}
and
position <- function(t) {
coordinates <- c(cosh(t), sinh(t))
return(cosh(t))
}
I use the function position to plot a curve. With the first snippet the curve is not plotted. With the second snippet the curve is plotted.
What is the functional difference between the two snippets, and why?
What gets returned will depend on the type of argument passed. If the argument "t" is a matrix, as might be expected for a function designed to deal with coordinates, then a matrix is returned from cosh(t) and from sinh(t).
The first function would return only the first element, because the c function "straightens out" the matrices into a plain vector and causes them to lose their dimensions. If you wanted to preserve the matrix character, then use rbind or cbind depending on what would be the next function to process the data.
The second function would first calculate "coordinates" and then let it disappear into the garbage collector since it returns the matrix formed by cosh(t) instead.
You will not be able to get a better answer since you are at the moment making us all guess about what sort of data structure you are passing to the function. You should post the results of dput() on your argument to this function. And you should tell us what the help page for the plotting function expects as an argument type.
The result of
coordinates <- c(cosh(t), sinh(t))
is a numeric vector of length 2 * length(t).
The command
return(coordinates[1])
returns only the first value of this vector. (The results of coordinates[1] and cosh(t) are identical only if length(t) == 1.) To return the result of cosh(t), you could index coordinates with a sequence based on the length of t:
coordinates <- c(cosh(t), sinh(t))
return(coordinates[seq_along(t)])
Use double brackets in your first example.
coordinates[[1]]
As a useful tip when troubleshooting, if you explore the output of your two functions using str(position(x)) for your two different functions, you should see the difference.
Try also
str(vec[1])
str(vec[[1]])

Resources