I tried playing around with this example from the Julia documentation. My attempt was to make each cell split into two parts with half the amount of protein each, so I set Theta=0.5. However, the plot looks like this:
Since the cells are all equal, they should hit the target amount of protein at the same time, so the number of cells should double at every division. How could I plot this? I also don't understand why the number of cells stops at 3 in the case below.
Plot the protein amount in each cell and think about the model you've created. After the first division, both cells have the same value, so at exactly the same time an event fires. The "maximum" cell (whichever index is lower, so cell 1) splits, while cell 2 keeps growing above 1. But now that u[2] > 1, the rootfinding condition 1 - maximum(u) will never hit zero again, and thus no more splits will occur. This means you get two splits total, i.e. 3 cells.
Remember, programs will do exactly what you tell them to. I assume that the effect you meant was: split every cell that is greater than or equal to 1. If that's the affect! you wanted, then you'd have to write it like this:
function affect!(integrator)
    u = integrator.u
    idxs = findall(x -> x >= 1 - eps(eltype(u)), u)  # every cell that has reached the threshold
    resize!(integrator, length(u) + length(idxs))    # add one daughter cell per splitting cell
    u[idxs] ./= 2                                    # each splitting cell keeps half of its protein
    u[end-length(idxs)+1:end] .= u[idxs]             # each new daughter cell gets the other half
    nothing
end
would be one way to do it, and of course there are many others.
I need some help with a problem I've been facing.
Suppose I have an array = [3,4,1,5,6,1,3].
Now I need the permutations in which the duplicate element 3 does not sit beside the other 3, and the same for 1.
How am I supposed to solve this? I've watched a ton of YouTube and googled it, but no luck.
Thanks in advance for the help.
Are you looking for a general-case solution or just for that particular array? If you are looking for the more general case, I think you should specify the restrictions, or the problem becomes too complex. The same applies if you want to write code. Some languages (like Python) have libraries that make this kind of work relatively simple, but the time complexity can get ugly.
Here's a mathematical approach to the problem:
Step 1: Suppose all the elements are different a = [3,4,5,6,1]
In this case we will have 5! different options (You have 5 options to choose the first element and 4 options to choose the second and so on)
Step 2: Suppose you have one repeated element a = [1,3,4,5,6,1]
In this case we have 6!/2! different options (6! comes from Step 1, and we divide by 2! because swapping the two copies of the repeated element with each other does not change the array).
Now you want to exclude the options where the repeated elements appear next to each other. The trick is to treat them as one element. So now we have a = [(1,1), 3, 4, 5, 6]. There are 5! different options. We subtract this from the total, so 6!/2! - 5! gives you the answer.
Step 3 (your case): Two repeated elements, a = [3,4,1,5,6,1,3]
We continue with the same logic. In total we have 7!/(2! x 2!) options. Following Step 2, to exclude the cases where 1 appears next to 1, we treat (1,1) as one element; that leaves 6 elements among which 3 is still repeated, so we subtract 6!/2! from the total. The 3 appears twice as well, so we subtract another 6!/2!. Unfortunately, we have now subtracted some cases twice (can you guess which?). If we find the cases we subtracted twice and add them back, we get the answer.
The cases we subtracted twice are those where 1 comes next to 1 and, at the same time, 3 comes next to 3, that is a = [(1,1),4,5,6,(3,3)]. We subtracted those options once for the ones and once for the threes. There are 5! such cases (can you guess why?).
To sum it up: 7!/(2! x 2!) - 2 x (6!/2!) + 5! = 1260 - 720 + 120 = 660.
If you are not looking for a general solution, those numbers are not big, so you can write brute-force code (to save some time/space, convert the array to a string).
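For instance, here is a quick brute-force check, sketched in R purely as an illustration (it assumes the gtools package is available for generating permutations), which reproduces the count above:
library(gtools)  # for permutations()

a <- c(3, 4, 1, 5, 6, 1, 3)
# all arrangements, treating the duplicates as distinct, then collapse identical rows
perms <- unique(permutations(n = length(a), r = length(a), v = a, set = FALSE))
# a valid arrangement has no two equal neighbours (only 1 and 3 are duplicated here)
valid <- apply(perms, 1, function(p) all(p[-1] != p[-length(p)]))
nrow(perms)  # 1260 = 7!/(2! x 2!)
sum(valid)   # 660  = 1260 - 2*(6!/2!) + 5!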
I might have missed something in the calculations, but if you follow the logic you will get the answer. Also, if you want to understand why these things work, try it with small data to get the intuition. If you need code for a more general case, let me know and I will update the solution.
I'm creating a Monte Carlo model using R. My model creates matrices that are filled with either zeros or values that fall within the constraints. I'm running a couple hundred thousand n values through my model, and I want to find the average of the non-zero values in the matrices I've created. I'm guessing I can do something in the last section.
Thanks for the help!
Code:
n<-252500
PaidLoss_1<-numeric(n)
PaidLoss_2<-numeric(n)
PaidLoss_3<-numeric(n)
PaidLoss_4<-numeric(n)
PaidLoss_5<-numeric(n)
PaidLoss_6<-numeric(n)
PaidLoss_7<-numeric(n)
PaidLoss_8<-numeric(n)
PaidLoss_9<-numeric(n)
for(i in 1:n){
claim_type<-rmultinom(1,1,c(0.00166439057698873, 0.000810856947763742, 0.00183509730283373, 0.000725503584841243, 0.00405428473881871, 0.00725503584841243, 0.0100290201433936, 0.00529190850119495, 0.0103277569136224, 0.0096449300102424, 0.00375554796858996, 0.00806589279617617, 0.00776715602594742, 0.000768180266302492, 0.00405428473881871, 0.00226186411744623, 0.00354216456128371, 0.00277398429498122, 0.000682826903379993))
claim_type<-which(claim_type==1)
claim_Amanda<-runif(1, min=34115, max=2158707.51)
claim_Bob<-runif(1, min=16443, max=413150.50)
claim_Claire<-runif(1, min=30607.50, max=1341330.97)
claim_Doug<-runif(1, min=17554.20, max=969871)
if(claim_type==1){PaidLoss_1[i]<-1*claim_Amanda}
if(claim_type==2){PaidLoss_2[i]<-0*claim_Amanda}
if(claim_type==3){PaidLoss_3[i]<-1* claim_Bob}
if(claim_type==4){PaidLoss_4[i]<-0* claim_Bob}
if(claim_type==5){PaidLoss_5[i]<-1* claim_Claire}
if(claim_type==6){PaidLoss_6[i]<-0* claim_Claire}
}
PaidLoss1<-sum(PaidLoss_1)/2525
PaidLoss3<-sum(PaidLoss_3)/2525
PaidLoss5<-sum(PaidLoss_5)/2525
PaidLoss7<-sum(PaidLoss_7)/2525
partial output of my numeric matrix
First, let me make sure I've wrapped my head around what you want to do: you have several columns -- in your example, PaidLoss_1, ..., PaidLoss_9, which have many entries. Some of these entries are 0, and you'd like to take the average (within each column) of the entries that are not zero. Did I get that right?
If so:
Comment 1: At the very end of your code, you might want to avoid using sum and dividing by a number to get the mean you want. It obviously works, but it opens you up to a risk: if you ever change the value of n at the top, then in the best case scenario you have to edit several lines down below, and in the worst case scenario you forget to do that. So, I'd suggest something more like mean(PaidLoss_1) to get your mean.
Right now, you have n as 252500, and your denominator at the end is 2525, which has the effect of inflating your mean by a factor of 100. Maybe that's what you wanted; if so, I'd recommend mean(PaidLoss_1) * 100 for the same reasons as above.
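Concretely, with n = 252500 the two expressions below give the same number; the second just makes the intent (and the factor of 100) explicit:
sum(PaidLoss_1) / 2525    # what the code currently does
mean(PaidLoss_1) * 100    # same value; stays "mean times 100" even if n changes at the top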
Comment 2: You can do what you want via subsetting. Take a smaller example as a demonstration:
test <- c(10, 0, 10, 0, 10, 0)
mean(test) # gives 5
test!=0 # a vector of TRUE/FALSE for which are nonzero
test[test!=0] # the subset of test which we found to be nonzero
mean(test[test!=0]) # gives 10, the average of the nonzero entries
The middle three lines are just for demonstration; the only necessary lines to do what you want are the first (to declare the vector) and the last (to get the mean). So your code should be something like PaidLoss1 <- mean(PaidLoss_1[PaidLoss_1 != 0]), or perhaps that times 100.
Comment 3: You might consider organizing your data into a matrix or data frame. Instead of typing PaidLoss_1, PaidLoss_2, etc., it might make sense to collect all the PaidLoss columns into a single matrix and access its elements with [ , ] indexing. This would clean up the code, save you a lot of typing, and let you use things like the apply() family of functions instead of repeating the same commands over and over for different columns (such as taking the mean). A data frame or another structure would work too; having some structure will make your life easier.
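For example, a minimal sketch of that idea (the matrix name and the fill-in step are illustrative only):
PaidLoss <- matrix(0, nrow = n, ncol = 9)    # one column per claim category
# inside the simulation loop, write into a column instead of a separate vector, e.g.:
#   PaidLoss[i, 1] <- claim_Amanda
# afterwards, the non-zero mean of every column in one call:
nonzero_means <- apply(PaidLoss, 2, function(col) mean(col[col != 0]))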
(And to be super clear, your code is exactly what my code looked like when I first started writing in R. You can decide if it's worth pursuing some of that optimization; it probably just depends how much time you plan to eventually spend in R.)
I apologize ahead of time for the crude way this question is worded. For the longest time I was under the impression that what I'm trying to do is called "normalizing data", but after googling to try and find the method to do this, I seem to be mistaken, so I'm not sure exactly what it's called that I'm trying to do (bear with me, please).
I have a set of data like this:
0.17407
0.05013
0.08520
0.02892
0.02986
0.06286
0.04453
0.00425
0.20470
0.02267
0.01470
0.02460
0.01735
0.01069
0.02168
0.13912
0.02004
0.02018
0.07837
When you add them all you get 1.05392.
I'd like to "adjust" the data set so that the relative values all remain the same but the sum is equal to 1. When I googled normalizing data sets, I found a formula like this:
(x-min(x))/(max(x)-min(x))
However, this simply rescales each data point relative to the minimum and maximum, so that the maximum value in your data set becomes 1 and the minimum becomes 0.
Extra: Could someone enlighten me as to what this is called, if not normalizing data? Obviously I've been carrying this ignorant belief around for far too long.
If you want your data to sum to 1, then normalizing is indeed what you are doing. You normalize by dividing each element by the sum of your series (sum_i x_i, where the x_i are the elements of your data series).
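For example (sketched in R, just as an illustration, since no language was specified in the question), this is a one-liner with the numbers above:
x <- c(0.17407, 0.05013, 0.08520, 0.02892, 0.02986, 0.06286, 0.04453,
       0.00425, 0.20470, 0.02267, 0.01470, 0.02460, 0.01735, 0.01069,
       0.02168, 0.13912, 0.02004, 0.02018, 0.07837)
sum(x)                # 1.05392
x_norm <- x / sum(x)  # same relative values, rescaled to sum to 1
sum(x_norm)           # 1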
The formula you mention is another possible rescaling, but as you observed it has a different effect. Note that in the first case you map x -> c*x (in your case: x -> 1/1.05392 * x), while the second case rescales with x -> c*x + offset. Note also that the latter is not linear (unless min(x) = 0), that is, f(x+y) != f(x) + f(y).
If your whole confusion is about the naming of things, then I would not worry too much. After all, there is only convention and common agreement, no absolute truth/authority. The terms are also reused in different fields; cf. Normalization on Wikipedia:
Normalization or normalisation refers to a process that makes something more normal or regular
n <- length(rle(sign(z)))
z contains 1s and -1s. n should indicate how many times the sign of z changes.
The code above does not lead to the desired outcome. If I expand the command to
length(rle(sign(z))[[1]])
it works. I don't understand the underlying mechanism of how [[1]] solves the problem?
rle returns a list consisting of two components: lengths, and values. As such, its own length is always 2. By contrast, you want to know the length of either of those components (they obviously have the same length). So either length(rle(…)[[1]]) or length(rle(…)[[2]]) would work. Better to use the names instead of an index though, e.g.
length(rle(z)$lengths)
However, this won't be the number of times the sign changes; rather, it will be the number of times the sign changes plus 1.
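A tiny example, with a made-up z:
z <- c(1, 1, -1, -1, -1, 1)
rle(sign(z))$lengths               # 2 3 1  (three runs)
length(rle(sign(z))$lengths)       # 3
length(rle(sign(z))$lengths) - 1   # 2, the number of sign changes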
I want to analyse angles in movement of animals. I have tracking data that has 10 recordings per second. The data per recording consists of the position (x,y) of the animal, the angle and distance relative to the previous recording and furthermore includes speed and acceleration.
I want to analyse the speed an animal has while making a particular angle; however, since the temporal resolution of my data is so high, each turn consists of a number of minute angles.
I figured there are two possible ways to work around this problem, but I do not know how to achieve either of them in R, and help would be greatly appreciated.
The first: reducing my temporal resolution by a certain factor. This has the disadvantage of losing possibly important parts of the data. Despite this, how would I be able to automatically subsample, for example, every 3rd or 10th recording of my data set?
The second: converting straight movement into so-called 'flights': rule-based aggregation of steps in approximately the same direction, separated by acute turns (see the figure). A flight between two points ends when the perpendicular distance from the main direction of that flight is larger than x, a value that can be set arbitrarily. Does anyone have an idea how to do that with the x,y positional data that I have?
It sounds like there are three potential things you might want help with: the algorithm, the math, or R syntax.
The algorithm you need may depend on the specifics of your data. For example, how much data do you have? What format is it in? Is it in 2D or 3D? One possibility is to iterate through your data set. With each new point, you need to check all the previous points to see if they fall within your desired column. If the data set is large, however, this might be really slow. Worst case scenario, all the data points are in a single flight segment, meaning you would check the first point roughly as many times as you have data points, the second point one less, and so on. That means n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 operations. That's O(n^2); the running time could grow quadratically with the size of your data set. Hence, you may need something more sophisticated.
The math to check whether a point is within your desired column of width x is pretty straightforward, although maybe more sophisticated math could help inform a better algorithm. One approach is to use vector arithmetic. As an example, suppose you have points A, B, and C. Your goal is to see if B falls within a column of width x around the vector from A to C. To do this, find a vector v orthogonal to the vector from A to C, then check whether the magnitude of the scalar projection of the vector from A to B onto v is less than x. There is lots of literature available for help with this sort of thing; here is one example.
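To make that concrete, here is a rough sketch of that check (written in R since that's the language you mention; the function name and the example points are made up, and I'm no R expert, so treat it as illustrative):
# Is point B within perpendicular distance x of the line through A and C?
# A, B, C are length-2 numeric vectors of (x, y) coordinates.
within_column <- function(A, B, C, x) {
  AC <- C - A
  v  <- c(-AC[2], AC[1]) / sqrt(sum(AC^2))  # unit vector orthogonal to the A -> C direction
  abs(sum((B - A) * v)) <= x                # |scalar projection of A -> B onto v|
}
within_column(c(0, 0), c(1, 0.3), c(2, 0), x = 0.5)  # TRUE: B lies 0.3 off the A -> C line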
I think this is where I might start (with a boolean function for an individual point), since it seems like an R function to determine this would be convenient. Then write another function that takes a set of points, calculates the vector v, and calls the first function for each point in the set. Then run some data through it and see how long it takes.
Beyond a rough sketch like that, I'm afraid I won't be of much help with R syntax, although it is on my list of things I'd like to learn. I checked out the manual for R last night and it had plenty of useful examples. I believe this is very doable, even for an R novice like myself. It might be kind of slow if you have a big data set. However, once you have something that works, it might also be easier to get help from people with more knowledge and experience to optimize it.
Two quick clarifying points in case they are helpful:
The above suggestion is just to start with the data for a single animal, so when I talk about growth of data I'm talking about the average data sample size for a single animal. If that is slow, you'll probably need to fix that first. Then you'll need to potentially analyze/optimize an algorithm for processing multiple animals afterwards.
I'm implicitly assuming that the definition of a flight segment is the largest set of contiguous data points in which no "sub" flight segment violates the column rule. That is, I think I could come up with an example where a set of points satisfies your rule of falling within a column of width x around the vector to the last point, but where, if you looked at the column of width x around the vector to the second-to-last point, one point would no longer meet the criterion. Depending on how you define the flight segment (e.g. if you want it to be the largest possible set of points that meets your condition and don't care about what happens inside), you may need a different approach (e.g. working backwards instead of forwards).