I am trying to fit a nonlinear function to a given set of data (x and y in the code snippet). The function is defined as
f(x) = a/((sin((x-b)/2))^4)
x <- c(0, 5, -5, 10, -10, 15, -15, 20, -20, 25, -25, 30, -30)
y <- c(4.21, 3.73, 2.08, 1.1, 0.61, 0.42, 0.13, 0.1, 0.04, 0.036667, 0.016667, 0.007778, 0.007778)
plot(x,y, log="y")
This is what the initial plot looks like; it shows the data to which I need to fit the function above.
But when I try to fit using nls and plot the curve, the graph does not look quite right
f <- function(x,a,b) { a/((sin((x-b)/2))^4) }
fitmodel <- nls (y ~ f(x,a,b), start=list(a=1,b=1))
lines(x, predict(fitmodel))
This is what I see:
I am pretty sure I am doing something wrong here and will appreciate any help from you.
The R interpreter did exactly what you told it to do.
x is an unsorted array.
Therefore, predict(fitmodel) makes predictions for these unsorted points.
lines(x, predict(fitmodel)) connects the points in the given order. It connects (x[1], predict(fitmodel)[1]) to (x[2], predict(fitmodel)[2]) to (x[3], predict(fitmodel)[3]) etc. Since the points are not sorted by x, the line jumps back and forth across the plot, which is the picture you see.
You might do ind <- order(x); x <- x[ind]; y <- y[ind] before fitting, as per Zheyuan Li's suggestion.
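To illustrate the ordering point on its own, here is a minimal sketch. It uses a quadratic on the log scale as a stand-in model (my assumption, not the model from the question), so that the example runs regardless of whether nls converges. The first lines() call reorders the fitted values by x; the second predicts on a sorted grid, which also gives a smoother curve.

```r
x <- c(0, 5, -5, 10, -10, 15, -15, 20, -20, 25, -25, 30, -30)
y <- c(4.21, 3.73, 2.08, 1.1, 0.61, 0.42, 0.13, 0.1, 0.04,
       0.036667, 0.016667, 0.007778, 0.007778)

# Stand-in model (an assumption for illustration only)
fit <- lm(log(y) ~ x + I(x^2))

plot(x, y, log = "y")

# Option 1: keep the data as is, but draw the fitted values in order of x
ord <- order(x)
lines(x[ord], exp(fitted(fit))[ord], col = "red")

# Option 2: predict on a sorted, finer grid of x values
grid <- data.frame(x = seq(min(x), max(x), length.out = 200))
pred_grid <- exp(predict(fit, newdata = grid))
lines(grid$x, pred_grid, col = "blue", lty = 2)
```

The same two options apply to an nls fit: either reorder predict(fitmodel) with order(x), or call predict(fitmodel, newdata = ...) on a sorted grid.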
Besides, your model makes no sense.
f <- function(x,a,b) { a/((sin((x-b)/2))^4) }
fitmodel <- nls (y ~ f(x,a,b), start=list(a=1,b=1))
For any a and b, f will be a periodic function with period 2π, while your x changes from -30 to 30 with step 5. You cannot reasonably approximate your points with such a function.
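A quick numerical check of the periodicity claim, as a sketch: the values of a, b and the test points below are arbitrary choices of mine, picked to stay away from the singularities of f.

```r
f <- function(x, a, b) { a / ((sin((x - b) / 2))^4) }

# Arbitrary parameter values and test points (assumptions for illustration)
a <- 2
b <- 0.5
x0 <- c(0.3, 1.7, 4)

# Shifting x by 2*pi changes the sign of sin((x - b)/2),
# which the 4th power removes, so f is unchanged
same_after_shift <- isTRUE(all.equal(f(x0, a, b), f(x0 + 2 * pi, a, b)))
same_after_shift
```

Since 2π is about 6.28 and the x values are spaced 5 apart over a range of 60, the data span almost ten periods of f.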
Related
location_difference <- c(0, 0.5, 1, 1.5, 2)
Power <- c(0, 0.2, 0.4, 0.6, 0.8, 1)  # note: 6 values against 5 locations, as posted
plot(location_difference, Power)
The author of the paper said he smoothed the curve using a weighted moving average with weights vector w = (0.25, 0.5, 0.25), but he did not explain how he did this or which function he used. I am really confused.
Up front, as @MartinWettstein cautions, be careful about when you smooth data and what you do with it (infer from it). Having said that, a simple weighted moving average might look like this.
# replacement data
x <- seq(0, 2, len=5)
y <- c(0, 0.02, 0.65, 1, 1)
# smoothed
ysm <-
zoo::rollapply(c(NA, y, NA), 3,
function(a) Hmisc::wtd.mean(a, c(0.25, 0.5, 0.25), na.rm = TRUE),
partial = FALSE)
# plot
plot(x, y, type = "b", pch = 16)
lines(x, ysm, col = "red")
Notes:
the zoo:: package provides a rolling window (3-wide here), calling the function once for indices 1-3, then again for indices 2-4, then 3-5, 4-6, etc.
with rolling-window operations, realize that they can be center-aligned (default of zoo::rollapply) or left/right aligned. There are some good explanations here: How to calculate 7-day moving average in R?
I surround the y data with NAs so that I can mimic a partial window. Normally with rolling-window ops, if k=3, then the resulting vector is length(y) - (k-1) long. I'm inferring that you want to include data on the ends, so the first smoothed data point would be effectively (0.5*0 + 0.25*0.02)/0.75, the second smoothed data point (0.25*0 + 0.5*0.02 + 0.25*0.65)/1, and the last smoothed data point (0.25*1 + 0.5*1)/0.75. That is, omitting the 0.25 times a missing data point. That's a guess and can easily be adjusted based on your real needs.
I'm using Hmisc::wtd.mean, though it is trivial to write this weighted-mean function yourself.
This is suggestive only, and not meant to be authoritative. Just to help you begin exploring your smoothing processes.
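For reference, the same smoothing can be sketched in base R without zoo or Hmisc. The helper name wmean is mine; it drops missing values together with their weights, which is the edge handling described in the notes above.

```r
y <- c(0, 0.02, 0.65, 1, 1)
w <- c(0.25, 0.5, 0.25)

# Weighted mean that ignores NA values and renormalises the remaining weights
wmean <- function(a, w) {
  ok <- !is.na(a)
  sum(a[ok] * w[ok]) / sum(w[ok])
}

# Pad with NA on both ends so every point has a 3-wide, centred window
padded <- c(NA, y, NA)
ysm <- sapply(seq_along(y), function(i) wmean(padded[i:(i + 2)], w))
ysm
```

If you would rather leave the end points unsmoothed or drop them, only the padding step needs to change.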
So my question follows on from my last one. I have been trying to get the spike times as a raster plot for a spike train. I took a firing rate of 100 and got spike trains for 20 trials. The code for that is:
fr = 100          # firing rate (spikes per second)
dt = 1/1000       # time step: 1 ms, expressed in seconds
duration = 2      # duration in s
nBins = 2000      # number of time bins (duration/dt)
nTrials = 20      # number of simulated trials
MyPoissonSpikeTrain = function(fr = 100) {
  p = runif(nBins)              # one uniform draw per time bin
  q = ifelse(p < fr*dt, 1, 0)   # spike where the draw falls below fr*dt
  return(q)
}
set.seed(1)
SpikeMat <- t(replicate(nTrials, MyPoissonSpikeTrain()))
plot(x=-1,y=-1, xlab="time (s)", ylab="Trial",
main="Spike trains",
ylim=c(0.5, nTrials+1), xlim=c(0, duration))
for (i in 1: nTrials)
{
clip(x1 = 0, x2= duration, y1= (i-0.2), y2= (i+0.4))
abline(h=i, lwd= 1/4)
abline(v= dt*which( SpikeMat[i,]== 1))
}
This gives the result:
After all this was done, my next task was to get a vector of Inter-Spike intervals and get a histogram of them. Because the distribution of ISIs follows the exponential distribution, if I plot the exponential distribution of ISIs with the same data, it will match the curve made by the height of the histograms.
So to get the interspike timings first, I used:
spike_times <- dt * which(SpikeMat[i, ] == 1)  # i is still nTrials after the loop, so this is the last trial
Then to get a vector for interspike intervals and their histogram, I used the following command line,
ISI <- diff(spike_times)
hist(ISI, density= 10, col= 'blue', xlab='ISI (s)', ylab='number of occurrences')
and it gave me this plot:
Now, what I want is to plot the exponential distribution over the histogram, to show that the inter-spike intervals are exponentially distributed. I am confused about which parameters and which rate to use. If somebody has worked with inter-spike interval plotting, please help. I am sorry if my data seems incomplete; please let me know if I am missing something.
My fellow researcher just gave me a few simple lines of code:
x <- seq(0, 0.05, length=1000)
y <- dexp(x, rate=100)
lines(x,y)
which gave me, this:
If somebody has any way of making this process more efficient, please help me.
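One possible way to tidy this up, offered as a sketch rather than the definitive approach: pool the ISIs from all trials, draw the histogram on the density scale (freq = FALSE) so it is comparable with dexp, and overlay both the nominal rate and a rate estimated from the data as 1/mean(ISI). The names rate_hat and the choice of 30 breaks are mine.

```r
set.seed(1)
fr <- 100        # firing rate (spikes per second)
dt <- 1 / 1000   # time step in seconds
nBins <- 2000
nTrials <- 20

# One row per trial, 1 where a spike occurred
SpikeMat <- t(replicate(nTrials, as.integer(runif(nBins) < fr * dt)))

# Inter-spike intervals computed within each trial, then pooled
ISI <- unlist(lapply(seq_len(nTrials), function(i) {
  diff(dt * which(SpikeMat[i, ] == 1))
}))

# Estimate of the rate from the data
rate_hat <- 1 / mean(ISI)

# Density-scaled histogram, so the exponential density is on the same scale
hist(ISI, freq = FALSE, breaks = 30, density = 10, col = "blue",
     xlab = "ISI (s)", ylab = "density", main = "Inter-spike intervals")
curve(dexp(x, rate = fr), add = TRUE, lwd = 2)                   # nominal rate
curve(dexp(x, rate = rate_hat), add = TRUE, lty = 2, col = "red") # estimated rate
```

Because the simulation is in discrete 1 ms bins, the ISIs are strictly speaking geometric, so the exponential curve is an approximation that improves as fr*dt gets smaller.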
I'm new to R and would like to ask for some help. I have x (values) and prob (their probabilities) as follows:
x <- c(0.00, 1.08, 2.08, 3.08, 4.08, 4.64, 4.68)
prob <- c(0.000, 0.600, 0.370, 0.010, 0.006, 0.006, 0.006)
My aim is to construct an estimated distribution graph based on those values. So far, I have used qplot(x,prob,geom=c("point", "smooth"),span=0.55) to make it, and the result is shown here:
https://i.stack.imgur.com/aVgNk.png
My questions are:
Are there any other ways to construct a nice distribution like that without using qplot?
I need to retrieve all the x values (i.e., 0.5, 1, 1.2, etc.) and their corresponding prob values. How can I do that?
I've been searching for a while, but with no luck.
Thank you all
If you're looking to predict the values of prob for given values of x, this is one way to do it. Note I'm using a loess prediction function here (because I believe it's the default for ggplot's smooth geom, which you've used), which may or may not be appropriate for you.
x <- c(0.00, 1.08, 2.08, 3.08, 4.08, 4.64, 4.68)
prob <- c(0.000, 0.600, 0.370, 0.010, 0.006, 0.006, 0.006)
First make a data frame with one column. I'll put a whole lot of data points into that column, just to make a bunch of predictions.
df <- data.frame( datapoints = seq.int( 0, max(x), 0.1 ) )
Then create a prediction column. I'm using the predict function, passing a loess smoothed function to it. The loess function is given your input data, and predict is asked to use the function from loess to predict for the values of df$datapoints
df$predicted <- predict( loess( prob ~ x, span = 0.55 ), df$datapoints )
Here's what the output looks like.
> head( df )
datapoints predicted
1 0.0 0.01971800
2 0.1 0.09229939
3 0.2 0.15914675
4 0.3 0.22037484
5 0.4 0.27609841
6 0.5 0.32643223
On the plotting side of things, ggplot2 is a good way to go, so I don't see a reason to shy away from qplot here. If you want more flexibility in what you get from ggplot2, you can code the functions more explicitly (as @Jan Sila has mentioned in another answer). Here's a way with ggplot2's more common (and more flexible) syntax:
library(ggplot2)
plot <- ggplot( data = df,
                mapping = aes( x = datapoints,
                               y = predicted ) ) +
  geom_point() +
  geom_smooth( span = 0.55 )
plot
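If you do want a version without qplot or ggplot2, the same loess predictions can be drawn with base graphics. This is only a sketch; the grid of 100 points and the names grid and pred are my own choices.

```r
x <- c(0.00, 1.08, 2.08, 3.08, 4.08, 4.64, 4.68)
prob <- c(0.000, 0.600, 0.370, 0.010, 0.006, 0.006, 0.006)

# Same smoother as above, evaluated on a grid inside the observed range of x
grid <- seq(min(x), max(x), length.out = 100)
pred <- predict(loess(prob ~ x, span = 0.55), grid)

# Observed points plus the smoothed curve
plot(x, prob, pch = 16, xlab = "x", ylab = "prob")
lines(grid, pred, col = "blue")

# The smoothed values themselves, ready to inspect or export
smoothed <- data.frame(x = grid, prob = pred)
head(smoothed)
```

With only seven observations the smoother has very little to work with, so R may emit warnings and the curve should be read as a rough guide.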
You can get the observations once you specify the probability distribution. Have a look here. This will walk you through the MASS package.
Nicer graphs? I think ggplot2 is the best (I'm also pretty sure that graph is from ggplot2). If you want exactly that, then you want a blue geom_line and, on top of it, a geom_point with the same mapping :) Try having a look at some tutorials, or we can help you out with that.
So, I'm trying to graph a bivariate normal centered at 12.5 and 7.5, but it's always off-center. Really, I'm trying to sample from a bivariate distribution and just graph it quickly to make sure it worked. Also, when I switch mu in dmvnorm to 50's, it works. I know that this is not the best code/graph, but it's formatted this way to try to figure out what happened. I don't know if the mistake is in the sampling or in the graphing. What is going on here?
library(mixtools)
x3<-seq(10,15,by=.05)
x4<-seq(5,10,by=.05)
y<-matrix(NA,nrow=length(x3)*length(x4),ncol=3)
dim(y)
counter<-1
for(i in seq(1,length(x3))){
for(j in seq(1,length(x4))){
#Change 12.5 and 7.5 to 50's will put it in the center
y[counter,]<-c(dmvnorm(y=c(i,j),mu=c(12.5,7.5),sigma=matrix(c(1000,0,0,1000),nrow=2))*1000,x3[i],x4[j])
counter<-counter+1
}
}
plot(y[,2],y[,3],pch=16,col=rgb(0,y[,1],0,maxColorValue=(dmvnorm(y=c(12.5,7.5),mu=c(12.5,7.5),sigma=matrix(c(1000,0,0,1000),nrow=2))*1000)),asp=1,xlim=c(min(y[,2]),max(y[,2])),ylim=c(min(y[,3]),max(y[,3])))
Some comments:
dmvnorm(y=c(i,j)) is wrong, you need to evaluate the density at c(x3[i], x4[j]).
I didn't read your plot() statement that carefully, but plot() only takes at most two variables (you have 3), you'll need to use something like image() or levelplot().
Here's what you are looking for:
library(mixtools)  # provides dmvnorm(y, mu, sigma) as used in the question
library(lattice)
d <- expand.grid("x3" = seq(10, 15, .05), "x4" = seq(5, 10, .05))
d$dens <- dmvnorm(as.matrix(d), mu = c(12.5, 7.5), sigma = diag(1000, 2))
levelplot(dens ~ x3 * x4, data = d)
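To see why the original loop looks off-center, and why changing mu to 50 seemed to fix it, here is a small base-R sketch. It avoids mixtools by using the fact that with a diagonal covariance the bivariate density is a product of two univariate normals; the helper dens is mine.

```r
x3 <- seq(10, 15, by = 0.05)
x4 <- seq(5, 10, by = 0.05)
mu <- c(12.5, 7.5)

# Bivariate normal with independent components, each with variance 1000
dens <- function(p, mu, var = 1000) prod(dnorm(p, mean = mu, sd = sqrt(var)))

idx <- expand.grid(i = seq_along(x3), j = seq_along(x4))

# Density evaluated at the loop indices (the bug) and at the grid values (the fix)
by_index <- mapply(function(i, j) dens(c(i, j), mu), idx$i, idx$j)
by_value <- mapply(function(i, j) dens(c(x3[i], x4[j]), mu), idx$i, idx$j)

peak_index <- idx[which.max(by_index), ]
peak_value <- idx[which.max(by_value), ]

# Where each version puts its maximum, in x3/x4 coordinates
c(x3[peak_index$i], x4[peak_index$j])  # near the lower-left corner of the grid
c(x3[peak_value$i], x4[peak_value$j])  # at the intended centre
```

Evaluating at c(i, j) treats the indices 1 to 101 as coordinates, so the peak lands where the index is closest to 12.5 and 7.5. With mu set to 50 the peak falls near index 50, which happens to be the middle of a 101-point grid, and that is why it looked centered.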
I am using the R package segmented to calculate parameters for a model, in which the response variable is linearly correlated with the explanatory variable until a breakpoint, then the response variable becomes independent from the explanatory variable. In other words, a segmented linear model with the second part having a slope = 0.
What I already did is:
linear1 <- lm(Y ~ X)
linear2 <- segmented (linear1, seg.Z = ~ X, psi = 2)
This gives a model that has a very good first line, but the second line is not horizontal (although its slope is not significant). I want to make the second line horizontal. (psi = 2 is where I observed a breakpoint.)
Also, when I use "abline" to show the broken line on the plot, it only shows the first part of the model and gives a warning: "only using the first two of 4 regression coefficients". How can I display both parts of the model?
To input my data into R:
X <- c(0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0)
Y <- c(1.31, 1.60, 1.86, 2.16, 2.44, 2.71, 3.00, 3.24, 3.57, 3.81, 3.80, 3.83, 3.78, 3.94, 3.75, 3.89)
This is as easy as using the plot method for segmented class objects, which is provided by the package segmented and linked in the help for segmented.
Assuming your data is in the data.frame d:
linear2 <- segmented (linear1, seg.Z = ~ X, psi = 2, data = d)
plot(linear2)
points(Y~X, data = d)
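An easy way to fudge a horizontal line would be to replace the coefficient with the value required for that line to be horizontal. The third coefficient is the change in slope at the breakpoint, so setting it to minus the first slope makes the second slope zero: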
fudgedmodel <- linear2
fudgedmodel$coefficients[3] <- - fudgedmodel$coefficients[2]
plot(fudgedmodel)
points(Y~X, data = d)
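As a cross-check that does not depend on segmented, a flat second segment can also be fitted in base R. This is a different technique from the one above: a grid search over the breakpoint, with lm fitted to pmin(X, psi) so that the response is constant beyond psi. The grid range and step are my own choices.

```r
X <- c(0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0,
       2.5, 3.0, 3.5, 4.0, 4.5, 5.0)
Y <- c(1.31, 1.60, 1.86, 2.16, 2.44, 2.71, 3.00, 3.24, 3.57, 3.81,
       3.80, 3.83, 3.78, 3.94, 3.75, 3.89)

# Candidate breakpoints (range and step are assumptions)
psi_grid <- seq(1, 4, by = 0.01)

# Residual sum of squares of "linear up to psi, flat afterwards" for each candidate
rss <- sapply(psi_grid, function(psi) deviance(lm(Y ~ pmin(X, psi))))
psi_hat <- psi_grid[which.min(rss)]

# Refit at the best breakpoint
fit <- lm(Y ~ pmin(X, psi_hat))

# Plot data and the fitted broken line with a horizontal second segment
plot(X, Y)
xs <- seq(min(X), max(X), length.out = 200)
lines(xs, coef(fit)[1] + coef(fit)[2] * pmin(xs, psi_hat), col = "red")
```

Unlike the fudged coefficient, this refits the intercept and first slope under the constraint, so the two segments meet exactly at psi_hat. It does not give a standard error for the breakpoint.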
I was searching for the same thing and found a neat answer in this post from the R-help mailing list:
https://stat.ethz.ch/pipermail/r-help/2007-July/137625.html
Here's an edited version of that answer that cuts straight to the solution:
library(segmented)
# simulate data - linear slope down until some point, at which slope=0
n<-50
x<-1:n/n
y<- 0-pmin(x-.5,0)+rnorm(50)*.03
plot(x,y) #This should be your scatterplot..
abline(0,0,lty=2)
# a parsimonious modelling: constrain right slope=0
# NB. This is probably what you want...
o<-lm(y~1)
xx<- -x
o2<-segmented(o,seg.Z=~xx,psi=list(xx=-.3))
slope(o2)
points(x,fitted(o2),col=2)
# now constrain \hat{\mu}(x)=0 for x>psi (you can do this if you know what the value of y is when x becomes independent)
o<-lm(y~0)
xx<- -x
o3<-segmented(o,seg.Z=~xx,psi=list(xx=-.3))
slope(o3)
points(x,fitted(o3),col=3)
You should get something like this. Red points are from the first method, which sounds like the one for you. Green points are from the second method, which only applies if you already know the value y takes once it no longer depends on x: