I have plotted a curve and have looked in both my book and on Stack Overflow, but I cannot find any code that makes R tell me the value of y on the curve at x = 70.
curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
)
So in short, I would like to know how to command R to give me the number of salmon at 70 years, and at other times.
Edit:
To clarify, I was curious how to command R to show the Y values for multiple X values, with X increasing in steps of 5.
salmon <- data.frame(curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
))
salmon$y[salmon$x==70]
[1] 608.5285
This salmon data.frame gives you all of the data.
head(salmon)
    x        y
1 0.0 20.00000
2 1.4 21.41386
3 2.8 22.92768
4 4.2 24.54851
5 5.6 26.28392
6 7.0 28.14201
You can also use inequalities with the syntax above to check the number of salmon in given ranges.
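A minimal sketch of the inequality idea, assuming the same 101-point grid that curve() evaluates by default. The data frame is rebuilt with seq() here so the snippet runs without opening a plot, and the bounds 100, 200 and 1000 are arbitrary values picked for illustration:

```r
# Rebuild the grid curve() uses by default (n = 101 points), without plotting.
x <- seq(0, 140, length.out = 101)
salmon <- data.frame(x = x, y = 20 * 1.05^x)

# Rows where the population lies between 100 and 200 (illustrative bounds).
in_range <- salmon[salmon$y >= 100 & salmon$y <= 200, ]
range(in_range$x)   # span of tabulated times that fall in that range

# First tabulated time at which the population exceeds 1000.
first_over_1000 <- min(salmon$x[salmon$y > 1000])
first_over_1000
```

Because the grid only has a point every 1.4 years, these answers are limited to the tabulated x values.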
It's also simple to answer the 2nd part of your question using this object:
salmon$z <- salmon$y*5 # I am using * instead of + to make the plot more clear
plot(x=salmon$x,y=salmon$z, xlab='Time passed since 1890', ylab='Population of Salmon',type="l")
lines(salmon$x,salmon$y, col="blue")
curve is plotting the function 20*1.05^x
so just plug any value you want in that function instead of x, e.g.
> 20*1.05^70
[1] 608.5285
20*1.05^(seq(from=0, to=70, by=10))
That was all I had to do. I had forgotten, until Ed posted his reply, that I could type the expression directly into R.
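For the step-of-5 version mentioned in the edit, a small sketch along the same lines. The names salmon_pop and salmon_table are just illustrative, not anything from the original posts:

```r
# Wrap the expression in a function so it can be reused for any x.
salmon_pop <- function(x) 20 * 1.05^x

# Single value: number of salmon 70 years after 1890.
salmon_pop(70)

# Many values at once: arithmetic in R is vectorised, so a sequence works directly.
years <- seq(from = 0, to = 140, by = 5)
salmon_table <- data.frame(year = years, population = salmon_pop(years))
head(salmon_table)
```

This evaluates the formula exactly at the requested x values instead of relying on the grid that curve() happened to use.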
Related
I've been struggling to get a plot that shows my data accurately, and spent a while getting gap.plot up and running. After doing so, I have an issue with labelling the points.
Just plotting my data ends up with this:
[Image: plot of abundance data, with two different tiers of data, one at ~38,000 and one between 1 and 50]
As you can see, that doesn't clearly show either the top or the bottom sections of my plots well enough to distinguish anything.
Using gap plot, I managed to get:
[Image: gap.plot of abundance data, with 100 to 37,000 skipped and labels appearing only on the lower tier]
The code for my two plots is pretty simple:
plot(counts.abund1,pch=".",main= "Repeat 1")
text(counts.abund1, labels=row.names(counts.abund1), cex= 1.5)
gap.plot(counts.abund1[,1],counts.abund1[,2],gap=c(100,38000),gap.axis="y",xlim=c(0,60),ylim=c(0,39000))
text(counts.abund1, labels=row.names(counts.abund1), cex= 1.5)
But I can't figure out why the labels (which are just the letters that the points denote) are not being applied the same way in the two plots.
I'm kind of out of my depth with this. I have very little idea how to plot things like this nicely, and never had data like it when learning.
The data comes from a large (10,000 x 10,000) matrix that contains a random assortment of the letters a to z. It then undergoes replacements and "speciation" or "immigration", which results in the first lot of letters at ~38,000 and the second lot normally below 50.
The code I run after getting that matrix to get the rank abundance is:
##Abundance 1
counts1 <- as.data.frame(as.list(table(neutral.v1)))
counts.abund1<-rankabundance(counts1)
With neutral.v1 being the matrix.
The data frame for counts.abund1 looks like (extremely poorly formatted, sorry):
  rank abundance proportion plower pupper accumfreq logabun rankfreq
a    1     38795        3.9    NaN    NaN       3.9     4.6      1.9
x    2     38759        3.9    NaN    NaN       7.8     4.6      3.8
j    3     38649        3.9    NaN    NaN      11.6     4.6      5.7
m    4     38639        3.9    NaN    NaN      15.5     4.6      7.5
and continues for all the variables. I only use rank and abundance right now. The row names (a, x, j, m, ...) are the variable each row applies to, and they are what I want to use as the labels on the plot.
Any advice would be really appreciated. I can't really shorten the code too much or provide the matrix because the type of data is quite specific, as are the quantities in a sense.
As I mentioned, I've been using gap.plot to just create a break in the axis, but if there are better solutions to plotting this type of data I'd be absolutely all ears.
Really sorry that this is a mess of a question, bit frazzled on the whole thing right now.
gap.plot() doesn't draw two plots. It draws a single plot in which the values of the upper section are shifted down, an additional box is drawn, and the axis tick labels are rewritten. So the y-coordinate in the upper region matches neither the original value nor the axis tick labels. The real y-coordinate in the upper region is "original value" - diff(gap).
library(plotrix)  # provides gap.plot()
gap.plot(counts.abund1[,1], counts.abund1[,2], gap=c(100,38000), gap.axis="y",
xlim=c(0,60), ylim=c(0,39000))
# labels for the lower tier (coordinates unchanged)
text(counts.abund1, labels=row.names(counts.abund1), cex= 1.5)
# labels for the upper tier (y shifted down by the width of the gap)
text(counts.abund1[,1], counts.abund1[,2] - diff(c(100, 38000)), labels=row.names(counts.abund1), cex=1.5)
# the example data I used
set.seed(1)
counts.abund1 <- data.frame(rank = 1:50,
abundance = c(rnorm(25, 38500, 100), rnorm(25, 30, 20)))
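The shift can be checked without plotting anything. This is only a sketch of the rule stated above ("original value" - diff(gap)), not of gap.plot() internals. The first four abundances are taken from the question's table, while the rows labelled q and z are hypothetical lower-tier values added for illustration:

```r
gap <- c(100, 38000)

# a, x, j, m come from the question; q and z are made-up lower-tier rows.
labs      <- c("a", "x", "j", "m", "q", "z")
abundance <- c(38795, 38759, 38649, 38639, 42, 17)

# Points above the gap are drawn diff(gap) lower; points below it are unchanged.
y_plot <- ifelse(abundance > gap[2], abundance - diff(gap), abundance)

data.frame(label = labs, abundance = abundance, y_plot = y_plot)
```

Passing y_plot as the y argument of text() labels both tiers in a single call.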
So I've been working on a scatter plot for some data that I have. I used to be able to get the scatter plot function to work, but now I can't and I don't understand what my error is. My data has 5 variables and a column that assigns each row to a cluster (I used k-means in this particular case).
  closedmi uncertin certknow sourknow justknow fit3.cluster
1 3.166667    6.125 2.571429    4.500    3.375            1
2 3.666667    4.250 3.428571    4.000    4.750            2
3 1.833333    5.750 1.428571    3.375    2.125            2
4 3.500000    4.500 1.857143    4.250    3.125            3
I'm trying to plot my data in 3 dimensions using the first three principal components and see the clusters. Here is my code to find the principal components and then attach the cluster column to them in a new data frame.
#Find the 5 principal components of the data matrix
pcdf <- princomp(pre2, cor=TRUE, scores=TRUE)
pre4 <- data.frame(pcdf$scores, cluster=fit3$cluster)
#Making a 3D plot of the Solution
scatter3d(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups=pre4$cluster,
surface=FALSE, grid=FALSE, ellipsoid=TRUE)
I then try to use scatter3d to plot the individuals using the cluster column as a grouping factor, and I end up with an error. I've been using this source for the code to get the right syntax, but I still end up with the error.
Error in scatter3d.default(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups = pre4$cluster: groups variable must be a factor
but it is. It's in the data frame, I can call the column using pre4$cluster. Is there some formatting or syntax error I can't see? Am I just going mad?
I was able to get this to work just last week and now I'm not able to. I know I can use plot3d to get the visualization, but I like the visualization better using scatter3d and would like to be able to use it.
Try this:
scatter3d(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups=as.factor(pre4$cluster),
surface=FALSE, grid=FALSE, ellipsoid=TRUE)
That will solve the error message regarding factors. Beyond that, just make sure that your leading minor is positive definite.
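A quick way to see why the error appears: kmeans() returns the cluster assignments as a plain integer vector, and putting that vector in a data frame does not turn it into a factor. A minimal sketch, using the four cluster values from the question as stand-in data:

```r
# Stand-in for fit3$cluster: kmeans() returns integers, not a factor.
cluster <- c(1L, 2L, 2L, 3L)
is.factor(cluster)      # FALSE, which is what scatter3d() complains about

# Convert explicitly before passing it as groups=.
cluster_f <- as.factor(cluster)
is.factor(cluster_f)    # TRUE
levels(cluster_f)
```

Doing the conversion once when the data frame is built, e.g. cluster=as.factor(fit3$cluster), avoids having to repeat it in every plotting call.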
I have some measured data, experiment.dat which goes like this:
1 2
2 3
Now I want to plot them with a command like
plot "experiment.dat" using 1:2 title "experiment" with lines lw 3
Is there some way to scale the different lines by a scaling factor such as -1?
Yes, you can do any kind of calculations inside the using statement. To scale the y-value (the second column) with -1, use
plot "experiment.dat" using 1:(-1*$2)
You don't need to multiply the column by minus one, you can simply use:
p "experiment.dat" u 1:(-$2)
At least with version 5.4 this works fine.
As shown, commands and keywords can also be abbreviated, here down to their initial letter (p for plot, u for using).
I drew a dotplot (using dotPlot() from the seqinr package) of 2 FASTA sequences and I need to extract some (x, y) values from the plot.
The dotPlot() output is an image.
A generic dotplot may look like this one.
For example, I need the start and end values of the local alignment, which are represented by the purple lines.
Here is an example:
library(seqinr)  # provides dotPlot()
l <- 30
seq1 <- paste(sample(c("A","G","T","C"), l, replace=TRUE))
seq2 <- paste(sample(c("A","G","T","C"), l, replace=TRUE))
dotPlot(seq1,seq2, wsize = 2, wstep = 1, nmatch = 2, col = c("white", "green"), xlab = deparse(substitute(seq1)), ylab = deparse(substitute(seq2)))
locator(n=2, type="p")
$x
[1] 27.18720 31.23263
$y
[1] 20.45222 24.65726
So I want the exact position of the 2 circled points, and as you can see, locator() gives decimal values.
I could use ceiling() or round(), but I might get an approximation error.
I need the integer value of the point I clicked on, basically the plotted point nearest to where I clicked.
It would be perfect to use identify(), which works with "normal" plots and gives back a vector with the plotted value closest to your click, but it doesn't work on the dotPlot() output (the problem seems to be that, unlike locator(), it doesn't work on image output).
Any possible solution would be welcome, including using dotter in the shell or Python. Thanks.
As you have mentioned, identify() doesn't work since it needs a plot, not an image. One possible solution is to call image() after plot(type="n", ...), but this requires changing the dotPlot function's source code. Another, more elegant solution is to use the lattice package and panel.identify, the grid equivalent of identify.
Here is an example, where I select some points (6 to 15):
library(lattice)
dotplot(y~x,data.frame(x=letters,y=letters))
trellis.focus("panel", 1, 1)
> panel.identify()
[1] 6 7 8 9 10 11 12 13 14 15
Have a look at evolvedmicrobe/dotplot on github
https://github.com/evolvedmicrobe/dotplot/blob/master/R/plotters.R
It provides mkDotPlotDataFrame. With this you can better get coordinates between matches, like with identify.
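For the simpler route the question mentions, snapping the locator() output to whole positions, here is a rough sketch. It assumes each cell of the dot plot is centred on an integer sequence position, which I have not checked against the dotPlot() source. The click coordinates are the ones posted in the question:

```r
# Coordinates as returned by locator(n = 2) in the question.
clicks <- list(x = c(27.18720, 31.23263),
               y = c(20.45222, 24.65726))

# Nearest integer position for each click (assumes cells centred on integers).
snapped <- data.frame(x = round(clicks$x), y = round(clicks$y))
snapped
```

If clicks can land outside the plotted sequence range, the snapped values would still need to be checked against the sequence lengths.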
I have a data frame with three columns and I'd like to make an image/heatmap of the data.
The three columns are pe, vix, and ret, with pe and vix being x and y and ret being z.
There are 220 rows in the data frame, so I'd like to bin the data if possible. The ranges are below.
Any suggestions for how to bin the x and y data and also create a matrix for use in image()?
> range(matr$pe)
[1] 13.32 44.20
> range(matr$vix)
[1] 10.42 59.89
> range(matr$ret)
[1] -0.09274936 0.04693118
> class(matr)
[1] "data.frame"
> head(matr)
     pe   vix          ret
1 20.86 13.16 -0.002931561
2 20.46 12.53 -0.003546889
3 20.52 12.42  0.006339165
4 20.61 13.47  0.009683174
5 20.57 11.26 -0.002666668
6 20.81 11.73  0.002895003
Here's what I ended up doing. I used the interp() function in the akima package to create the appropriately binned matrix object. It seems to do the work of binning and 'matricizing' the data frame. On a side note, in order to make the heatmap WITH a legend, I ended up using the image.plot() function from the fields package. Here's the code:
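library(akima)   # interp()
library(fields)  # image.plot()
# Assumed call, not shown in the original post: interpolate ret onto a regular pe-by-vix grid
s <- interp(matr$pe, matr$vix, matr$ret)
par(bg = 3)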
image.plot(s,xlab="P/E Ratio", ylab="VIX",
main="Contour Map of SPY Returns vs P/E Ratio and Vix")
abline(v=(seq(0,100,5)), col=6, lty="dotted")
abline(h=(seq(0,100,5)), col=6, lty="dotted")
contour(s, add=TRUE)
and resulting product for anyone interested:
Thanks to everyone for their help and suggestions.
You could use e.g. cut like this:
matr$binnedpe<-cut(matr$pe, breaks=10)
matr$binnedvix<-cut(matr$vix, breaks=10)
Next you can use e.g. ddply (from package plyr) to get the means per bin:
binneddata <- ddply(matr, .(binnedpe, binnedvix), summarise, meanret = mean(ret))
Finally, you use this last data.frame to draw your heat map. I haven't tested any of the above, but it should be close enough to get you going.
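A base-R sketch of the same idea that goes straight to a matrix for image(). The data below are synthetic stand-ins drawn within the ranges given in the question, not the real matr, and 10 bins per axis is an arbitrary choice:

```r
# Synthetic stand-in for matr (220 rows, ranges taken from the question).
set.seed(42)
matr <- data.frame(pe  = runif(220, 13.32, 44.20),
                   vix = runif(220, 10.42, 59.89),
                   ret = rnorm(220, mean = 0, sd = 0.02))

# Bin both axes, then average ret within each (pe bin, vix bin) cell.
pe_bin  <- cut(matr$pe,  breaks = 10)
vix_bin <- cut(matr$vix, breaks = 10)
z <- tapply(matr$ret, list(pe_bin, vix_bin), mean)  # 10 x 10 matrix, NA for empty cells

# image() takes the matrix directly; axes here are bin indices, not raw values.
image(x = seq_len(nrow(z)), y = seq_len(ncol(z)), z = z,
      xlab = "P/E bin", ylab = "VIX bin")
```

Empty cells are left as NA and drawn blank, which is common with only 220 points spread over 100 cells.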
You should take a spin through the raster package. In particular, the function rasterFromXYZ() should do most of what you want. It's pretty easy, either with the base graphics tools or the raster package, to set up a 'heatmap' color range for the raster object.