How to spread points in boxplot in R?

I am working with a data distribution that has the following points.
input<-read.table("infile",header=TRUE,sep="\t")
table(input)
0.786333 1 1.04453 1.06159 1.33277 1.53607 2.25893
49 938 1 1 36 16 166
If I draw a box plot of it, I get a single line for the lowest datum, highest datum and median.
boxplot(input)
Is there any way to spread the points out by normalization, so that I get a better boxplot with distinct boundaries for the lowest datum, highest datum and median?

You clearly have a bimodal distribution, so I don't think a boxplot is a useful summary here.
A density plot is more useful:
plot(density(zz))
You could also consider a violin plot which is a bit of a mix between a kernel density plot and boxplot.
Using the vioplot package
library(vioplot)
vioplot(zz)
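
To see why the boxplot collapses, the values can be rebuilt from the table in the question. This is a minimal sketch in base R: zz is assumed to be the numeric vector behind that table (the answer does not define it), reconstructed here with rep().

```r
# Rebuild the data from the frequency table shown in the question.
# Assumption: zz is the single numeric column that table(input) summarised.
vals   <- c(0.786333, 1, 1.04453, 1.06159, 1.33277, 1.53607, 2.25893)
counts <- c(49, 938, 1, 1, 36, 16, 166)
zz <- rep(vals, times = counts)

# More than three quarters of the observations equal 1, so the first
# quartile, median and third quartile coincide and the box has zero height.
q <- quantile(zz, c(0.25, 0.5, 0.75))
print(q)

# A kernel density estimate shows the separate modes instead.
plot(density(zz), main = "Kernel density of zz")
```

Because the interquartile range is zero, rescaling or normalizing the values does not help: the quartiles still fall on the same value after any monotone transformation.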

Related

Creating a 3D Surface Plot from a matrix in R

I have been searching for this quite a while, but cannot find an answer to my problem or a minimum example. I would like to make a 3D-plot of a matrix.
An extract of my data looks like this. There are the years, which I would like to use as X-Axis. There is Y, which I would like to use as Y and I would like to plot z.
Year y z
2000 1 467
2000 2 10678
2000 2 25
...
How can I make this a surface plot?
Best
Have you tried searching for how to plot a surface plot in R? It turns out there's at least a persp function, a package called plot3D, wireframe in lattice and plotly.
For starters, try (from the plot3D package vignette)
library(plot3D)
example(persp3D)
example(surf3D)
example(slice3D)
example(scatter3D)
example(segments3D)
example(image2D)
example(image3D)
example(contour3D)
example(colkey)
example(jet.col)
example(perspbox)
example(mesh)
example(trans3D)
example(plot.plist)
example(ImageOcean)
example(Oxsat)
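
For data in the long Year / y / z layout from the question, one possible route with base R's persp is to reshape z into a matrix first. The sketch below uses made-up data (dat is hypothetical), and averaging duplicated (Year, y) pairs is an assumption, since the question does not say how repeated combinations such as the two "2000 2" rows should be handled.

```r
# Hypothetical long-format data shaped like the extract in the question.
set.seed(1)
dat <- expand.grid(Year = 2000:2005, y = 1:4)
dat$z <- runif(nrow(dat), min = 0, max = 1000)

# Reshape to a matrix: one row per Year, one column per y.
# Assumption: duplicated (Year, y) pairs are collapsed with mean().
zmat <- tapply(dat$z, list(dat$Year, dat$y), mean)

# persp() needs increasing x and y vectors that match the matrix dimensions.
persp(x = as.numeric(rownames(zmat)),
      y = as.numeric(colnames(zmat)),
      z = zmat,
      xlab = "Year", ylab = "y", zlab = "z",
      theta = 30, phi = 25, ticktype = "detailed")
```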

How to control Y-axis units using sm.density.compare()

I have used sm.density.compare to plot 3 density functions for data with values between -90 and +10. The Y axis is labeled "density" and has the range 0 - 1.0 as for proportions or probability.
I then plot 4 density functions for data with values between 0 and 1.0. I get a useful plot and the Y axis still reads "density" but the values are apparently counts and range between 0 and 12 or so.
The function sm.options does not seem to offer control of which you get. I'd like both to be probability or proportions.
I'm new to R but have a substantial history with other software.

Understanding what the kde2d z values mean?

I have two data sets that I am comparing using a kde2d contour plot on a log10 scale.
Here I will use the following example data sets:
library(MASS)  # provides kde2d
b<-log10(rgamma(1000,6,3))
a<-log10((rweibull(1000,8,2)))
density<-kde2d(a,b,n=100)
filled.contour(density,color.palette=colorRampPalette(c('white','blue','yellow','red','darkred')))
This produces the following plot,
Now my question is: what do the z values on the legend actually mean? I know they show where most of the data lies, but 0-15 confuses me. I thought it could be a percentage, but without the log10 scale I have values ranging from 0-1. I have also produced plots with scales 1-1.2 and 1-2 using my real data.
The colors represent the values of the estimated density function, apparently ranging from 0 to 15. Just like with your other question about the odd-looking linear regression, I can relate to your confusion.
You just have to understand that a density's integral over the full domain has to be 1, so you can use it to calculate the probability of an observation falling into a specific region.
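
The point about the integral can be checked numerically. The sketch below reuses the example data from the question and sums the estimated density over the grid. The seed, the padded lims (so the kernel tails are not cut off at the data range), the finer grid and the region a > 0.3 are all choices made for this illustration, not part of the original code.

```r
library(MASS)  # provides kde2d

set.seed(42)   # assumption: fixed seed only for reproducibility
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))

# Pad the grid limits beyond the data range so almost all mass is captured.
lims <- c(range(a) + c(-0.5, 0.5), range(b) + c(-0.5, 0.5))
dens <- kde2d(a, b, n = 200, lims = lims)

# Area of one grid cell.
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])

# Riemann sum of the density over the whole grid: close to 1,
# even though individual z values are well above 1.
total <- sum(dens$z) * dx * dy
print(total)
print(max(dens$z))

# Probability of a region, here a > 0.3 (an arbitrary illustrative cut).
p_region <- sum(dens$z[dens$x > 0.3, ]) * dx * dy
print(p_region)
```

The z values are large because the data occupy a narrow range on the log10 scale: the same total probability of 1 is spread over a small area, so the height has to be large.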

jfreechart xy series plot data points

In a JFreeChart XYSeries I want to draw the lines using a very dense set of points in order to show curves with precision, but I want to draw the point markers at a lower density. For example, I have 100 data points, each 1 unit apart on the x axis, but I only want to draw a marker every 5 units. I do, however, want the line to connect every 1-unit point so that the curve is shown at full density.
Is this possible?
You can subclass XYLineAndShapeRenderer and override getItemShapeVisible(int series, int item).

Density plot in R, ggplot2

I am trying to plot and compare two sets of decimal numbers between 0 and 1 using the R package ggplot2. When I plotted using geom="density" in qplot, I noticed that the density curve extends past 1.0. I would like a density plot that does not exceed the value range of the data, i.e., all the area stays between 0 and 1.
Is it possible to plot the density between the values 0 and 1, without going past 1 or 0? If so, how would I accomplish this? I need the area of the two plots to be equal between 0 and 1, the range of the data.
Here is the code I used to generate the plots.
Right: qplot(precision,data = compare, fill=factor(dataset),binwidth = .05,geom="density", alpha=I(0.5))+ xlim(-1,2)
Left: qplot(precision,data = compare, fill=factor(dataset),binwidth = .05,geom="density", alpha=I(0.5))
You might consider using a different tool to estimate the density (the built-in density functions do not consider bounds), then use ggplot2 to plot the estimated densities. The logspline package has tools that estimate densities (using a different algorithm than density does), and you can tell its functions that your density is bounded between 0 and 1 so that the bounds are taken into account in the estimate. Then use ggplot2 (or other code) to compare the estimated densities.
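
A minimal sketch of that approach, assuming the logspline and ggplot2 packages are installed. The compare data frame here is a made-up stand-in for the one in the question (simulated with rbeta), and the evaluation grid and the choice of geom_area are illustration choices.

```r
library(logspline)  # logspline(), dlogspline()
library(ggplot2)

# Hypothetical stand-in for the question's data: two sets of values in (0, 1).
set.seed(1)
compare <- data.frame(
  precision = c(rbeta(200, 2, 5), rbeta(200, 5, 2)),
  dataset   = rep(c("A", "B"), each = 200)
)

# Evaluate each bounded density estimate on a common grid over [0, 1].
grid <- seq(0, 1, length.out = 201)
est <- do.call(rbind, lapply(split(compare, compare$dataset), function(d) {
  # lbound/ubound tell logspline that the support is [0, 1].
  fit <- logspline(d$precision, lbound = 0, ubound = 1)
  data.frame(dataset   = d$dataset[1],
             precision = grid,
             density   = dlogspline(grid, fit))
}))

# Plot the precomputed densities; nothing is drawn outside [0, 1].
p <- ggplot(est, aes(x = precision, y = density, fill = dataset)) +
  geom_area(alpha = 0.5, position = "identity")
print(p)
```

Since each estimate is a density on [0, 1], both curves enclose an area of 1 within that range, which is the equal-area comparison the question asks for.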
