Determine the proportion of the data information in R

Suppose I have a plot like the following:
I want to find the portion of the data where the majority (say 90%) of the points lie. For example, I want to mark off a region of the plot like this:
where the points lying inside the black frame account for 90% of the data.
How can I do this in R?
Edited in response to a comment:
What if I have the following plot? Here the majority of the data probably starts from 0.
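One possible approach, sketched here under the assumption that the plot shows a single numeric variable, is to use quantile() to find the cut-offs that enclose 90% of the values. The data below are simulated purely for illustration and the variable names are hypothetical. The sketch covers both cases in the question: a central 90% band, and a one-sided 90% region that starts at the low end.

```r
set.seed(42)
# Simulated, right-skewed values standing in for the plotted data (assumption).
x <- rexp(1000, rate = 1)

# Central 90%: cut off the lowest 5% and the highest 5% of the values.
central <- quantile(x, probs = c(0.05, 0.95))
inCentral <- x >= central[1] & x <= central[2]

# One-sided 90%: everything from the minimum up to the 90th percentile.
upper <- quantile(x, probs = 0.90)
inLower <- x <= upper

# Share of the data inside each region (both should be close to 0.9).
shareCentral <- mean(inCentral)
shareLower <- mean(inLower)

# Mark both regions on a histogram of the values.
hist(x, breaks = 50, main = "Regions holding 90% of the data", xlab = "x")
abline(v = central, lty = 2)  # dashed lines: central 90% band
abline(v = upper, lty = 3)    # dotted line: upper limit of the one-sided region
```

The logical vectors inCentral and inLower can be used to subset the data, for example x[inCentral]. For a two-dimensional scatter plot, per-axis quantiles would not in general enclose exactly 90% of the points, so a different technique would be needed there.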

Related

Use R to impose bubble data on an image

I want to take an image, like this:
And I want to superimpose data bubbles on it, so it would look like this:
I want to do this in R (because I know R)
Let's imagine the bubbles represent IQ (or whatever) and I have the relevant numbers in a spreadsheet next to the species. How do I get ggplot (let's say) to know where on the image to put the bubbles?
Thank you!

How to create bins in a reliability diagram

I created a logistic/logit model with a binomial response variable using
model <- glm(response ~ predictor1 + predictor2 + ..., family = binomial)
and then I used the predict function to create a new data frame
outcome <- data.frame(predict(model, newdata = IndependentDataSet, type = "response"), as.numeric(as.character(Independent$ResponseVariable)))
names(outcome) <- c("Pr", "Obs")
I can use one of the following functions
plot(verify(outcome$Obs, outcome$Pr), CI = TRUE)
attribute(verify(outcome$Obs, outcome$Pr))
to create a plot that looks like this
or
reliability.plot(verify(outcome$Obs, outcome$Pr))
from
library(verification)
to create a reliability diagram. I am wondering how I can set the bins at specific values. For example, the model that I am evaluating is based on a climatology of 19% (0.19), and I want a bin at (1/3)*climatology, one at climatology, and then bins increasing in steps of (2/3)*climatology. How can I do this?
Additionally, I have seen bins drawn as circles whose size is proportional to the percentage of the data in each bin. Does anyone know how to make a more aesthetically pleasing reliability diagram in R? Any recommendations are welcome.
This is how I would like my diagrams to appear
The easiest approach may be to use
trace("attribute.default",edit=TRUE)
or the equivalent call for whichever other function you want to change.
This opens the function's source code in an editor so that you can modify it. The changes affect only the current R session.
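As an alternative to editing the package source, the binning can also be done by hand in base R. The following is only a sketch: it assumes the values described in the question are meant as bin edges, it uses simulated forecasts in place of the real outcome data frame, and all variable names other than Pr and Obs are hypothetical. It draws each bin as a circle whose area is proportional to the share of data in that bin.

```r
set.seed(1)
# Simulated stand-in for the `outcome` data frame (assumption):
# Pr = forecast probability, Obs = observed 0/1 response.
outcome <- data.frame(Pr = runif(500))
outcome$Obs <- rbinom(500, size = 1, prob = outcome$Pr)

climatology <- 0.19

# Assumed bin edges: 0, climatology/3, climatology, then steps of
# (2/3)*climatology, with 1 appended as the final edge.
upperEdges <- seq(climatology, 1, by = (2 / 3) * climatology)
binEdges <- unique(c(0, climatology / 3, upperEdges, 1))

# Assign every forecast to a bin.
outcome$bin <- cut(outcome$Pr, breaks = binEdges, include.lowest = TRUE)

# Per-bin mean forecast, observed relative frequency, and count.
reliability <- data.frame(
  meanForecast = as.vector(tapply(outcome$Pr, outcome$bin, mean)),
  observedFreq = as.vector(tapply(outcome$Obs, outcome$bin, mean)),
  n = as.vector(table(outcome$bin))
)
reliability <- reliability[reliability$n > 0, ]  # drop empty bins

# Circle radius ~ sqrt(share), so circle area is proportional to the share.
symbols(reliability$meanForecast, reliability$observedFreq,
        circles = sqrt(reliability$n / sum(reliability$n)),
        inches = 0.25, xlim = c(0, 1), ylim = c(0, 1),
        xlab = "Forecast probability",
        ylab = "Observed relative frequency")
abline(0, 1, lty = 2)             # perfect reliability
abline(h = climatology, lty = 3)  # climatology reference line
```

If the values are meant as bin centres rather than edges, the breaks passed to cut() would need to be placed between them instead.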

Select multiple points on scatterplot, save selection to new table

I have a very large data set (~250,000 records) that I have used to create a linear model. I have plotted predicted vs. actual.
I tried to use identify() to select the two clusters of values near the center of the graph and coord() to identify them. There are a few problems here: 1) there are many, many more points in those clusters than I can click on and identify individually, and 2) I need to know ALL of them, select all of them somehow without selecting any others, and subset my data to just those points.
This model was created using a satellite image paired with ancillary spatial data. Each entry in the table corresponds to a particular point on the map. I need to identify where these two clusters are located on the map. My data frame includes the FID (which I can use to link back to the map), the original predictor, the response, and my predicted values.
I appreciate any help!
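One way to select every point in a cluster without clicking on each one is to describe the cluster by a bounding box and subset the data frame with a logical condition. The sketch below is only an illustration: the data are simulated, the column names actual and predicted and the box limits are hypothetical, and only FID comes from the question.

```r
set.seed(7)
# Simulated stand-in for the real table (assumption): FID links back to the map.
modelData <- data.frame(FID = seq_len(1000),
                        actual = runif(1000, min = 0, max = 100))
modelData$predicted <- modelData$actual + rnorm(1000, sd = 5)

# Hypothetical bounding box around one cluster, in plot coordinates.
xRange <- c(40, 50)  # limits on the "actual" axis
yRange <- c(45, 60)  # limits on the "predicted" axis

# TRUE for every row whose point falls inside the box.
inBox <- with(modelData,
              actual >= xRange[1] & actual <= xRange[2] &
              predicted >= yRange[1] & predicted <= yRange[2])

# All rows in the cluster, including the FID needed to locate them on the map.
clusterPoints <- modelData[inBox, ]

# Visual check: draw the box and highlight the selected points.
plot(modelData$actual, modelData$predicted, pch = ".",
     xlab = "Actual", ylab = "Predicted")
rect(xRange[1], yRange[1], xRange[2], yRange[2], border = "red")
points(clusterPoints$actual, clusterPoints$predicted, col = "red", pch = ".")
```

In an interactive session, the box corners could be read from two mouse clicks with locator(2) instead of being typed in. The same subsetting would be repeated with a second box for the second cluster.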

Plot map of points spread into grid sub-samples

I'm trying to generate a map that looks like this in R:
The boxes represent individual observations, while the colors represent data pertaining to those individual observations. Anyone have any idea how this might be accomplished?

How to plot graph when interval and data is given

Suppose I have obtained some data like this:
Size-Range Percentage
[1-3] 2%
[3-8] 6%
[8-20] 10%
[20-50] 30%
[50-100] 80%
[100-200] 99.99%
Suppose I ran an algorithm on many data files and got this output.
The first column shows the algorithm's time and the second column shows the percentile of data processed.
I just want to plot this data as a graph.
Please suggest how I can do this using gnuplot or any other tool.
You can draw a histogram or a bar plot in R. For a good tutorial on this, see Producing Simple Graphs with R.
For example, suppose you have the data in a CSV file which looks like this:
Lo,Hi,Percentage
1,3,2
3,8,6
...
100,200,99.99
Then, to load the CSV file in R you could write:
> dataSample <- read.csv(file="C:/sample.csv", header=TRUE, sep=",")
Note that on *NIX machines you should replace the path to your file with something like "/home/username/path/to/sample.csv". To check the values, simply type:
> dataSample
From here, you can use the data to plot your graphs as in the tutorial.
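As a rough illustration of the plotting step, the sketch below builds the data frame directly from the six rows listed in the question instead of reading a file, so that it runs on its own. The column names follow the CSV layout above; the labels, axis titles, and helper variable names are assumptions made for the example.

```r
# Values copied from the table in the question; in practice this data frame
# would come from read.csv() as shown above.
dataSample <- data.frame(
  Lo = c(1, 3, 8, 20, 50, 100),
  Hi = c(3, 8, 20, 50, 100, 200),
  Percentage = c(2, 6, 10, 30, 80, 99.99)
)

# Build one label per bar from the range limits, e.g. "[1-3]".
rangeLabels <- paste0("[", dataSample$Lo, "-", dataSample$Hi, "]")

# barplot() draws the bars and returns the x position of each bar's midpoint.
barMidpoints <- barplot(
  dataSample$Percentage,
  names.arg = rangeLabels,
  ylim = c(0, 110),
  xlab = "Size range",
  ylab = "Percentage",
  main = "Percentage per size range"
)

# Print each value just above its bar.
text(barMidpoints, dataSample$Percentage,
     labels = dataSample$Percentage, pos = 3)
```

A bar plot is used here rather than hist() because the percentages are already computed per range; hist() expects raw observations and would do its own binning.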
