How to plot a graph when intervals and data are given

Suppose I have obtained some data that looks like this:
Size-Range Percentage
[1-3] 2%
[3-8] 6%
[8-20] 10%
[20-50] 30%
[50-100] 80%
[100-200] 99.99%
Suppose I ran an algorithm on many data files and got this output.
The first column shows the time of the algorithm and the second column shows the percentile of data processed.
I just want to plot this data as a graph.
Please suggest how I can do this using gnuplot or any other tool.

You can draw a histogram or a barplot in R (more details about it here). For a good tutorial in this direction, please see Producing Simple Graphs with R.
For example, suppose you have the data in a CSV file which looks like this:
Lo,Hi,Percentage
1,3,2
3,8,6
...
100,200,99.99
Then, to load the CSV file in R you could write:
> dataSample <- read.csv(file="C:/sample.csv", header=TRUE, sep=",")
Note that on *NIX machines you should replace the path to your file with something like "/home/username/path/to/sample.csv". To check the values, simply type:
> dataSample
From here, you can use the data to plot your graphs like in the tutorial.
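Since the question also mentions gnuplot, here is a minimal sketch in Python (not R) that turns the Lo,Hi,Percentage layout above into the three columns gnuplot's "with boxes" style accepts: box center, height, and box width. The names interval_boxes and to_gnuplot_rows are hypothetical helpers made up for this illustration, and the sample rows are the values from the question.

```python
import csv
import io

# Sample rows taken from the question, in the Lo,Hi,Percentage layout.
SAMPLE_CSV = """Lo,Hi,Percentage
1,3,2
3,8,6
8,20,10
20,50,30
50,100,80
100,200,99.99
"""


def interval_boxes(csv_text):
    """Return a (center, percentage, width) tuple for each interval row."""
    boxes = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lo, hi = float(row["Lo"]), float(row["Hi"])
        # A box centered at the midpoint with width hi - lo spans [lo, hi].
        boxes.append(((lo + hi) / 2, float(row["Percentage"]), hi - lo))
    return boxes


def to_gnuplot_rows(boxes):
    """Format the boxes as whitespace-separated rows, one box per line."""
    return "\n".join(f"{c:g} {p:g} {w:g}" for c, p, w in boxes)


if __name__ == "__main__":
    print(to_gnuplot_rows(interval_boxes(SAMPLE_CSV)))
```

If the printed rows are saved to a file (say boxes.dat, a hypothetical name), something like plot "boxes.dat" using 1:2:3 with boxes should draw one bar per interval in gnuplot. Because the intervals widen quickly, set logscale x may make the narrow bars easier to see.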

Related

Determine the proportion of the data information in R

Suppose I have a plot like the following:
I want to get the portion of the plot where the majority (say 90%) of the data lies. For example, I want to isolate a region of the plot, something like:
in which the points lying inside the black frame make up 90% of the data.
How can I do this in R?
Edited for comment:
What if I have the following plot? The majority of the data probably starts from 0.

How can I generate heatmaps from specific sections of data using GnuPlot? ('splot', 'every', 'using' incompatibilities etc.)

I am attempting to generate heatmaps from a data file I've been generating. I could re-format the data however I like, but for the time being, let's say it's a list of 16 numbers that I'd like put into a 4x4 heatmap. However, I have many sets of these 16 numbers sequentially in the same file, and hope to eventually animate them together (something I am more comfortable with, and will come later)
However, for the time being, I cannot find a way to get GnuPlot to select only certain sections of the data file while still plotting properly. A loose example of what I would've thought it WOULD look like:
plot "SortedData.txt" every ::0::15 w image
or:
splot "SortedData.txt" every ::0::15
Both give me errors and fail to render. I could label the data values with an x-y coordinate if needed, but the task is fairly repetitive: I just want the first 16 points mapped, and then the ability to iterate once and have the next 16 points mapped on their own, etc. Stripping the data file to just the first 16 points and removing the 'every' command confirms that it can plot, but trying to specify even just the first 16 manually messes it up.
Can anyone point me in the right direction? The "every" command has been fairly nebulous and seems largely incompatible with images / 3-D data. Also, I am running on Windows, so piping in linux commands is something I'd like to avoid.
Thanks!
edit: Here are 4 example frames of the data. Reformatting it to, say, present it as a matrix or label it with pixel addresses is something I can do if needed.
0.000000 -49.314654 -44.425234 -46.613870 -48.494232 -46.884806 -46.553071 -46.555624 -43.755972 -47.817691 -42.481637 -46.819782 -44.347586 -49.487077 -47.291832 -45.140636 -47.945934
0.839906 -49.325396 -44.425493 -46.613214 -48.501283 -46.887236 -46.550858 -46.555285 -43.752786 -47.814706 -42.453793 -46.814333 -44.329492 -49.493501 -47.289394 -45.133555 -47.944045
1.679721 -49.336151 -44.425787 -46.612573 -48.508348 -46.889684 -46.548645 -46.554958 -43.749626 -47.811707 -42.425757 -46.808866 -44.311344 -49.499930 -47.286951 -45.126476 -47.942155
2.519466 -49.346920 -44.426117 -46.611946 -48.515427 -46.892152 -46.546431 -46.554641 -43.746492 -47.808695 -42.397525 -46.803382 -44.293140 -49.506365 -47.284501 -45.119398 -47.940264
It seems that each line in your data file has 17 elements. I assume that the first column is not part of your image data. I would format the remaining 16 values as a 4x4 matrix, with each frame separated by two blank lines:
-49.314654 -44.425234 -46.613870 -48.494232
-46.884806 -46.553071 -46.555624 -43.755972
-47.817691 -42.481637 -46.819782 -44.347586
-49.487077 -47.291832 -45.140636 -47.945934


-49.325396 -44.425493 -46.613214 -48.501283
-46.887236 -46.550858 -46.555285 -43.752786
-47.814706 -42.453793 -46.814333 -44.329492
-49.493501 -47.289394 -45.133555 -47.944045


-49.336151 -44.425787 -46.612573 -48.508348
-46.889684 -46.548645 -46.554958 -43.749626
-47.811707 -42.425757 -46.808866 -44.311344
-49.499930 -47.286951 -45.126476 -47.942155


-49.346920 -44.426117 -46.611946 -48.515427
-46.892152 -46.546431 -46.554641 -43.746492
-47.808695 -42.397525 -46.803382 -44.293140
-49.506365 -47.284501 -45.119398 -47.940264
You can then visualize each frame with the command
plot "data.dat" index FRAME matrix w image
where FRAME is 0, 1, 2 or 3.
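The reformatting step can be scripted without any Linux tools. Below is a minimal sketch in Python, assuming (as the answer does) that the first column of each line is not image data and that the remaining 16 values fill the 4x4 matrix row by row. The function name frames_to_matrix_blocks is hypothetical, and the two sample lines are copied from the question.

```python
# Two frames copied from the question: 17 values per line, where the
# first value is assumed to be a non-image column (e.g. a timestamp).
SAMPLE_LINES = [
    "0.000000 -49.314654 -44.425234 -46.613870 -48.494232 -46.884806 "
    "-46.553071 -46.555624 -43.755972 -47.817691 -42.481637 -46.819782 "
    "-44.347586 -49.487077 -47.291832 -45.140636 -47.945934",
    "0.839906 -49.325396 -44.425493 -46.613214 -48.501283 -46.887236 "
    "-46.550858 -46.555285 -43.752786 -47.814706 -42.453793 -46.814333 "
    "-44.329492 -49.493501 -47.289394 -45.133555 -47.944045",
]


def frames_to_matrix_blocks(lines, side=4):
    """Turn each input line into a side x side block of text rows.

    Blocks are separated by two blank lines, which is what gnuplot's
    "index" keyword uses to tell data sets apart.
    """
    blocks = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines in the input
        values = fields[1:]  # drop the first column (assumed not image data)
        if len(values) != side * side:
            raise ValueError(f"expected {side * side} values, got {len(values)}")
        # Values are kept as strings so the original formatting is preserved.
        rows = [" ".join(values[i:i + side]) for i in range(0, len(values), side)]
        blocks.append("\n".join(rows))
    return "\n\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(frames_to_matrix_blocks(SAMPLE_LINES), end="")
```

Writing the returned text to data.dat should give a file that the plot command above can read, with FRAME selecting one block at a time. If the pixels are actually stored column by column, the rows would need to be transposed first.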

How to create bins in a reliability diagram

I created a logistic/logit model with a binomial response variable using
model <- glm(response~predictor1+predictor2+...)
and then I used the predict function to create a new data frame
outcome <-data.frame(predict(model,newdata=IndependentDataSet,type="response"),as.numeric(as.character(Independent$ResponseVariable)))
names(outcome) <- c("Pr","Obs")
I can use one of the following functions
plot(verify(data$obs,data$pr),CI=TRUE)
attribute(verify(data$obs,data$pr))
to create a plot that looks like this
or
reliability.plot(verify(data$obs,data$pr))
from
library(verification)
to create a reliability diagram. I am wondering how I can separate the bins based on specific values. For example, the model that I am evaluating is based on a climatology of 19% (0.19), and I want there to be a bin at (1/3)*climatology, one at climatology, and then bins going up in steps of (2/3) of climatology. How can I do this?
Additionally, I have seen the bins represented as circles that are proportional in size to the percent of the data that is at that bin. Does anyone know how to make a more aesthetically pleasing reliability diagram in R? Any recommendations are welcome.
This is how I would like my diagrams to appear
The easiest approach could be to use
trace("attribute.default",edit=TRUE)
or whichever other function.
In this way, you access the source code and edit it. These changes affect only the current R session.
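Whichever function ends up being edited, the bin values described in the question follow a simple pattern: starting at one third of the climatology and stepping by two thirds of it gives (1/3)c, c, (5/3)c, and so on. A minimal sketch of that arithmetic in Python (not R) follows. The function name bin_values is hypothetical, and treating these numbers as bin positions capped at 1 is an assumption about what the question intends.

```python
def bin_values(climatology, upper=1.0):
    """Return climatology/3, climatology, ... in steps of 2/3 * climatology.

    Values above `upper` are dropped, since probabilities cannot exceed 1.
    """
    step = 2 * climatology / 3
    values = []
    k = 0
    while True:
        value = climatology / 3 + k * step
        if value > upper:
            break
        values.append(value)
        k += 1
    return values


if __name__ == "__main__":
    # Climatology of 0.19, as stated in the question.
    print([round(v, 4) for v in bin_values(0.19)])
```

For a climatology of 0.19 this yields eight values, from roughly 0.0633 up to 0.95. The same sequence could then be supplied to whatever binning or threshold argument the plotting function exposes.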

R: getting data (instead of plot) back from sm.density.compare

I'm doing a density comparison in R using the sm package (sm.density.compare). Is there any way I can get a mathematical description of the graph, or at least a table with a number of points, rather than a plot back? I would like to plot the resulting graphs in a different application, but need the data to do so.
Thanks a lot for the help,
culicidae

Plot Large data

I have a large file that contains two columns of data, X and Y, with around 20 million records. I want to graph the file in ggplot or even in a normal plot (scatter plot). I used to read the file in R with the read command and store the whole data set in a data frame; however, at the current size R can't read the file. I managed to plot the data in gnuplot by using the every command to reduce the size, but I'd like to graph the file with R.
How can I read the large file and plot it? I think reading the file line by line will not help because I want to graph the values. I'm not aware of any command like every in R.
Thank you for any suggestions.
