I have a large file that contains two-column data (X, Y), with around 20 million records. I want to graph the file in ggplot or even with the base plot function (a scatter plot). I used to read the file into R with the read command and store the whole data set in a data frame; however, at the current size R can't read the file. I managed to plot the data in gnuplot by using the every command to reduce the size, but I'd like to graph the file with R.
How can I read the large file and plot it? I think reading the file line by line will not help because I want to graph the values. I'm not aware of any command like every in R.
Thank you for any suggestions.
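For illustration, the thinning that gnuplot's every performs can be done while the file is streamed, so the full 20 million rows never sit in memory at once. The sketch below is in Python rather than R and assumes a whitespace-separated two-column file; the function name read_every and the sample data are made up for the example.

```python
import io
from itertools import islice

def read_every(lines, step, start=0):
    """Keep every `step`-th data line, beginning at line `start` (0-based)."""
    xs, ys = [], []
    # islice pulls lines lazily, so only the kept rows are ever stored.
    for line in islice(lines, start, None, step):
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        xs.append(float(fields[0]))
        ys.append(float(fields[1]))
    return xs, ys

# Small in-memory stand-in for the large file (assumption: "X Y" per line).
sample = io.StringIO("".join(f"{i} {i * i}\n" for i in range(10)))
xs, ys = read_every(sample, step=3)
print(xs)  # [0.0, 3.0, 6.0, 9.0]
print(ys)  # [0.0, 9.0, 36.0, 81.0]
```

With a real file, the same function can be given an open file handle instead of the StringIO object, and the thinned lists passed to whatever plotting tool is preferred.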
Related
Firstly, I am very, very new to Python. I am trying to come close to a product I make in Excel charts by plotting X and Y. I want to be able to export the data from Excel to a text file and then read it into Python to produce the chart. In Excel I have the line segments, defined by X and Y values, separated by empty cells in columns, with the other segments in other columns. For right now I am just exporting a single column, with the objective of having more columns later. So if the image shows up in this post you will see that the different line segments are all connected by straight lines, sort of like an Etch-a-Sketch. I got errors when I had blank lines in the text file where I had empty cells in the Excel data. Any guidance would be greatly appreciated.
Attempted chart
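One possible way to handle the blank lines is to treat each one as the end of a segment, so that segments are kept apart instead of being joined or causing a parse error. The sketch below assumes the exported text file has whitespace- or tab-separated X and Y values on each line; the names read_segments and with_gaps are made up for the example.

```python
import math

def read_segments(lines):
    """Split whitespace-separated X Y rows into segments at blank lines."""
    segments, current = [], []
    for line in lines:
        fields = line.split()
        if not fields:            # a blank line ends the current segment
            if current:
                segments.append(current)
                current = []
            continue
        current.append((float(fields[0]), float(fields[1])))
    if current:                   # keep the last segment if the file has no trailing blank line
        segments.append(current)
    return segments

def with_gaps(segments):
    """Flatten segments into x and y lists with a NaN between segments."""
    xs, ys = [], []
    for i, seg in enumerate(segments):
        if i > 0:
            xs.append(math.nan)
            ys.append(math.nan)
        xs.extend(x for x, _ in seg)
        ys.extend(y for _, y in seg)
    return xs, ys

text = "0 0\n1 1\n\n5 2\n6 3\n"
segments = read_segments(text.splitlines())
print(segments)  # [[(0.0, 0.0), (1.0, 1.0)], [(5.0, 2.0), (6.0, 3.0)]]
```

With matplotlib, either plot each segment in its own plot call or pass the NaN-separated lists from with_gaps to a single call; matplotlib leaves a gap in the line wherever it meets a NaN.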
I am attempting to generate heatmaps from a data file I've been generating. I could re-format the data however I like, but for the time being, let's say it's a list of 16 numbers that I'd like put into a 4x4 heatmap. However, I have many sets of these 16 numbers stored sequentially in the same file, and hope to eventually animate them together (something I am more comfortable with, and which will come later).
However, for the time being, I cannot find a way to get GnuPlot to select only certain sections of the data file while still plotting properly. A loose example of what I would've thought it WOULD look like:
plot "SortedData.txt" every ::0::15 w image
or:
splot "SortedData.txt" every ::0::15
Both give me errors and fail to render. I could label the data values with an x-y coordinate if needed, but the task is fairly repetitive: I just want the first 16 points mapped, and then the ability to iterate once and have the next 16 points mapped on their own, etc. Stripping the data file to just the first 16 points and removing the 'every' command confirms that it can plot, but trying to specify even just the first 16 manually messes it up.
Can anyone point me in the right direction? The "every" command has been fairly nebulous and seems largely incompatible with images / 3-D data. Also, I am running on Windows, so piping in Linux commands is something I'd like to avoid.
Thanks!
edit: Here are four example frames of the data. Reformatting it to, say, present it as a matrix or label it with pixel addresses is something I can do if needed.
0.000000 -49.314654 -44.425234 -46.613870 -48.494232 -46.884806 -46.553071 -46.555624 -43.755972 -47.817691 -42.481637 -46.819782 -44.347586 -49.487077 -47.291832 -45.140636 -47.945934
0.839906 -49.325396 -44.425493 -46.613214 -48.501283 -46.887236 -46.550858 -46.555285 -43.752786 -47.814706 -42.453793 -46.814333 -44.329492 -49.493501 -47.289394 -45.133555 -47.944045
1.679721 -49.336151 -44.425787 -46.612573 -48.508348 -46.889684 -46.548645 -46.554958 -43.749626 -47.811707 -42.425757 -46.808866 -44.311344 -49.499930 -47.286951 -45.126476 -47.942155
2.519466 -49.346920 -44.426117 -46.611946 -48.515427 -46.892152 -46.546431 -46.554641 -43.746492 -47.808695 -42.397525 -46.803382 -44.293140 -49.506365 -47.284501 -45.119398 -47.940264
It seems that each line in your data file has 17 elements. I assume that the first column is not part of your image data. I would format the remaining 16 values as a 4x4 matrix, with each frame separated by two blank lines:
-49.314654 -44.425234 -46.613870 -48.494232
-46.884806 -46.553071 -46.555624 -43.755972
-47.817691 -42.481637 -46.819782 -44.347586
-49.487077 -47.291832 -45.140636 -47.945934


-49.325396 -44.425493 -46.613214 -48.501283
-46.887236 -46.550858 -46.555285 -43.752786
-47.814706 -42.453793 -46.814333 -44.329492
-49.493501 -47.289394 -45.133555 -47.944045


-49.336151 -44.425787 -46.612573 -48.508348
-46.889684 -46.548645 -46.554958 -43.749626
-47.811707 -42.425757 -46.808866 -44.311344
-49.499930 -47.286951 -45.126476 -47.942155


-49.346920 -44.426117 -46.611946 -48.515427
-46.892152 -46.546431 -46.554641 -43.746492
-47.808695 -42.397525 -46.803382 -44.293140
-49.506365 -47.284501 -45.119398 -47.940264
You can then visualize each frame with the command
plot "data.dat" index FRAME matrix w image
where FRAME is 0, 1, 2 or 3.
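The reformatting step could be scripted rather than done by hand. The sketch below assumes each input line holds one leading value (taken here to be a time stamp) followed by 16 image values, and emits 4x4 blocks separated by two blank lines; the function name frames_to_matrix_text and the sample values are made up for the example.

```python
def frames_to_matrix_text(lines, size=4):
    """Turn 'time v1 ... v16' rows into size x size blocks separated by two blank lines."""
    blocks = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore empty input lines
        values = fields[1:]  # drop the first column (assumed not to be image data)
        if len(values) != size * size:
            raise ValueError(f"expected {size * size} values, got {len(values)}")
        # Values stay as strings so the original digits are preserved exactly.
        rows = [" ".join(values[r * size:(r + 1) * size]) for r in range(size)]
        blocks.append("\n".join(rows))
    # "\n\n\n" ends the last row of a block and adds two blank lines before the next.
    return "\n\n\n".join(blocks) + "\n"

# Two made-up frames standing in for lines of the real data file.
sample = [
    "0.0 " + " ".join(str(i) for i in range(16)),
    "0.8 " + " ".join(str(i) for i in range(16, 32)),
]
text = frames_to_matrix_text(sample)
print(text)
```

Writing the returned text to data.dat gives a file in the layout described above, so each frame can be selected with index.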
Hello everybody out there using R,
When putting multiple plots with thousands of data points into a single PDF file, this file can get huge and take a long time to open.
The following post describes exactly the same problem in Matplotlib, as well as a nice fix for it:
Matplotlib: multipage PDF with rasterized plots
What is particularly nice about it is that it rasterizes only the points, not the labels.
http://www.astrobetter.com/blog/2014/01/17/slim-down-your-bloated-graphics/ contains a nice example of it.
I am now looking for a similar solution in R.
I have to write a report about a model that I have built using NetLogo. I have made many plots of the model's variables and I'd like to extract them and put them in my report to show how these variables vary over time. These variables represent the effect of changing some parameters. So, I'd like to obtain a better plot than NetLogo's, because NetLogo's plots don't have numbered axes and I'd like plots with numbered axes.
It would be amazing to put the plot in a Word document or in a PowerPoint document.
See the NetLogo dictionary entry for export-all-plots. The easiest way to access this is the menu File > Export > Export All Plots ... and then choose a folder/directory to store the file and a file name. You will then get a file in CSV format that you can open in your graphing package (e.g., R, Excel).
I'm doing a density comparison in R using the sm package (sm.density.compare). Is there any way I can get a mathematical description of the graph, or at least a table of points, rather than a plot back? I would like to plot the resulting graphs in a different application, but need the data to do so.
Thanks a lot for the help,
culicidae
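To illustrate what such a table could look like, a density curve is just a list of (x, density) pairs evaluated on a grid. In R, the base density() function returns these as the x and y components of its result. The Python sketch below computes a Gaussian kernel density estimate by hand; it is not the sm package's implementation, and the rule-of-thumb bandwidth, the grid limits, the function name gaussian_kde_table and the sample values are all assumptions made for the example.

```python
import math
import statistics

def gaussian_kde_table(sample, n_points=50):
    """Return (x, density) pairs for a Gaussian kernel density estimate."""
    n = len(sample)
    sd = statistics.stdev(sample)
    # Normal-reference rule-of-thumb bandwidth (assumption; sm may choose differently).
    h = sd * (4.0 / (3.0 * n)) ** 0.2
    # Evaluate on an even grid reaching 3 bandwidths past the data range.
    lo, hi = min(sample) - 3 * h, max(sample) + 3 * h
    step = (hi - lo) / (n_points - 1)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    table = []
    for i in range(n_points):
        x = lo + i * step
        density = norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in sample)
        table.append((x, density))
    return table

# Made-up sample; one table per group would be computed for a comparison.
table = gaussian_kde_table([1.0, 2.0, 2.5, 3.0, 4.5])
for x, d in table[:3]:
    print(f"{x:.4f},{d:.6f}")
```

Each row can be written to a CSV file and plotted in any other application.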