All,
I am using R and exifr to read in a jpeg with a ton of exif data entries (~150 different tags). I use a bunch of those to convert/change the values in the jpeg. After that, I would like to save the new values (in matrix form now) with the same exif data back to a jpeg. So, ideally, I would have two jpegs with different values, but everything else is the same. Is there a way to use exifr to write exif data back to a file? That is, read in a jpeg, manipulate it, and then save it back as jpeg with the exact same metadata?
I have thought about different approaches, but haven't really gotten far in either.
Maybe using the raster package and then exporting it as a geoTIFF, but that would obviously mean a change in file format.
The jpeg package loses the metadata while reading in the image.
Thanks y'all!
PS: There is a similar question about writing metadata, but it's almost 2 years old and only refers to GPS tags.
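For what it's worth, here is a hedged sketch of the workflow I have in mind, not an exifr feature as far as I know: write the modified pixel matrix with the jpeg package, then copy every tag from the original file onto the new one with a standalone ExifTool call. This assumes ExifTool is installed and on the PATH, and the file names are placeholders:

library(exifr)
library(jpeg)

tags <- read_exif("original.jpg")   # the ~150 tags, for inspection/conversion
img  <- readJPEG("original.jpg")    # pixel values as an array

img2 <- 1 - img                     # placeholder for the actual value conversion
writeJPEG(img2, "modified.jpg", quality = 0.95)

## copy all tags from the original onto the new jpeg
system2("exiftool",
        c("-TagsFromFile", "original.jpg", "-all:all",
          "-overwrite_original", "modified.jpg"))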
I'm working with a network dataset from Stanford's SNAP Datasets. SNAP has wrappers for Python and C++ but not for R; however, the data is still usable, since I believe it's a mix of CSV files.
I can read in the .edges file and form an igraph object, but I want to read in the other files, get the attributes, and add those attributes to the igraph object for analysis. I'm just confused about how to work with the .circles, .egofeat, .feat, and .featnames files, since the documentation on the dataset is very scarce. I'm hoping someone has worked with this dataset in R, or even another language, and has pointers to get started.
Thank you!
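A minimal sketch of one way to start, assuming the SNAP ego-network layout (files such as 0.edges, 0.feat and 0.featnames); the ego id and the way the attribute is attached are assumptions, not part of the dataset documentation:

library(igraph)

prefix <- "0"                                   # placeholder ego id

## edge list: two node ids per line
edges <- read.table(paste0(prefix, ".edges"))
g <- graph_from_data_frame(edges, directed = FALSE)

## feature names: one "<index> <name>" entry per line
featnames <- readLines(paste0(prefix, ".featnames"))

## feature matrix: first column is the node id, the rest are 0/1 flags
feat <- read.table(paste0(prefix, ".feat"))
node_ids <- as.character(feat[, 1])

## attach the first feature column as a vertex attribute, matched on node id
idx <- match(V(g)$name, node_ids)
V(g)$feat1 <- feat[idx, 2]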
For a shiny app in a repository containing a single static data file, what is the optimal format for that flat file (and the corresponding function to read it) that minimises the time needed to read it into a data.frame?
For example, suppose that when a shiny app starts it reads an .RDS file, that this takes ~30 seconds, and that we wish to decrease that. Are there any ways of saving the file, and functions for reading it, that can save time?
Here's what I know already:
I have been reading some speed-comparison articles, but none seem to comprehensively benchmark all methods in the context of a shiny app (including possible cores/threading implications). Some offer sound advice, such as trying to load less data.
I notice that languages like Julia can sometimes be faster, but I'm not sure whether reading a file using another language would help, since it would have to be converted to an object R recognises, and presumably that process would take longer than simply reading it as an R object in the first place.
I have noticed that identical data seems to produce smaller files when saved as .RDS compared to .csv; however, I'm not sure whether file size necessarily has an effect on read time.
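To frame the question, here is a rough benchmarking sketch of a few options I have seen mentioned elsewhere (uncompressed RDS, fst, qs); the data and file names are made up:

library(microbenchmark)

df <- data.frame(x = rnorm(1e6), y = sample(letters, 1e6, replace = TRUE))

saveRDS(df, "dat.rds")                            # default (compressed) RDS
saveRDS(df, "dat_nocomp.rds", compress = FALSE)   # uncompressed RDS
fst::write_fst(df, "dat.fst")
qs::qsave(df, "dat.qs")

microbenchmark(
  rds        = readRDS("dat.rds"),
  rds_nocomp = readRDS("dat_nocomp.rds"),
  fst        = fst::read_fst("dat.fst"),
  qs         = qs::qread("dat.qs"),
  times = 10
)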
I work in an environment where we depend heavily on Excel for statistical work. We have our own Excel workbooks that create reports and charts and compute the models. But sometimes Excel is not enough, so we would like to use R to augment the data processing.
I am developing a fairly universal, low-level Excel workbook that can convert our data structures stored in an Excel workbook into R using rcom and RExcel macros. Because the data are large, porting them into R is lengthy (in terms of the time a user needs to wait after pressing F9 to recalculate the workbook), so I started adding caching capabilities to my Excel workbook.
Caching is achieved by attaching an extra attribute to the saved object(s): a function that checks whether the mtime of the Excel workbook holding the data structure has changed since the R object was created. Additionally, the template supports saving the objects to disk, so the next time around it is not mandatory to open the workbook and the original Excel data structures at all when doing calculations that mostly involve R.
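A minimal sketch of that attribute-based check (names and paths are placeholders):

make_cached <- function(obj, workbook_path) {
  mtime_at_save <- file.mtime(workbook_path)
  ## the validator is a closure, so it remembers the path and the mtime
  attr(obj, "is_current") <- function() {
    file.mtime(workbook_path) == mtime_at_save
  }
  obj
}

dat <- make_cached(dat, "our_data.xlsx")
saveRDS(dat, "dat_cache.rds")

## later: reuse the cache only if the workbook has not been touched
dat <- readRDS("dat_cache.rds")
if (!isTRUE(attr(dat, "is_current")())) {
  ## re-import from the Excel workbook here
}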
Although in most cases the user wouldn't care, internally it is sometimes more natural to save the data into one R object (like a data.frame), and sometimes saving a whole set of multiple R objects seems more intuitive.
When saving a single R object, saveRDS is more convenient, so I prefer it over save, which works with multiple objects. (I know that I can always turn multiple objects into one by combining them in a list.)
According to the manual for saveRDS, the file generated by save has its first 5 bytes equal to the ASCII representation of RDX2\n. Is there a ready-made function to test for that, or should I manually open the file as binary, read the 5 bytes, handle the corner case where the file doesn't even have 5 bytes, close the file, and so on?
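For illustration, a minimal sketch of that manual check (the exact magic string should be double-checked against ?save for your R version; gzfile is used so that compressed save files are read transparently):

is_save_file <- function(path) {
  con <- gzfile(path, "rb")             # handles both compressed and plain files
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 5)
  if (length(magic) < 5) return(FALSE)  # fewer than 5 bytes: not a save() file
  identical(rawToChar(magic), "RDX2\n")
}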
Identifying a potential bug here. When calling writeRaster with overwrite=TRUE, the values in the written file remain unchanged. I originally wrote the wrong raster object, then corrected the code and wrote a new raster to the same file name. The values in the attribute table of the written file are the same as the original, even though the raster object I am writing has the correct attributes when viewed in R.
The workaround was to give the new raster a different file name (or to delete the old one manually).
R 3.0.0, Windows 7 64-bit
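For what it's worth, a sketch of the workaround using the raster package (file names are placeholders; the point is simply to write to a fresh name, or remove the old file first, rather than relying on overwrite = TRUE while another program may have the file open):

library(raster)

r <- raster(nrows = 10, ncols = 10)
values(r) <- runif(ncell(r))

## instead of writeRaster(r, "output.tif", overwrite = TRUE) on a possibly locked file:
if (file.exists("output.tif")) file.remove("output.tif")
writeRaster(r, "output_v2.tif", format = "GTiff")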
Apologies to Brian, with whom I share our modeling workstation. This was my post.
Josh O'Brien - looks like you were right; something was locking the file against writing. I think ArcCatalog was locking it up.
This tool has performed as expected many times since this incident.
I ran into the same issue. I can confirm that if you have ArcMap open, writeRaster with overwrite=TRUE doesn't work, and it fails without any warning message. Hope this helps other R users managing raster files.
So I've been trying to read this particular .mat file into R. I don't know much about Matlab, but I know enough to know that the R.matlab package can only read uncompressed data into R, and that to get an uncompressed file I need to save it as such in Matlab using
save new.mat -v6.
Okay, so I did that, but when I used readMat("new.mat") in R, it just got stuck loading forever. I also tried the hdf5 package via:
> hdf5load("new.mat", load=FALSE)->g
Error in hdf5load("new.mat", load = FALSE) :
can't handle hdf type 201331051
I'm not sure what this problem could be, but if anyone wants to try to figure this out, the file is located at http://dibernardo.tigem.it/MANTRA/MANTRA_online/Matlab_Code%26Data.html and is called inventory.mat (the first file).
Thanks for your help!
This particular file has one object, inventory, which is a struct with a lot of different things inside it. Some are cell arrays, others are vectors of doubles or logicals, and a couple are matrices of doubles. It looks like R.matlab does not like cell arrays within structs, but I'm not sure exactly what's preventing R from loading this. For reasons like this, I'd generally recommend avoiding mapping Matlab structs to R objects. A struct is similar to a list, and this one can be transformed into a list, but that's not always a good idea.
I recommend pulling each field out into its own object in Matlab, e.g. ids = inventory.instance_ids, and then saving each object to a separate .mat file, or saving all of them, except the inventory struct itself, into one file. Even better is to go to text, e.g. via csvwrite, so that you can see what's being created.
I realize that sidesteps the use of a Matlab-to-R reader, but having things in a common, universal format is much more useful for reproducibility than acquiring a bunch of different readers for a proprietary format.
Alternatively, you can pass objects in memory via R.matlab, or this set of functions + the R/DCOM interface (on Windows).
Although this doesn't address how to use R.matlab, I've done a lot of transferring of data between R and Matlab, in both directions, and I find that it's best to avoid .mat files (and, similarly, .rdat files). I like to pass objects in memory, so that I can inspect them on each side, or via standard text files. Dealing with application-specific file formats, especially ones that change quite a bit and are inefficient (I'm looking at you, MathWorks), is not a good use of time. I appreciate the folks who work on readers, but having much more control over the data structures used in the target language is well worth the space overhead of a simple output file format. In-memory data transfer is very nice because you can interface the programs, but it may be a distraction if your only goal is to move data.
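As a hedged illustration of the in-memory route, using R.matlab's Matlab server interface (this needs a local MATLAB installation and the server setup described in the R.matlab documentation; the variable names are placeholders):

library(R.matlab)

Matlab$startServer()          # starts MATLAB running the MatlabServer script
matlab <- Matlab()
isOpen <- open(matlab)

evaluate(matlab, "load inventory.mat; ids = inventory.instance_ids;")
ids <- getVariable(matlab, "ids")

close(matlab)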
Have you run the examples in http://cran.r-project.org/web/packages/R.matlab/R.matlab.pdf on pages 22 to 24? They test your ability to read MAT files of versions 4 and 5. I'm not sure that R cannot read compressed files: there is an Rcompression package on Omegahat.