Can I recreate a jpeg from a vector with class "raw"?

I've been able to use an API to download a file to R... sort of. I download the file and receive a vector of class "raw". What I would like to do is write that vector out to a file in a way that recreates the file as it was uploaded to my site.
In the sample vector below, I have a jpeg that I would like to save to a file. However, the solution needs to be more arbitrary, as pretty much any type of file could be downloaded via the API.
The vector I'm working with is rather large, so I'm linking to it.
I suppose you could use whatever file name you want, but the file name as downloaded from the site can be accessed using
gsub("\"", "", attributes(file)$'Content-Type'[2])
My initial thought was to convert the raw vector to bits using rawToBits, but I can't seem to recreate the jpeg from there. Any tips or suggestions?

Figured it out by trial and error.
writeBin(as.vector(file),
         gsub("\"", "", attributes(file)$'Content-Type'[2]),
         useBytes = TRUE)
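For what it's worth, here is a minimal sketch of the whole round trip, assuming the file is fetched with httr (the URL and output file name are hypothetical). Because writeBin() writes the raw bytes verbatim, this works for any file type, not just jpegs:

library(httr)

# Fetch the file from the (hypothetical) API endpoint
resp <- GET("https://example.com/api/files/123")

# as = "raw" returns the response body as a raw vector, untouched
raw_bytes <- content(resp, as = "raw")

# Write the bytes straight to disk; no decoding or conversion needed
writeBin(raw_bytes, "downloaded_file.jpg")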

Related

Why does R raster::writeRaster() generate a pic which can't be shown in Win10?

I read my hyperspectral (.raw) file and combined three bands into "gai_out_r". Then I wrote it out as follows:
writeRaster(gai_out_r, filepath, format="GTiff")
and finally got gai_out_r.tif.
But why can't Win10 display this small tif, when it can display the one I export the same way from ENVI (save image as -> tif)?
(Screenshots of the two tiffs as displayed by Win10 followed here.)
Default Windows image-viewing applications don't support hyperspectral images. Since you are just reading and combining 3 bands from your .raw file, the resulting image will be a hyperspectral image. You need separate, dedicated software to view hypercubes, or you can view it using spectral-python.
In SPy, using envi.save_image will save it as an ENVI-type file only. To save it as an RGB image file (readable in Windows) we need to use other methods.
You are using writeRaster to write to a GTiff (GeoTiff) format file. To write to a standard tif file you can use the tiff method. With writeRaster you could also write to a PNG instead
writeRaster(gai_out_r, "gai.png")
Cause of the issue:
I had a similar issue and recognised that the exported .tif files had a different bit depth than .tif images I could open. The images could not be displayed by common applications, although they were not broken; I could still open them in R or QGIS. Hence, the values were encoded in a way Windows does not expect.
When you type ?writeRaster you will find that there are various options when it comes to saving a .tif (or another format) using the raster::writeRaster() function. Follow the links therein to the dataType {raster} help page and you'll find there are various integer types to choose from.
Solution (write a Windows-readable GeoTIFF):
I set the following options to make the resulting .tif file readable (note the datatype option):
writeRaster(raster, filename = "/path/to/your/output.tif",
            format = "GTiff", datatype = "INT1U")
Note:
I realise your post is from two and a half years ago... Anyway, may this answer help others who encounter this problem.

Import information from .doc files into R

I've got a folder full of .doc files and I want to read them all into R to create a data frame with the filename as one column and the content as another column (which would include all the text from the .doc file).
Is this even possible? If so, could you provide me with an overview of how to go about doing this?
I tried starting out by converting all the files to .txt format with readtext(), using the following code:
DATA_DIR <- system.file("C:/Users/MyFiles/Desktop")
readtext(paste0(DATA_DIR, "/files/*.doc"))
I also tried:
setwd("C:/Users/My Files/Desktop")
I couldn't get either to work (the output from R was Error in list_files(file, ignore_missing, TRUE, verbosity) : File '' does not exist.), but I'm not sure if this step is even necessary for what I want to do.
Sorry that this is quite vague; I guess I want to know first and foremost if what I want to do can be done. Many thanks!
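For what it's worth, the error message points at system.file(): that function looks up files bundled inside installed R packages and returns "" for any other path, so the glob ends up being built from an empty string (hence File '' does not exist). A minimal sketch using a plain path instead, assuming readtext can read your .doc files (for the old binary .doc format it relies on antiword) and that they sit in a files subfolder of the Desktop:

library(readtext)

# A plain path string; system.file() is only for files shipped with packages
DATA_DIR <- "C:/Users/MyFiles/Desktop"

# readtext() returns a data frame with a doc_id column (the file name)
# and a text column (the full document content)
docs <- readtext(paste0(DATA_DIR, "/files/*.doc"))
str(docs)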

Importing to R an Excel file saved as web-page

I would like to open an Excel file saved as a web page using R, and I keep getting error messages.
The desired steps are:
1) Upload the file into RStudio
2) Change the format into a data frame / tibble
3) Save the file as an xls
The message I get when I open the file in Excel is that the file format (excel webpage format) and extension format (xls) differ. I have tried the steps in this answer, but to no avail. I would be grateful for any help!
I don't expect anybody will be able to give you a definitive answer without a link to the actual file. The complication is that many services will write files as .xls or .xlsx without them being valid Excel format. This is done because Excel is so common and some non-technical people feel more confident working with Excel files than a csv file. Now, the files will have been stored in a format that Excel can deal with (hence your warning message), but R's libraries are more strict and don't see the actual file type they were expecting, so they fail.
That said, the below steps worked for me when I last encountered this problem. A service was outputting .xls files which were actually just HTML tables saved with an .xls file extension.
1) Download the file to work with it locally. You can script this of course, e.g. with download.file(), but this step helps eliminate other errors involved in working directly with a webpage or connection.
2) Load the full file with readHTMLTable() from the XML package
library(XML)
dTemp = readHTMLTable([filename], stringsAsFactors = FALSE)
This will return a list of dataframes. Your result set will quite likely be the second element or later (see ?readHTMLTable for an example with explanation). You will probably need to experiment here and explore the list structure as it may have nested lists.
3) Extract the relevant list element with double brackets (single brackets would give you a one-element list rather than a data frame), e.g.
df <- dTemp[[2]]
You also mention writing out the final data frame as an xls file which suggests you want the old-style format. I would suggest the package WriteXLS for this purpose.
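Putting the three steps together, a minimal sketch under those assumptions (the URL is hypothetical, the relevant table position will vary, and WriteXLS needs a working Perl installation):

library(XML)
library(WriteXLS)

# Step 1: download a local copy (mode = "wb" preserves the bytes as-is)
myurl <- "http://example.com/report.xls"
download.file(url = myurl, destfile = "localcopy.xls", mode = "wb")

# Step 2: parse every HTML table in the file into a list of data frames
dTemp <- readHTMLTable("localcopy.xls", stringsAsFactors = FALSE)
str(dTemp, max.level = 1)   # explore the list to find the table you want

# Step 3: extract the relevant element and write a real .xls file
df <- dTemp[[2]]            # position 2 is just an example
WriteXLS("df", ExcelFileName = "converted.xls")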
I seriously doubt the Excel file is 'saved as a web page'. I'm pretty sure the file just sits on a server and all you have to do is go fetch it. Some kinds of files (in particular Excel and h5) are binary rather than text files. This needs an added setting to tell R that it is a binary file so it is handled appropriately.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
Or, to use the downloader package, try something like this:
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")

PDF File Import R

I have multiple .pdf files (stored in a local folder) that contain text. I would like to import the .pdf files (i.e., the texts) into R. I applied the function read_dir from the textreadr package:
library ("textreadr")
Data <- read_dir("<MY PATH>")
The function works well, BUT for several files whose names include special characters (i.e., letters such as 'ć'; e.g., 'filenameć.pdf'), the function did not work (error message: 'The following files failed to read in and were removed:' …).
What can I do?
I tried to rename the files via R, which did not work (probably for the same reason), so that might be a workaround.
I did not want to rename the files manually :)
Follow-Up (only for experts):
For several files, I got one of the following error messages (and I have no idea why):
PDF error: Mismatch between font type and embedded font file
or
PDF error: Couldn't find trailer dictionary
Any suggestions or hints how to solve this issue?
Likely the issue concerns the encoding of the file names. If you absolutely want to use R to rename the files for you, the function you want is iconv: determine the encoding of the file names and then convert them to UTF-8.
However, a much better approach would be renaming them with bash from the command line. Can you provide a more complete set of examples?
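A minimal sketch of the R renaming route, assuming the current names are UTF-8 and that transliterating to plain ASCII (e.g. 'ć' -> 'c') is acceptable; the folder path is hypothetical:

# Hypothetical folder holding the PDFs
pdf_dir <- "C:/Users/MyFiles/pdfs"
old_names <- list.files(pdf_dir, pattern = "\\.pdf$", full.names = TRUE)

# //TRANSLIT asks iconv for the closest ASCII equivalent of each character
# (support depends on the platform's iconv implementation)
new_names <- iconv(old_names, from = "UTF-8", to = "ASCII//TRANSLIT")

# Rename only the files whose names actually changed
changed <- !is.na(new_names) & old_names != new_names
file.rename(old_names[changed], new_names[changed])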

library twitteR issues when reading an external file

I have no internet connection and I just want to write a program using the twitteR library in R. For that purpose I have downloaded the file rdmTweets.RData, which is supposed to hold a collection of tweets. The file is available at: http://www.rdatamining.com/data
I have tried to read that file using:
rdmTweets <- userTimeline("rdmTweets.RData", n=200)
and also by converting it directly to a data frame:
df <- do.call("rbind", lapply("rdmTweets.RData", as.data.frame))
but with no results at all; it does not show any information about the tweets. I also tried to read it like a text file with:
rdm <- file("rdmTweets.RData", "r")
lines <- readLines(rdm)
again with no results. It seems the only way I can access those tweets is with:
rdmTweets <- userTimeline("rdatamining", n=200)
but that requires an active internet connection. So my question is: how can I read that file so that I get its contents as if I had used userTimeline?
Thanks
To read an RData file, you need to use load().
Run the code below. The first line loads a data object rdmTweets from the file, and the second converts it into a data frame. You also need to have the twitteR package installed.
load("rdmTweets.RData")
df <- do.call("rbind", lapply(rdmTweets, as.data.frame))
