I need to import a folder with thousands of files. They are not csv, xlsx, or txt files; Windows just lists their type as "File". I have tried multiple ways to import them, including running R as an administrator.
I have tried several permutations of this code, using read.csv, read.delim, etc., but I am unable to import the files.
baseball <- read.csv('C:/Users/nfisc/Desktop/Spring 2021/CIS 576/Homework/Homework 5/rec.sport.baseball', stringsAsFactors = F)
Any help would be appreciated.
Can you be more specific about what these files are? Do they open in Excel? What kind of data do they represent?
Without an extension, it may be difficult to import.
If you right-click the files and go to "properties", what does it say under "Type of File"?
I know that this is late, but I figured out how to import the text data. Part of the original issue was that the file was a rar archive, not a zip file.
# Import text data: make a VCorpus from a specified directory
baseball_messages <- tm::VCorpus(tm::DirSource("C:/Users/nfisc/Desktop/Spring 2021/CIS 576/Homework/Homework 5/usebaseball"))
hockey_messages <- tm::VCorpus(tm::DirSource("C:/Users/nfisc/Desktop/Spring 2021/CIS 576/Homework/Homework 5/usehockey"))
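To sanity-check that the import worked, the resulting corpora can be inspected like this (object names taken from the code above):

```r
library(tm)

length(baseball_messages)              # number of documents read from the directory
as.character(baseball_messages[[1]])   # raw text of the first message
```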
library("readxl")
my_data <- read_excel("GVA.xlsx")
my_data
However, the console says the path does not exist.
How can I import Excel/CSV files and be sure that the file will always be found?
Why does this not work?
P.S. I am new to R.
Well, is 'GVA.xlsx' in your working directory? If not, R can't find it, because a bare filename is only looked up in the current working directory. You can navigate to the file by clicking on: File > Import Dataset > From Excel. Browse to your file and set any required import options. R will actually generate the code that maps to the file in question.
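A couple of quick checks along those lines (the file name is from the question; the folder path is hypothetical):

```r
library(readxl)

getwd()                  # where R is currently looking for files
file.exists("GVA.xlsx")  # FALSE means it is not in that directory

# Either give the full path explicitly ...
my_data <- read_excel("C:/Users/me/Documents/GVA.xlsx")

# ... or point the working directory at the folder first
setwd("C:/Users/me/Documents")
my_data <- read_excel("GVA.xlsx")
```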
I've got a folder full of .doc files and I want to merge them all into R to create a dataframe with filename as one column and content as another column (which would include all content from the .doc file).
Is this even possible? If so, could you provide me with an overview of how to go about doing this?
I tried starting out by converting all the files to .txt format with readtext(), using the following code:
DATA_DIR <- system.file("C:/Users/MyFiles/Desktop")
readtext(paste0(DATA_DIR, "/files/*.doc"))
I also tried:
setwd("C:/Users/My Files/Desktop")
I couldn't get either to work (the output from R was Error in list_files(file, ignore_missing, TRUE, verbosity) : File '' does not exist.), but I'm not sure whether this step is even necessary for what I want to do.
Sorry that this is quite vague; I guess I want to know first and foremost if what I want to do can be done. Many thanks!
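One note on the error above: system.file() is meant for locating files bundled inside an installed package, and it returns "" for any other path, which is why the error message shows File ''. A minimal sketch, assuming the files live in the folder from the question, is to point readtext() at the directory directly:

```r
library(readtext)

# Point readtext() at the .doc files directly rather than via system.file().
# readtext returns a data frame with a doc_id column (the filename) and a
# text column (the file's content) -- the two-column shape described above.
docs <- readtext("C:/Users/MyFiles/Desktop/files/*.doc")
head(docs)
```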
I would like to open an Excel file saved as webpage using R and I keep getting error messages.
The desired steps are:
1) Upload the file into RStudio
2) Change the format into a data frame / tibble
3) Save the file as an xls
The message I get when I open the file in Excel is that the file format (excel webpage format) and extension format (xls) differ. I have tried the steps in this answer, but to no avail. I would be grateful for any help!
I don't expect anybody will be able to give you a definitive answer without a link to the actual file. The complication is that many services will write files as .xls or .xlsx without them being valid Excel format. This is done because Excel is so common and some non-technical people feel more confident working with Excel files than a csv file. Now, the files will have been stored in a format that Excel can deal with (hence your warning message), but R's libraries are more strict and don't see the actual file type they were expecting, so they fail.
That said, the below steps worked for me when I last encountered this problem. A service was outputting .xls files which were actually just HTML tables saved with an .xls file extension.
1) Download the file to work with it locally. You can script this of course, e.g. with download.file(), but this step helps eliminate other errors involved in working directly with a webpage or connection.
2) Load the full file with readHTMLTable() from the XML package
library(XML)
dTemp <- readHTMLTable([filename], stringsAsFactors = FALSE)
This will return a list of dataframes. Your result set will quite likely be the second element or later (see ?readHTMLTable for an example with explanation). You will probably need to experiment here and explore the list structure as it may have nested lists.
3) Extract the relevant list element, e.g.
df <- dTemp[[2]]  # double brackets extract the data frame itself; dTemp[2] would return a length-1 list
You also mention writing out the final data frame as an xls file which suggests you want the old-style format. I would suggest the package WriteXLS for this purpose.
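A minimal sketch of that last step (output filename is hypothetical; WriteXLS requires a Perl installation):

```r
library(WriteXLS)

# Write the extracted data frame out as an old-style .xls workbook
WriteXLS(df, ExcelFileName = "result.xls")
```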
I seriously doubt the Excel file is 'saved as a web page'. I'm pretty sure the file just sits on a server and all you have to do is go fetch it. Some kinds of files (in particular Excel and h5) are binary rather than text files. Downloading these needs an added setting to warn R that the file is binary and should be handled appropriately.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
Or, to use the downloader package, try something like this:
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")
I'm trying to import the csv files from my working directory. There are 3 such files, but for some reason R insists on recognizing only one of them. I can't determine what the pattern is, and if the recognized file is moved out of the folder then nothing is recognized. Here is my code:
files = list.files(pattern="*\\.csv$")
Each of the files is definitely a csv file, which I confirmed by inspecting the "Type" column in the Windows folder navigator. To be safe, I also saved a copy as CSV and still had the same problem.
Is there an aspect to this I'm unaware of?
Thanks!
The issue turned out to be that the file extension for the file that worked was ".csv" and for the ones that didn't was ".CSV". I do not know how or why something like that can happen, but the pattern parameter of the list.files function is case sensitive.
Using the parameter setting ignore.case = TRUE solved this issue.
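Putting the fix together, a call that picks up the extension in any combination of case (.csv, .CSV, .Csv, ...) looks like this:

```r
# pattern is a regular expression, so "\\.csv$" anchors to the extension;
# ignore.case = TRUE makes the match case-insensitive.
files <- list.files(pattern = "\\.csv$", ignore.case = TRUE)
```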
I have a link that, when opened, automatically starts a download of a csv file. I need to intercept this csv in R and download it directly, without giving the user the chance to choose where to save the file.
The csv is generated on demand, and I need to get it.
#ntrax, 'download.file' from the 'utils' package will do the trick.
i.e.
download.file(url,destfile)
For the many options that are available, check ?download.file.
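A short sketch with a hypothetical URL: download.file() writes the response straight to destfile, so the user is never prompted for a save location.

```r
# On-demand CSV endpoint (hypothetical URL)
url <- "https://example.com/export/data.csv"

# mode = "wb" downloads the bytes as-is, then read the local copy
download.file(url, destfile = "data.csv", mode = "wb")
dat <- read.csv("data.csv")
```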
You can just use read.table to import the csv file as a data frame, for example:
fpe <- read.table("http://data.princeton.edu/wws509/datasets/effort.dat")
Is this what you mean?