Extract table data from PDF that is formatted as picture [closed] - r

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am trying to extract the data in the tables that start on p. 52 of this document (a report from the FAA).
The problem is that the tables are included as pictures. Any chance I can get some pointers on how to do that without doing it manually?
I have tried converting it to text using Adobe's OCR function, and I have also tried using the extract_tables function in R's tabulizer package.
I could of course do this manually, but it would be good to know if there is a more efficient way of doing it.

It's possible, but the accuracy depends on the image. I always use grayscale images. Here is an example of available tools. In your case, I'd suggest taking some screenshots of the tables and using OCRFeeder to compare the results from GOCR and Tesseract.
sudo apt-get install gocr tesseract-ocr ocrfeeder
ocrfeeder -i image.jpg
After some manual checks, you can import the resulting file into LibreOffice Calc, save it as CSV, and import that into R.
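The final import step could look like the sketch below. The file name, column names, and sample values are hypothetical; a small stand-in file is written first so the example runs on its own. Reading every column as text and converting afterwards is one way to spot cells where the OCR left a stray character, which are then worth rechecking against the PDF.

```r
# Hypothetical stand-in for the CSV saved from LibreOffice Calc.
csv_path <- file.path(tempdir(), "ocr_table.csv")
writeLines(c("year,count,rate",
             "2001,120,1.5",
             "2002,1O4,1.3",   # simulated OCR slip: letter O instead of zero
             "2003,98,1.1"),
           csv_path)

# Read everything as text first so OCR slips are not silently coerced.
raw <- read.csv(csv_path, colClasses = "character")

# Convert each column to numeric; cells that fail to parse become NA.
tbl <- as.data.frame(lapply(raw, function(col) suppressWarnings(as.numeric(col))))

# Row/column positions of the cells that still need a manual check.
bad_cells <- which(is.na(tbl), arr.ind = TRUE)
print(raw[bad_cells[, "row"], , drop = FALSE])
```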

Related

Is there an R online compiler that enables to run code line by line? [closed]

Closed 9 months ago.
I'm looking for an online R compiler, such as myCompiler, that lets me run code line by line,
like what one gets in RStudio when pressing the Run button or Ctrl+Enter.
Some Jupyter variants let you run a single line or a selection: Google Colab ( https://colab.to/r for a new R notebook ) and Kaggle ( https://www.kaggle.com/docs/notebooks ), for example, both use Ctrl+Shift+Enter.
A slightly different take would be RStudio in Binder.
Example - http://mybinder.org/v2/gh/binder-examples/r/master?urlpath=rstudio
More details - https://github.com/binder-examples/r
Have you tried RStudio Cloud? https://rstudio.cloud/

Is it possible to create an Excel file with R? [closed]

Closed 1 year ago.
General question: is there an R package out there that creates an Excel file? Or saves data frames as excel files? Or is it only possible to write files that already exist in a specific directory? If it is possible to create new Excel files, is there also a possibility to create multiple Excel sheets in that file?
Thank you for answering!
I believe the package openxlsx is the most popular package to do this.
Example:
library(openxlsx)
write.xlsx(iris, file = "writeXLSX1.xlsx")
And yes, you can also add multiple sheets. See a nice introduction here.
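As a minimal sketch of the multi-sheet case: write.xlsx() also accepts a named list of data frames, and each element becomes its own worksheet. The output path and sheet names below are hypothetical, and the call is guarded in case openxlsx is not installed.

```r
# A named list of data frames: each element becomes one worksheet.
sheets <- list(iris = head(iris), cars = head(mtcars))

# Hypothetical output path in the session's temporary directory.
out <- file.path(tempdir(), "two_sheets.xlsx")

if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Creates the file from scratch; it does not need to exist beforehand.
  openxlsx::write.xlsx(sheets, file = out)
  # List the worksheet names found in the file that was just written.
  print(openxlsx::getSheetNames(out))
}
```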

R Package for publication ready summary tables in descriptive statistics with an export option to LaTeX and other Editors? [closed]

Closed 3 years ago.
1) Descriptive statistics summary tables
I want to build my descriptive statistics summary tables in RStudio, with options to export them for use in LaTeX and other editors (Word, Excel, PowerPoint, LibreOffice).
2) Layout options
Moreover, it would be great to have many layout options.
3) (Medical) journals
Furthermore, is there an option to convert the tables with a single command into the design of different (medical) journals?
I know this is a big request, but I have been searching Google for many days without solving the problem. The first two points are the most important to me right now, but I hope the third is possible too.
Thank you for your efforts and your helpful advice.
A quick answer:
1) & 2) You should take a look at the knitr, kableExtra, rmarkdown and Hmisc R packages. officer can also handle export to MS Office.
3) If your papers are written in LaTeX, you should be able to use the LaTeX class provided by your journal, and not have to worry about the final appearance of your tables.
As far as I know (I'm not really an rmarkdown or knitr user, and not at all an RStudio user), Org-mode for Emacs allows more control over the final output and richer interaction with LaTeX.
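To make points 1) and 2) concrete, here is a minimal sketch that builds a small descriptive table in base R and renders it as LaTeX with knitr::kable(). The choice of statistics, the helper name stat, and the use of the built-in iris data are assumptions for illustration only; the export call is guarded in case knitr is not installed.

```r
# Numeric columns of a built-in data set, used only for illustration.
num <- iris[sapply(iris, is.numeric)]

# Apply one summary function to every column and return a plain numeric vector.
stat <- function(f) vapply(num, f, numeric(1), USE.NAMES = FALSE)

# One row per variable, one column per statistic.
desc <- data.frame(
  variable = names(num),
  mean     = stat(mean),
  sd       = stat(sd),
  min      = stat(min),
  max      = stat(max)
)

# kable() renders the data frame as a LaTeX tabular environment.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(desc, format = "latex", booktabs = TRUE, digits = 2))
}
```

Switching format to "html" or "pipe" gives HTML or Markdown output from the same data frame, and kableExtra can then be layered on top for further styling.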

Glicko-2 implementation in R, where to find? [closed]

Closed 3 years ago.
I am looking for an R implementation of Mark Glickman's excellent Glicko-2 algorithm. Thus far I have found this one. Although it is a very nice piece of code, I am specifically looking for code that can deal with large data frames of match scores (meaning that it can rank all the players in the data frame in one go), a bit like the way the PlayerRatings package does it for e.g. Elo and Glicko. Unfortunately, this package doesn't have an implementation of the Glicko-2 algorithm.
Does anyone have an idea?
Glicko-2 and a few other algorithms are available in the R package sport. It handles two-player and multi-player matchups and is available on CRAN and GitHub. A vignette is included, the syntax is standardized across algorithms, and the computation is backed by C++.
Quick snippet:
# install.packages("sport")
library(sport)
glicko2 <- glicko2_run(formula = rank|id ~ rider, data = gpheats)
# computation results
print(glicko2)
summary(glicko2)
tail(glicko2$r)
tail(glicko2$pairs)
If you had noticed the fine print at the bottom of Mark Glickman's page, you would have seen (in tiny text, admittedly):
PlayerRatings, an R package implementation of Glicko, as well as a
few other rating systems
with the link being: https://cran.r-project.org/web/packages/PlayerRatings/

How to interact with R from VBA? [closed]

Closed 7 years ago.
Suppose I have a VBA macro in Excel that does some calculation, and I would like to do part of this calculation in R, programmatically. Say that at some point the Excel macro has a vector and needs to find its mean using R's mean function. How can I call R from VBA, pass a vector to R, run the calculation in R, and get the result back in VBA? Thanks.
There is a plugin, RExcel, but I found it quite horrible to use (and you kind of have to pay for it).
The easiest and most general but hacky way to perform your interaction is the following:
1) Save your array/matrix/vector as a CSV file in a folder.
2) Write your R code in a file that reads the CSV and writes the result to another CSV.
3) Call the R script from VBA with the VBA Shell function (Rscript scriptName.R).
4) Import the result back into Excel/VBA.
This method has the advantage of separating the computational logic from the formatting done in VBA.
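The R side of step 2 could be sketched as follows, for the mean-of-a-vector case from the question. The function name run_mean, the file layout (one value per line, no header), and the temporary file paths are assumptions for illustration; in practice the paths would be whatever folder the VBA macro writes to and reads from.

```r
# Read a vector from a CSV file, compute its mean, and write the result to a CSV file.
run_mean <- function(in_path, out_path) {
  x <- read.csv(in_path, header = FALSE)[[1]]   # first column holds the vector
  m <- mean(x)
  write.csv(data.frame(mean = m), out_path, row.names = FALSE)
  invisible(m)
}

# Stand-ins for the files VBA would write (step 1) and read back (step 4).
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
writeLines(c("1", "2", "3", "4"), in_path)

run_mean(in_path, out_path)

# What VBA would find when it imports the result file.
result <- read.csv(out_path)$mean
print(result)

# In scriptName.R itself, the two paths could instead be hard-coded or taken
# from commandArgs(trailingOnly = TRUE) if VBA passes them on the command line.
```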
You could also run the R code directly from VBA using R's -e option, but this is strongly discouraged.
Hope it helps!
BTW: the same approach works with other programs (Python/LaTeX/Matlab).
