I am trying to use pdf2image, but I am getting this error:
PDFPageCountError: Unable to get page count.
I/O Error: Couldn't open file 'C:\Users\user_name\Desktop\folder_name\folder2_name\folder3_name\007-084841-1 to 31 Dec'22': No error.
It is confusing, as it doesn't give an actual error; it just says 'No error'.
My code is:
from pdf2image import convert_from_path
from PIL import Image
import pytesseract
import os

doc = convert_from_path("C:\\Users\\user_name\\Desktop\\folder_name\\folder2_name\\folder3_name\\007-084841-1 to 31 Dec'22")
path, fileName = os.path.split("C:\\Users\\user_name\\Desktop\\folder_name\\folder2_name\\folder3_name\\007-084841-1 to 31 Dec'22")
fileBaseName, fileExtension = os.path.splitext(fileName)
for page_number, page_data in enumerate(doc):
    txt = pytesseract.image_to_string(Image.fromarray(page_data)).encode("utf-8")
    print("Page # {} - {}".format(str(page_number), txt))
Can anyone help me please?
I don't know what to try, as the error message just says Unable to open...: No error.
I am trying to use the Scopus API for the first time. I have the API key and the institution token, but I am still getting an error when I try to use it in R on my Mac. Here is my code:
library(rscopus)
set_api_key(MY_KEY)
hdr=inst_token_header(MY_TOKEN)
key=get_api_key()
print(rscopus::get_api_key(), reveal=TRUE)
have_api_key()
auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE)
The error message is:
> library(rscopus)
>
> set_api_key(MY_KEY)
> hdr=inst_token_header(MY_TOKEN)
> key=get_api_key()
> print(rscopus::get_api_key(), reveal=TRUE)
[1] "MY_KEY"
> have_api_key()
[1] TRUE
>
> if (have_api_key()) {
+ auth = elsevier_authenticate(api_key=key)
+ }
HTTP specified is: https://api.elsevier.com/authenticate
Warning message:
In elsevier_authenticate(api_key = key) : Forbidden (HTTP 403).
> auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE)
$`service-error`
$`service-error`$status
$`service-error`$status$statusCode
[1] "AUTHENTICATION_ERROR"
$`service-error`$status$statusText
[1] "Invalid API Key: valid apikey credentials required."
Error in get_complete_author_info(...) : Service Error
I tried
if (have_api_key()) {
auth = elsevier_authenticate(api_key=key)
}
but I get the error:
HTTP specified is: https://api.elsevier.com/authenticate
Warning message:
In elsevier_authenticate(api_key = key) : Forbidden (HTTP 403).
I have tried using auth_token_header(MY_TOKEN) instead of inst_token_header(MY_TOKEN), but the code is still not working.
I have also taken the following steps in my terminal:
export Elsevier_API=MY_KEY > ~/.bash_profile
source ~/.bash_profile
I am still getting the error. However, the combination of key and institution token works here: https://dev.elsevier.com/scopus.html
Can anyone please help me debug this issue?
Thank You!
I figured out the issue. The correct way to query is to pass the headers argument as well:
auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE, headers=hdr)
And now the code will work! :)
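For completeness, a minimal end-to-end sketch of the working flow (MY_KEY and MY_TOKEN are placeholders, exactly as in the question above):

library(rscopus)

set_api_key(MY_KEY)                 # register the API key with rscopus
hdr <- inst_token_header(MY_TOKEN)  # build the institution-token header

# Pass the header along with every query that needs the institution token
auth_info <- process_author_name(last_name = "Muschelli",
                                 first_name = "John",
                                 verbose = FALSE,
                                 headers = hdr)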
I'm using dada2 version 1.22.0 on Windows 10. I have a list of compressed (.gz) FASTQ files, and when I use the function filterAndTrim I get this error message:
Error in add(bin) : record does not start with '#'
But when I test whether I can read the .gz file with library(ShortRead):
library(ShortRead)
fn <- "path/to/example.fastq.gz"
reads <- readFastq(fn)
it doesn't give any error message.
I don't understand why the function filterAndTrim gives the error message
Error in add(bin) : record does not start with '#'
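For reference, my filterAndTrim call looks roughly like this (the file paths and filtering parameters here are hypothetical examples, not my exact values):

library(dada2)

fnFs   <- list.files("path/to", pattern = "fastq.gz", full.names = TRUE)
filtFs <- file.path("path/to/filtered", basename(fnFs))

# multithread = FALSE because multithreading is not supported on Windows
out <- filterAndTrim(fnFs, filtFs, truncLen = 240, maxN = 0,
                     maxEE = 2, compress = TRUE, multithread = FALSE)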
Do you have any solution?
I'm using the R package httr to get an HTTP response for a specific link.
When trying to parse the content of the response I'm getting the error:
Fehler in parse(text = script_content) : <text>:1:10: Unerwartete(s) '['
1: {"lines":[
Translated to English it says something like this (sorry for my error messages being in German):
Error in parse(text = script_content) : <text>:1:10: Unexpected '['
1: {"lines":[
It seems as if there is a problem with the format/encoding of the text. Here is my code:
script <-
GET(
url = "https://my_url.which_origin_is_not_important/my_script.R",
authenticate(username, pass)
)
script_content <- content(script, as = "text", encoding = "ISO-8859-1")
parsed_content <- parse(text = script_content)
The value of script_content looks like this:
"{\"lines\":[{\"text\":\"################## FUNCTION ##################\"},{\"text\":\"\"},{\"text\":\"library(log4r)\"}],\"start\":0,\"size\":32,\"isLastPage\":true,\"limit\":500,\"nextPageStart\":null}"
Some more background on this operation: I'm trying to source code that is currently inside a private repository. I wrote the code I'm trying to source myself, and I made sure that the issue is not coming from within the code.
I got the solution from: Sourcing R files in a private github folder
Thanks for any advice!!
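For what it's worth, since script_content is the JSON shown above, one sketch for recovering parseable code is to decode the JSON first and reassemble the lines. This uses jsonlite and assumes the payload always has that {"lines":[{"text":...}]} shape; it is not necessarily the linked post's solution:

library(jsonlite)

# script_content is the JSON string shown above
payload   <- fromJSON(script_content)
code_text <- paste(payload$lines$text, collapse = "\n")

# The reassembled source now parses as R code
parsed_content <- parse(text = code_text)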
Everything was working great up until last Tuesday. I ran it again this weekend and today, and I am getting this error:
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: '\037‹\b'
install.packages("devtools")
devtools::install_github(repo = "maksimhorowitz/nflscrapR")
library(nflscrapR)
pbp_2019 <- scrape_season_play_by_play(2019, weeks = 9)
I expected to get the data as always, but the error above always pops up.
Any ideas?
I just redownloaded nflscrapR. I had to add 'force = TRUE' to the 'install_github()' command to get it to actually redownload.
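That is, something along these lines; force = TRUE tells devtools to reinstall the package even though it is already installed:

install.packages("devtools")
devtools::install_github(repo = "maksimhorowitz/nflscrapR", force = TRUE)
library(nflscrapR)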
I want to parse a page with the simple-html-dom library.
I tested with other links and it works, but with this URL it's not working.
My code is:
$bnadatos = file_get_html("http://www.rofex.com.ar/cem/FyO.aspx");
foreach ($bnadatos->find('[#id="ctl00_ContentPlaceHolder1_gvFyO"]') as $i) {
    echo "datos:";
    echo $i->innertext;
}
The response is a blank page.
What's wrong?
I solved it with:
$arrContextOptions = array(
    "ssl" => array(
        "verify_peer" => false,
        "verify_peer_name" => false,
    ),
);

$response = file_get_html("https://www.rofex.com.ar/cem/FyO.aspx", false, stream_context_create($arrContextOptions));
foreach ($response->find('[#id="ctl00_gvwDDF"]/tbody/tr[2]/td[2]') as $i) {
    echo $i->innertext;
}
Thank you @maio290 for lighting the way.
This is just a guess, but do you have your error reporting on?
Out of the box, this is not working with the simple-html-dom library:
Warning: file_get_contents(): SSL operation failed with code 1. OpenSSL Error messages: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed in /var/www/html/dom.php on line 83
Warning: file_get_contents(): Failed to enable crypto in /var/www/html/dom.php on line 83
Warning: file_get_contents(http://www.rofex.com.ar/cem/FyO.aspx): failed to open stream: operation failed in /var/www/html/dom.php on line 83
Fatal error: Call to a member function find() on boolean in /var/www/html/test.php on line 11
A fix for this can be found here. With that in place, I still get a blank page, which is due to a redirect response (301 Moved Permanently). To fix this, you need to modify
'follow_location' => false
to
'follow_location' => true
So now we get the proper site content. You can change the selector to $html->find('#ctl00_ContentPlaceHolder1_gvFyO'); this will find all elements with the id ctl00_ContentPlaceHolder1_gvFyO - see the documentation as a reference.
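Putting the two pieces together, a minimal sketch of the whole fix might look like this (the SSL options come from the self-answer above, follow_location handles the 301 redirect, and the selector assumes the table id from the question):

$arrContextOptions = array(
    "ssl" => array(
        "verify_peer" => false,      // skip certificate verification, as in the workaround above
        "verify_peer_name" => false,
    ),
    "http" => array(
        "follow_location" => true,   // follow the 301 Moved Permanently redirect
    ),
);

$html = file_get_html("https://www.rofex.com.ar/cem/FyO.aspx", false, stream_context_create($arrContextOptions));
foreach ($html->find('#ctl00_ContentPlaceHolder1_gvFyO') as $el) {
    echo $el->innertext;
}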