I am trying to download protein sequences from UniProt with the following code:
import time
from selenium import webdriver

driver = webdriver.Chrome(driver_location)
# get website
driver.get('https://www.uniprot.org/uniprotkb/P19515/entry#sequences')
# stall to load webpage
time.sleep(5)
# scroll webpage
# driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
# create instance of button and click
button = driver.find_element_by_link_text("Copy sequence")
button.click()
Running this block of code returns the following error:
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"link text","selector":"Copy sequence"}
Additionally, here is the CSS layout: [screenshot not available]
I assume the problem is that the button is either dynamic or hidden in some way, so that the webdriver cannot locate it. I know there is a UniProt API and probably other, more efficient ways to download protein sequences, but for the sake of learning: how can I modify my code, and why isn't the button clickable?
Link text only works when the element is a hyperlink (an a tag). It won't work in this case because the target is a button that copies the text to the clipboard and has no href attribute.
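However, you can modify your code to locate the button by CSS selector instead:
from selenium.webdriver.common.by import By
button = driver.find_element(By.CSS_SELECTOR, "button.button.primary.tertiary")
The existing button.click() call will then perform the click operation.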
link_text only works for a tags, not button tags. But if you use the SeleniumBase framework on GitHub, there's a special TAG:contains("TEXT") selector that you can use to click the button.
Here's a working script: (After installing seleniumbase using pip install seleniumbase, run the script with python or pytest)
from seleniumbase import BaseCase

class RecorderTest(BaseCase):
    def test_recording(self):
        self.open("https://www.uniprot.org/uniprotkb/P19515/entry#sequences")
        self.click('button:contains("Copy sequence")')
        self.sleep(3)

if __name__ == "__main__":
    from pytest import main
    main([__file__])
You can use the SeleniumBase Recorder to generate a script like that from manual browser actions. The last part with if __name__ == "__main__": lets you run the script with python instead of just pytest.
Is it possible to interact with the RStudio application using R code run inside it? I mean interactions like opening a file in a tab, creating a tab for a new unsaved file, or changing some text inside an open tab.
I know something very similar can be achieved by simply creating a new text file or changing its content with R, but that doesn't interact in any way with the RStudio app itself.
Some context: I was thinking of a tool that could automate inserting reprexes / snippets of code. It could work as a line of code that, when run from a file, replaces itself with a block of code, or opens a new unsaved file tab and puts some code inside it. And yes, I know something very similar can be achieved in other ways (e.g. simply copying the intended code block to the clipboard), but I'm curious and exploring the possibilities.
Thanks to the link provided by Konrad Rudolph I managed to find the answer myself.
There is a package called rstudioapi, built into RStudio, that provides a wide range of functionality and doesn't require plugins or addins.
All the features can be found in the official manual.
Opening a new unsaved file tab with some code in it can be obtained by running:
rstudioapi::documentNew(
  "example code here",
  type = "r",
  position = rstudioapi::document_position(0, 0),
  execute = FALSE
)
Inserting code is easily done with insertText(text = ""), which inserts text at the current position of the text cursor.
Changing one line into another can be done by combining two functions. getActiveDocumentContext() returns, among other things, the document content and the selection range, which is essentially the cursor position. modifyRange() then applies the change to a range derived from the cursor position and/or the document content.
That opens up many possibilities, for example smoother automation of the workflow.
I am running Python notebooks on a Jupyter Notebook server. I am using the Python logging module, currently wired up to log to stdout like any console application, to display logging messages in the Jupyter Notebook output.
But the default stdout based logging output feels limited. There is so much more you can do with HTML output over plain text/ANSI output.
Are there any advanced Jupyter Notebook logging handlers and formatters that would understand that the output is HTML and adjust accordingly? E.g. offer richer formatting options with colors and font sizes, and let one interactively explore logging message context parameters the way Sentry does?
Never thought to try this before, but yes you can do this using IPython.display.display and a custom logging.Handler which calls it:
import logging
from IPython.display import display, HTML

class DisplayHandler(logging.Handler):
    def emit(self, record):
        message = self.format(record)
        display(message)
This could be used to display anything the notebook can display, including HTML, Markdown, images, and audio (if you want your notebook to read your logs to you).
I combined this with a custom logging.Formatter that outputs an HTML object for passing to display(). It's not pretty but you can take the basic concept and improve it, or combine both classes into a single NotebookHTMLHandler class or something like that:
class HTMLFormatter(logging.Formatter):
    level_colors = {
        logging.DEBUG: 'lightblue',
        logging.INFO: 'dodgerblue',
        logging.WARNING: 'goldenrod',
        logging.ERROR: 'crimson',
        logging.CRITICAL: 'firebrick'
    }

    def __init__(self):
        super().__init__(
            '<span style="font-weight: bold; color: green">{asctime}</span> '
            '[<span style="font-weight: bold; color: {levelcolor}">{levelname}</span>] '
            '{message}',
            style='{'
        )

    def format(self, record):
        record.levelcolor = self.level_colors.get(record.levelno, 'black')
        return HTML(super().format(record))
Put all together you can use it like this:
log = logging.getLogger()
handler = DisplayHandler()
handler.setFormatter(HTMLFormatter())
log.addHandler(handler)
log.setLevel(logging.DEBUG)
Here's an example of how it looks:
For the HTML version you could tidy this up further, if the HTML grows complicated enough, by combining with an HTML templating engine like Jinja, or add JavaScript/Widgets to make log messages expandable to show more of the log record context, per your idea. A full solution I think would be beyond the scope of this answer.
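As a rough sketch of the "single class" idea mentioned above: the handler below builds the HTML string itself, escapes the message text, and hands the markup to a display callable. The class name NotebookHTMLHandler and the display_fn parameter are hypothetical names chosen for illustration, not an existing API. Inside a notebook the default callable wraps IPython.display.display and HTML; it is injectable here so the class can also be tried outside Jupyter.

```python
import html
import logging


class NotebookHTMLHandler(logging.Handler):
    """Hypothetical handler that renders each record as a small HTML snippet."""

    level_colors = {
        logging.DEBUG: "lightblue",
        logging.INFO: "dodgerblue",
        logging.WARNING: "goldenrod",
        logging.ERROR: "crimson",
        logging.CRITICAL: "firebrick",
    }

    def __init__(self, display_fn=None, level=logging.NOTSET):
        super().__init__(level)
        if display_fn is None:
            # Assumption: we are running inside Jupyter, so IPython is importable.
            from IPython.display import HTML, display

            def display_fn(markup):
                display(HTML(markup))

        self.display_fn = display_fn

    def emit(self, record):
        try:
            color = self.level_colors.get(record.levelno, "black")
            # Escape the message so "<" or "&" in log text cannot break the markup.
            message = html.escape(record.getMessage())
            markup = (
                f'[<span style="font-weight: bold; color: {color}">'
                f"{record.levelname}</span>] {message}"
            )
            self.display_fn(markup)
        except Exception:
            self.handleError(record)


# Usage outside a notebook: collect the generated markup in a list.
rendered = []
demo_log = logging.getLogger("html_demo")
demo_log.propagate = False
demo_log.setLevel(logging.DEBUG)
demo_log.addHandler(NotebookHTMLHandler(display_fn=rendered.append))
demo_log.warning("disk usage at %d%% <check>", 91)
```

In a notebook you would create the handler without arguments, so that each record is rendered through IPython's display machinery.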
Assuming you know what type of formatted output you wish to print, you can use the IPython.display module (importing the same classes from IPython.core.display is deprecated in recent IPython versions).
For example, to print HTML-formatted output you could do something like this:
from IPython.display import HTML
HTML('link')
To print Markdown-formatted output you could do:
from IPython.display import Markdown
Markdown('# This will be an H1 title')
I'm not sure what exactly you mean by "exploring context parameters", so maybe an example would clear things up.
In our largest ML modeling pipeline notebook we need to delete a single input (code) cell (containing sensitive information which we cannot pass via other means when automating its execution).
The cell has been created (injected) by papermill.execute_notebook() executed in another notebook (controller) and has been auto-tagged with injected-parameters tag.
The solution (possibly not the only one?) is deleting the cell as soon as it gets executed.
If searching for a tag makes it extra difficult, then a solution that just deletes the previous input cell programmatically would do.
What did not work
Hiding the input cell is not good enough, as it would still get saved to disk (this includes the report_only option in papermill's execute_notebook()). Likewise, "converting" to HTML with nbconvert (which does allow selecting cells for removal on the basis of their tags, as in this solution) would still preserve the original notebook with the encoded password inside.
I'm assuming you want to remove the cell because it contains sensitive information (a password).
My first advice would be not to pass sensitive information in plain text. A (slightly) better and simple option would be to store the password in an environment variable, and read it from the notebook using os.environ.
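A minimal sketch of the environment-variable approach, assuming the secret is exported under a name such as DB_PASSWORD (a hypothetical name; the helper read_secret is also made up for illustration). The variable is set inside the snippet only so that it runs on its own.

```python
import os


def read_secret(name, environ=os.environ):
    """Return the secret stored in the environment variable `name`."""
    value = environ.get(name)
    if not value:
        # Fail early with a clear message rather than passing None around.
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value


# Demo only: in real use the variable is set outside the notebook,
# e.g. by the shell or scheduler that launches the controller.
os.environ["DB_PASSWORD"] = "example-only"
password = read_secret("DB_PASSWORD")
```

With this pattern the password never appears in a notebook cell, so there is nothing to strip from the saved file.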
Here's how to remove the cell:
Note: I wrote this code on the fly and didn't test it, might need small edits.
import nbformat

# as_version is a required argument; 4 is the current notebook format
nb = nbformat.read('/path/to/your/notebook.ipynb', as_version=4)
index = None
# find index for the cell with the injected params
for i, c in enumerate(nb.cells):
    cell_tags = c.metadata.get('tags')
    if cell_tags and 'injected-parameters' in cell_tags:
        index = i
# remove cell
if index is not None:
    nb.cells.pop(index)
# save modified notebook
nbformat.write(nb, '/path/to/your/notebook.ipynb')
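The same idea can be sketched without nbformat by treating the notebook as the plain JSON dictionary it is on disk. This is an illustrative variant rather than the answer's method: the function name strip_tagged_cells is hypothetical, file reading and writing are left out, and unlike the loop above it drops every cell carrying the tag, not only the last match.

```python
import json


def strip_tagged_cells(notebook, tag="injected-parameters"):
    """Return a copy of a notebook dict without the cells carrying `tag`."""
    kept = [
        cell
        for cell in notebook.get("cells", [])
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return {**notebook, "cells": kept}


# A tiny in-memory notebook standing in for the papermill output file.
sample = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "x = 1"},
        {"cell_type": "code", "metadata": {"tags": ["injected-parameters"]}, "source": "pw = '...'"},
        {"cell_type": "code", "metadata": {}, "source": "print(x)"},
    ],
}
cleaned = strip_tagged_cells(sample)
print(json.dumps([cell["source"] for cell in cleaned["cells"]]))
```

The input dictionary is left untouched, which makes it easy to compare the notebook before and after stripping.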
The answer involves nbformat and already exists on this site, but it was given to a question asked in such technical language that I think it is worth restating that question in plain English so that others can discover it (I duly upvoted the other answer).
def perform_post_exec_cleanup(output_nb_name, tag_to_del='injected-parameters'):
    import json
    from traitlets.config import Config
    from nbconvert import NotebookExporter
    import nbformat

    c = Config()
    c.TagRemovePreprocessor.enabled = True  # to enable the preprocessor
    c.TagRemovePreprocessor.remove_cell_tags = [tag_to_del]
    c.preprocessors = ['TagRemovePreprocessor']  # previously: c.NotebookExporter.preprocessors
    nb_body, resources = NotebookExporter(config=c).from_filename(output_nb_name)
    nbformat.write(nbformat.from_dict(json.loads(nb_body)), output_nb_name, 4)
Caveats
It is normally possible to do such notebook conversions / cell stripping in place, in the same notebook in which the stripping code is run. This is NOT the case with papermill: it will NOT work from within the output notebook while its code execution is controlled by papermill's execute_notebook() function. The stripping code has to be run in the external (controller) notebook, after the function has finished or its execution has been interrupted. Because the output notebook is saved to disk incrementally during the process, you need to run the above stripping code unconditionally, even if the papermill function failed, to make sure the injected-parameters cell does not get saved permanently. So put it in the finally section of a try-except-finally block.
[ based on: Run preprocessor using nbconvert as a library ]
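The try-except-finally advice from the caveats can be sketched as follows. The helper run_with_cleanup and the two stand-in functions are hypothetical; in the controller notebook the first callable would wrap papermill's execute_notebook() call and the second would call perform_post_exec_cleanup() on the output notebook.

```python
def run_with_cleanup(execute, cleanup):
    """Run `execute()` and always call `cleanup()` afterwards."""
    try:
        return execute()
    finally:
        # Runs on success, on failure, and on KeyboardInterrupt alike.
        cleanup()


# Stand-ins for the papermill execution and the cell-stripping step.
calls = []


def failing_execution():
    calls.append("execute")
    raise RuntimeError("notebook execution failed")


def fake_cleanup():
    calls.append("cleanup")


try:
    run_with_cleanup(failing_execution, fake_cleanup)
except RuntimeError:
    calls.append("error propagated")
```

The cleanup step runs before the error reaches the caller, so the stripped notebook is already on disk by the time the failure is handled.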
I don't know if it's possible but I would like to execute Python code inside my Jupyter notebook when opening it.
I know that I can go to Cell >> Run All, but what I am looking for is a way to automatically do it.
1) If you have nbextensions installed, you can designate "initialization cells", which run when the notebook is loaded. You define them using the cell -> cell type options from the menu.
2) Javascript:
from IPython.display import display, Javascript
display(Javascript("Jupyter.notebook.execute_cells_below()"))
3) For full autonomy from cells you can always import your Python code, replacing cell boundaries with function calls.
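Option 3 could look roughly like this: each former cell becomes a function, and one entry point runs them in order. The function names are placeholders and the data is made up; in practice the code would live in a module (say, a hypothetical my_pipeline.py) that the notebook imports.

```python
def load_values():
    """Would be the first cell: produce the input data."""
    return [3, 1, 2]


def transform(values):
    """Would be the second cell: process the data."""
    return sorted(v * 10 for v in values)


def main():
    """Run the former cells in order and return the final result."""
    return transform(load_values())


if __name__ == "__main__":
    print(main())
```

The notebook then needs only a single cell that imports the module and calls main(), and the same file can also be run directly with python.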
In Jupyter you run the code cell by cell; the shortcut to run a cell is Shift+Enter.
If you want to run all of the code at once, you can copy it into Spyder and run it there.
I have a problem using the Shift+Tab shortcut to get more information about the package or command I am typing in a cell. I installed Jupyter Notebook via Anaconda very recently, and I am using Python 3.7 on Ubuntu 18.04.
Do you know how to fix this problem? I googled a lot but could not find a solution.
Many thanks.
Let's say you wrote the code below and are trying to get the signature/documentation of the function read_csv() with Shift+Tab (it may not work sometimes).
Code:
import pandas as pd
pd.read_csv()
-> First, type only the code below:
pd.read_csv?  ## Execute this code with Shift+Enter
-> Now when you type pd.read_csv and press Shift+Tab, it will show you the signature/documentation. This is just a workaround.
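If the tooltip still refuses to appear, the same information can be pulled up programmatically with the standard inspect module. This is a sketch of an alternative rather than a fix for the shortcut itself; json.dumps stands in for pd.read_csv here only so that the snippet needs no third-party package.

```python
import inspect
import json

# json.dumps stands in for pd.read_csv so the example runs without pandas.
signature = inspect.signature(json.dumps)
doc = inspect.getdoc(json.dumps) or ""
first_doc_line = doc.splitlines()[0] if doc else ""

print(f"dumps{signature}")
print(first_doc_line)
```

As with Shift+Tab, the object has to exist in the running kernel, so the import must have been executed first.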
Follow two steps:
Step 1: Run that cell first (shift + enter)
Step 2: After running the cell, press the shift + tab.
It worked for me. I hope it will work for you too :)
In the Google Colab environment, I fixed it as follows:
Tools | Settings | Editor | uncheck "Automatically trigger code inspection".
Then Tab and Shift+Tab worked as expected.
I also faced a similar problem. Can you confirm that you imported the library in the Jupyter notebook before calling one of its methods?
What I observed is that if the library hasn't been imported into the notebook, the documentation doesn't show with Shift+Tab either. Once I imported it, the shortcut worked and showed the documentation.
Scenario 1:
import numpy as np #Pressed Enter for next line
a=np.random.randint #Shift + Tab not working
Scenario 2:
import numpy as np #Shift + Enter
a=np.random.randint #Shift + Tab working
Just run the cell containing your code.
Then place your cursor inside the parentheses and press Shift+Tab.
Press Shift+Tab+Tab for more info.
On Google Colab:
after running the cell, click on the function,
move the cursor away, and then
hover the mouse over the function again; this causes the documentation to pop up.
Note: in Colab, go to Tools | Settings | Editor and check "Automatically trigger code inspection" (the first setting).
Re-run the line that imports the libraries. Once your modules are loaded, you should be able to see the command documentation.