Fill input tag of an html in python - web-scraping

I'm trying to scrape a website that has a field you can type into (a navigation bar of some sort).
Whenever I type something there, it shows a dropdown of items related to what I typed (items that contain the text I typed).
What I'm trying to do is use requests.post from the Python requests library to fill in a value there and then grab whatever the dropdown shows.
I've run into a few problems:
The dropdown disappears whenever you click somewhere else on the page, so the HTML tags for the list only exist temporarily.
I couldn't find a way to actually post something into the navigation bar.
A good example I've found is FUTWIZ, which does exactly what I described above. When I inspect the page with F12, I can see it generating some HTML for the dropdown. Is there a way to grab that HTML after the value has been entered in the navigation bar?
EDIT
This is the code I've tried:
import requests
from bs4 import BeautifulSoup
url = "https://www.futwiz.com/en/"
# Fetch the static page and parse it
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
# Attempt to send the search text to the same URL as form data
poster = requests.post(url, data={"form-control": "Messi"})
print(poster.text)
Now, I know the data argument of requests.post only sends the value along with the request rather than typing it into the field, but I can't figure out how to fill in the field in the header.
This is the link to FUTWIZ; its navigation bar is the thing I'm trying to work with:
https://www.futwiz.com/en/
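A possible direction, offered as a sketch rather than a confirmed fix: dropdowns like this are commonly filled in by JavaScript that sends a background request while you type, so posting form data to the page URL will not produce them. One option is to watch the browser's Network tab while typing, find that request, and reproduce it directly. The endpoint URL, the term parameter, and the JSON shape below are all hypothetical placeholders, not FUTWIZ's actual API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace it with the request you see in the
# browser's Network tab while typing. This URL is NOT taken from FUTWIZ.
SUGGEST_URL = "https://example.com/suggest"


def build_suggest_request(term):
    """Build (but do not send) a GET request like the one a search box's script might issue."""
    query = urlencode({"term": term})  # "term" is an assumed parameter name
    headers = {
        "User-Agent": "Mozilla/5.0",
        "X-Requested-With": "XMLHttpRequest",  # some sites check for this header
    }
    return Request(f"{SUGGEST_URL}?{query}", headers=headers)


def parse_suggestions(body):
    """Extract names from a JSON response; the [{"name": ...}] shape is assumed."""
    return [item["name"] for item in json.loads(body)]


req = build_suggest_request("Messi")
print(req.full_url)

# A made-up response body, standing in for what the server might return.
sample = '[{"name": "Lionel Messi"}]'
print(parse_suggestions(sample))
```

To actually send the request you could pass req to urllib.request.urlopen, or issue the same URL with requests.get. If no such background request shows up in the Network tab, a browser automation tool such as Selenium is the usual fallback.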

Related

Can't reach inner document in object tag in React

I'm trying to make a preview of a pdf document using only the object or iframe tag.
My goal is to remove the toolbar and the scrollbar and to position the file so that I get a good-looking image of the first page of my PDF. To do that, I need to apply CSS to parts of the inner document, such as "#outerContainer #mainContainer div.toolbar".
The problem is that I can't reach those elements and I don't understand why.
I have tried applying a new class directly to the component, and I have tried switching to the iframe tag, but nothing happened.
I also went through other topics like this one: Access <object> data with JavaScript, and I saw that my contentDocument is undefined.
Here is a link to a sandbox demonstrating my attempt to access the inner document:
https://codesandbox.io/s/keen-cloud-z9gtf?file=/src/PDFObject.tsx
Thanks in advance for your help!

webscraping java by hovering mouse. Dynamic data not displayed after scraping

I want to scrape data from a graph on a particular website.
The information in the graph is only available when you hover the mouse over it. After scraping, however, I can't see the data in the output, even though it is visible under 'Inspect Element'.
I have tried scraping with Jsoup, but the data that changes on mouse hover does not appear in the result.
How can I do this?
Below is the information I have to scrape. I need the dynamically changing value '184'.
The value 184 changes as you hover the mouse over the graph, and so do the RGB values displayed in the line above it.
After scraping, the document output by Jsoup looks like this:
The number 184 and the RGB values do not appear. Why are these fields missing from the output? Is it because the data is generated dynamically on mouse hover?
I actually have to scrape the 'Carbon Intensity' value from the graph "Carbon Intensity in the last 24 hours", which is displayed only when hovering the mouse over it.
I have been stuck on this problem for two days and have not found any helpful solution. I am using Jsoup on Linux. Could someone suggest how I can do this?
Thanks in advance!
To do that you should use Selenium; add it to Maven, or to whatever dependency manager you are using. Once you have done that, add the geckodriver executable (https://github.com/mozilla/geckodriver/releases) to your project folder to get Firefox support for Selenium. You can also use Google Chrome by following this tutorial (https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver).
There are a lot of tutorials on how to run a web page's JavaScript to get its content, but it could be something like this, to move the mouse over an element in the HTML:
WebDriver webDriver = new FirefoxDriver();
JavascriptExecutor js = (JavascriptExecutor) webDriver;
webDriver.get(URL); // place the URL you are crawling here
Actions action = new Actions(webDriver);
WebElement webElement = webDriver.findElement(By.id("country-emission-rect"));
// By offers many more ways to select HTML content; I guess you want to place the mouse over this element in particular, but you can change it if it's another one
action.moveToElement(webElement).perform();
// wait at most 15 seconds (newer Selenium versions take Duration.ofSeconds(15) here)
WebDriverWait webDriverWait = new WebDriverWait(webDriver, 15);
// wait until the element with class name "country-emission-intensity" is visible
webDriverWait.until(ExpectedConditions.visibilityOfElementLocated(By.className("country-emission-intensity")));
// get the HTML generated after the mouse-over, which now contains the text you want
String fullHtml = webDriver.getPageSource();
webDriver.quit();
If you want to keep using Jsoup instead of Selenium for the scraping itself, you can now do:
Document document = Jsoup.parse(fullHtml);
Remember to place the driver executable in your project folder and to install all the Selenium dependencies correctly (enabling auto-import if you are using Maven).
Hope it helped you! If you need anything else feel free to ask!
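To illustrate why the static parse comes back empty, here is a small Python sketch (Python standard library only, not the Jsoup API). It extracts the text of the element with class "country-emission-intensity" from two HTML strings: one as the server might send it, and one as it might look after the hover script has run. Both HTML strings are made-up stand-ins for the real page; only the class name comes from the answer above.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside the first element that carries a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_of_class(html, class_name):
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    return "".join(parser.chunks).strip()


# Assumed markup: what a static fetch might return versus the rendered page.
static_html = '<div><span class="country-emission-intensity"></span></div>'
rendered_html = '<div><span class="country-emission-intensity">184</span></div>'

print(repr(text_of_class(static_html, "country-emission-intensity")))
print(repr(text_of_class(rendered_html, "country-emission-intensity")))
```

The first call finds the element but no text, because in this assumed markup the value is only written by JavaScript. That is why the page source has to come from a real browser session before it is handed to a static parser.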

How to scrape data in a page with jquery button click using HtmlAgility pack

I am trying to scrape data from a page with repeated, similar content (a shopping website) using HtmlAgilityPack.
There is a button to load more items. Clicking it loads more items on the same page.
If it were a link (an <a> tag), I could get the next items from the URL in its href attribute and load a new page for them, so there would be no problem.
But here there is no new URL, and the items are loaded on the same page.
So is there any way to implement this? How can I trigger that "load more" button to get more items?
HtmlAgilityPack is only an HTML parser; it can only parse a static HTML document. What you want may be accomplished using Selenium WebDriver.
Another possibility: if the number of "load more" actions is small enough that you can complete the loading manually, do so and save the resulting HTML locally, and only afterwards use HtmlAgilityPack to parse the static HTML you stored (instead of parsing the HTTP response).
Share the link to the site you are talking about so I can add some code snippets as examples.
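The second approach (load everything by hand, save the page, then parse the saved file) can be sketched as follows. This is Python with the standard library rather than HtmlAgilityPack, and the file name, the markup, and the idea that each item is a link are all assumptions made for the illustration.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def links_in_saved_page(path):
    """Parse an HTML file saved from the browser and return its link targets."""
    parser = LinkCollector()
    parser.feed(Path(path).read_text(encoding="utf-8"))
    return parser.links


# Demo with a made-up saved page; in practice this would be the file saved
# from the browser after clicking "load more" until all items are shown.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "items.html"
    saved.write_text(
        '<ul><li><a href="/item/1">One</a></li>'
        '<li><a href="/item/2">Two</a></li></ul>',
        encoding="utf-8",
    )
    found = links_in_saved_page(saved)

print(found)
```

The point is only that the parser reads a local file that already contains every loaded item, so nothing needs to click the button at parse time.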

My website doesn't print correctly

I've created a local website to be used as a database where I work. It's made using Rails 3.2, with Twitter Bootstrap for most of the CSS. The problem is that if I print the view from my browser via File > Print, Tools > Print, or right-click > Print (depending on the browser), the print preview looks completely different from the actual browser page. For example, my "Index" view shows a nice table with 6-8 columns, but when I print, almost all of the information in the columns disappears and random code pops up in random places, including the URL, some SQL, and some folder paths where the links belong. I just want to be able to print the page as it looks in the browser (without having to take a screenshot every time).
As it turns out, I needed to set up a separate stylesheet specifically for printing and then link it in my HTML head with media="print".
As shown here: http://www.w3schools.com/css/css_mediatypes.asp

How to create a bookmarklet for creating a screen scraping?

How can I create a bookmarklet like this one: http://www.vimeo.com/1626505
I want to create something similar. Where do I start? I would like to understand the workflow of how this one works so that I can build my own.
Thanks
A bookmarklet is just a JavaScript program written on a single line, which replaces the usual location (http://www.somestuffhere.com) of a bookmark.
To build your own bookmarklet, I suggest using Firebug:
- type your code into Firebug and run it until it does what you want,
- then remove all the newlines so that you have one long single-line piece of code,
- create a new bookmark in your browser and, in the location field, write javascript: followed by your single line of code.
You can try a simple bookmarklet by typing this directly into your browser's location bar: javascript: alert('this is a very simple bookmarklet'); then press Enter to execute it.
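The "remove the newlines and prefix with javascript:" steps can be automated. Here is a minimal Python sketch, assuming every statement in the script ends with a semicolon and that it contains no // line comments (after joining, such a comment would swallow the rest of the code). The function name make_bookmarklet is just an illustrative choice.

```python
def make_bookmarklet(source):
    """Join a multi-line script into one line and prefix it with javascript:."""
    # Trim indentation and drop empty lines.
    lines = [line.strip() for line in source.splitlines()]
    one_line = " ".join(line for line in lines if line)
    return "javascript:" + one_line


script = """
var title = document.title;
alert('Page title: ' + title);
"""

print(make_bookmarklet(script))
# prints: javascript:var title = document.title; alert('Page title: ' + title);
```

The printed string is what goes into the bookmark's location field.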
Here is a handy bookmarklet builder I have sometimes used. It can squash many lines of JavaScript into one line that can be set as the 'target' of a bookmark
(there may very well be better ones out there, but it's done the job well for me)
