I have an idea and want to see whether it is possible to implement. I want to parse a website (copart.com) that shows a different, large list of cars each day, with a description for each car. Daily, I am tasked with going over each list (each containing hundreds of cars) and selecting every car that meets certain requirements (brand, year, etc.). I want to know whether it is possible to create a tool that would parse these lists automatically and, in doing so, select the cars that meet my criteria.
I was thinking of something like website scrapers such as ParseHub, but I am not trying to extract data. I simply want a tool that goes over a website and automatically clicks the "select" button on each car that meets my criteria. This would save me enormous amounts of time daily. Thanks.
I think you can use Selenium for this task. It opens a real web browser automatically, and you can locate the element by XPath and click the select button. I've done that before for a home utility website.
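For illustration, here is a minimal sketch using Selenium's Node bindings (the selenium-webdriver package); the URL, XPath expressions, and matching criteria are placeholders you would replace after inspecting the real page:

// A sketch only: the listing URL and XPaths below are assumptions, not
// copart.com's actual markup. Inspect the page to find the real locators.
import { Builder, By, until } from "selenium-webdriver";

async function selectMatchingCars(): Promise<void> {
  const driver = await new Builder().forBrowser("chrome").build();
  try {
    await driver.get("https://www.copart.com/todaysAuction"); // hypothetical listing page
    // Placeholder XPath for the listing rows.
    const rowLocator = By.xpath("//tr[contains(@class, 'lot-row')]");
    await driver.wait(until.elementsLocated(rowLocator), 10000);
    for (const row of await driver.findElements(rowLocator)) {
      const text = await row.getText();
      // Your own criteria go here (brand, year, etc.).
      if (text.includes("TOYOTA") && text.includes("2018")) {
        await row.findElement(By.xpath(".//button[contains(., 'Select')]")).click();
      }
    }
  } finally {
    await driver.quit();
  }
}

selectMatchingCars().catch(console.error);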
Scrapy is a good tool designed for this. Depending on how the web pages are rendered, you may or may not need an additional tool like Selenium. Submit or "select" buttons are often just links that can be followed with plain HTTP requests, without any browser-emulation tool. If you could post some sample HTML, we could give you more specifics.
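If the select buttons do turn out to be plain links or form posts, a sketch like this shows the idea of following them with bare HTTP requests; the endpoint and payload shape here are invented, and the real ones would come from watching the browser's network tab while clicking the button by hand:

// Sketch: if "select" is really just a link or form post, plain HTTP may do.
// The URL and payload below are assumptions for illustration only.
async function selectLot(lotId: string): Promise<void> {
  const response = await fetch("https://example.com/lots/select", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ lotId }),
  });
  if (!response.ok) {
    throw new Error(`Select failed for lot ${lotId}: HTTP ${response.status}`);
  }
}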
I am trying to pull pricing data from a website, but each time the page is loaded, the class names are regenerated into a different sequence of letters, and the price shows up as an empty element instead of a number. Is there a technique I can use to bypass this in any way? Thanks! Here are the lines of HTML as they appear when I inspect the element:
<div class="zlgJQq">$</div>
<div class="qFwqmC hkVukg2 njGalW"> </div>
Your help would be much appreciated!
Perhaps that website is actively discouraging you from scraping their data. That would explain the apparently random class names. You might want to read their terms of use to be sure that it's OK to scrape their site.
However, if the raw HTML does not contain the price data but it is visible when the page is rendered, then it's likely that JavaScript is being used to insert the prices after the page has loaded. You could try enabling the developer tools in your browser and monitoring the network activity while the page is loading. That might reveal that the site is using dynamic Ajax queries to populate the price data, and you could then write code to interact with the Ajax resource directly.
It's also possible that the price data is embedded somewhere in the HTML, possibly obfuscated, and then loaded dynamically by JavaScript.
Those are just a couple of suggestions. You will need to analyse the site to see whether automated scraping is feasible. If you can let us know which website you're dealing with, someone might be able to suggest something more specific.
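To make the network-monitoring suggestion concrete: if the developer tools do reveal a JSON endpoint behind the prices, a sketch like this could query it directly. The URL and response shape are pure assumptions, not the site's real API:

// Hypothetical: suppose the network tab shows prices arriving from a JSON endpoint.
interface PriceResponse {
  productId: string;
  price: number;
}

async function fetchPrice(productId: string): Promise<number> {
  const response = await fetch(`https://example.com/api/prices/${productId}`);
  if (!response.ok) {
    throw new Error(`Price request failed: HTTP ${response.status}`);
  }
  const data: PriceResponse = await response.json();
  return data.price;
}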
I am a non-programmer working for a church. We have no tech staff. Our website is based on a template that doesn't provide a widget for counting clicks. We'd like to add one (or preferably two) JPG images, each with a counter that tracks the number of times the image has been clicked, and display the cumulative total next to each image. Church members will go to the page and click each time they participate in one or both of two different church objectives.
Our web host says that to do this I must find, write, or purchase third-party code, delivered as an iframe, to embed into one of our pages.
I googled the issue and am only finding hit counters that track visitors to a page, rather than clicks on an image. We'd prefer two different JPGs to track the two objectives, but if having two counters on the same page is a problem, I can drop down to one.
Can anyone point me to where I could get code like this either for free, or for pay, and what it would cost?
There is a lot of good information here. They talk about the issue of the iframe receiving the click versus you recording it. If you keep reading, there is a possible way to work around it. Hope this helps!
Look here: Detect Click into Iframe using JavaScript
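The core trick from that question, sketched in TypeScript: you can't observe clicks inside a cross-origin iframe, but when one is clicked the parent window loses focus, so a blur handler plus document.activeElement tells you which iframe received the click.

// Sketch: detect a click into an iframe by watching the parent window lose focus.
window.addEventListener("blur", () => {
  const active = document.activeElement;
  if (active instanceof HTMLIFrameElement) {
    console.log("Click detected on iframe:", active.id);
    // Record the click here (e.g. POST to your counter endpoint). To detect
    // further clicks, focus would need to return to the parent page first.
  }
});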
I'm working on a survey in Qualtrics and need to make certain fields readonly (I'm currently doing this via $j("#QR\\~QID186\\#1\\~14\\~1\\~TEXT").attr("readonly", true); for any IDs I know).
I'm currently getting certain IDs by previewing the survey in my browser and inspecting the element; no problem (as pointed out in their documentation). This works for most fields, except those with a lot of display logic on them. The display logic takes previous answers into account and does calculations on them; it's a little opaque to me, as I don't know the content of the survey that well. So for these fields it's incredibly tedious just to get the field to display in the first place by manipulating my input data.
Their documentation doesn't explain whether it's even possible, so it may be a long shot, but is there actually a way for me to get the ID of an element from the "Edit Survey" section of Qualtrics, without having to preview it?
In the top-right corner of the Qualtrics screen, where your name appears, the drop-down lets you open Account Settings. Once in this menu, select "Qualtrics IDs" in the grey ribbon. From there, select the survey you want to examine, and a separate window will pop up with all the question IDs associated with that survey.
I am building a simple web application with a little business logic. It has a drop-down list with about 25,000 products from which the user can choose.
The application will probably be slow for users with slow internet connections (inside the company it's fine).
Is there a component (in Visual Studio) for this, or what is the best way to serve that many products to users?
I also tried an Ajax ComboBox, but in IE 8 the CPU couldn't keep up.
Is there a reason you need to display all 25,000 items at once? I imagine this would be a usability issue even if it worked flawlessly. With such a massive list, users must already have some idea of what they are choosing.
How about a simple text box that uses Ajax to drop down suggested results as the user types (similar to Google search)?
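A rough sketch of that idea, assuming a hypothetical /api/products endpoint that returns matching names as JSON, with the element IDs likewise invented:

// Sketch: debounce keystrokes, then ask the server for a short list of matches.
const input = document.getElementById("product-search") as HTMLInputElement;
const list = document.getElementById("suggestions") as HTMLUListElement;
let timer: number | undefined;

input.addEventListener("input", () => {
  window.clearTimeout(timer);
  timer = window.setTimeout(async () => {
    const response = await fetch(
      `/api/products?query=${encodeURIComponent(input.value)}&limit=10`
    );
    const products: string[] = await response.json();
    list.innerHTML = "";
    for (const name of products) {
      const item = document.createElement("li");
      item.textContent = name;
      list.appendChild(item);
    }
  }, 250); // wait 250 ms after the last keystroke before querying
});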
Edit
You could also break your items into multiple categories and have a drop-down list of categories. Once the user chooses a category, a second drop-down list can display all items in that category, or perhaps something that breaks the category down even further. Similar to: http://www.kbb.com/whats-my-car-worth/
I have this GUI that shows, let's say, Customer Orders. When my client nailed down the requirements, he asked me to keep pagination like this:
Show Items Per Page : 10/50/150
For each customer there could be thousands of orders, and each order has at least 50 attributes to show on screen. So assume a 50-column HTML table with 2,000 or 3,000 records, spanning multiple database tables (anyway, that's a different story).
Things were a breeze until yesterday; now my client has come up with a new change request, specifying the page sizes like this:
Show Items Per Page : 10/50/150/All
Yes, he wants to see 2,000 or 3,000 records just by selecting the "All" option. Internally, this is not a big change; I would go back and remove the filters I apply on row count, etc. But when it is loaded in the GUI it really suffers: the view state is huge, and so on.
I know this is a standard problem. How do you deal with it? I cannot convince my client to remove this "All" option; he is stuck on it. (The reason is simple: he has a big 42" screen on which he can easily see 1,000 items on one page.)
I also tried using JavaScript to build the DOM from an Ajax call, but inserting 2,000 rows' worth of TDs is still really slow.
Any help is greatly appreciated.
Some Extra Info
This application is an intranet application, or else accessed through a VPN connection.
This problem is about browser performance.
I suppose you can do two things.
1) You can use <div> elements instead of a <table> (this is possible with CSS), because browsers do not render a table until its closing tag is reached. The page will take just as long to load, but it will render the first results faster.
2) If you use Ajax+JSON and build every <tr> piece by piece, you can construct the whole thing off-DOM and only then put it into the document. That will be faster because the browser will not re-render every time you add another row.
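A minimal sketch of point 2, assuming the JSON is an array of row objects (the field names and element ID are invented):

// Build all rows off-DOM in a DocumentFragment, then insert them in one append.
interface OrderRow {
  id: number;
  customer: string;
  total: string;
}

function renderRows(rows: OrderRow[]): void {
  const tbody = document.getElementById("orders-body") as HTMLTableSectionElement;
  const fragment = document.createDocumentFragment(); // lives outside the live DOM
  for (const row of rows) {
    const tr = document.createElement("tr");
    for (const value of [String(row.id), row.customer, row.total]) {
      const td = document.createElement("td");
      td.textContent = value;
      tr.appendChild(td);
    }
    fragment.appendChild(tr);
  }
  tbody.appendChild(fragment); // one reflow instead of thousands
}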
If you want, you can load the data in installments of sorts. It's like pagination, but not quite pagination, to be precise. Label your installments/pages with proper IDs and load the pages one after another via Ajax calls. You can even show a progress bar indicating how much data has actually loaded. Append each chunk to the table displaying the data. I would not use server controls for this; you have to handle it via JavaScript or jQuery.
You might want to append table rows incrementally.
When the client scrolls close to the page bottom, fire an Ajax call, return the next page, and render it.
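A sketch of that scroll-triggered approach, assuming a hypothetical paged /orders endpoint that returns rows as arrays of cell strings:

// Sketch: when the user nears the bottom, fetch the next page and append it.
let nextPage = 0;
let loading = false;

window.addEventListener("scroll", async () => {
  const nearBottom =
    window.innerHeight + window.scrollY >= document.body.offsetHeight - 200;
  if (!nearBottom || loading) return;
  loading = true; // guard against firing several requests per scroll burst
  try {
    const response = await fetch(`/orders?page=${nextPage}&size=150`);
    const rows: string[][] = await response.json();
    const tbody = document.getElementById("orders-body") as HTMLTableSectionElement;
    const fragment = document.createDocumentFragment();
    for (const cells of rows) {
      const tr = document.createElement("tr");
      for (const cell of cells) {
        const td = document.createElement("td");
        td.textContent = cell;
        tr.appendChild(td);
      }
      fragment.appendChild(tr);
    }
    tbody.appendChild(fragment); // single insertion keeps reflow cost down
    nextPage++;
  } finally {
    loading = false;
  }
});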
But the best solution would be to convince your client that this is not how web applications work. We had a similar situation, and it was a pure nightmare.
Instead of an ASP.NET GridView, you'd be better off using a DataRepeater.
Better yet, if you are not constrained by technology, you can use Microsoft Ajax Preview 4 with WCF REST Services. You would just need to find some hacks to "stream" data from the service and display it.
Also, there is jQuery Grid (if you don't want to use Microsoft Ajax Preview 4), which supports JSON serialization.