Data missing while scraping website - web-scraping

I am trying to scrape a website (please refer to the URLs in the code).
I want to scrape all the information from the website and write the data to a JSON file.
scrapy shell http://www.narakkalkuries.com/intimation.html
To extract the information from the website:
response.xpath('//table[@class="MsoTableGrid"]/tr/td[1]//text()').re(r'[0-9,-/]+|[0-9]+')
I am able to retrieve most of the information from the website.
Concern:
I can scrape the data under "Intimation", except for 'Intimation For September 2017'; I am not able to scrape the information under this tab.
Finding:
For 'Intimation For September 2017', the value is stored in the span tag
/html/body/div[4]/div[2]/div/table/tbody/tr[32]/td[1]/table/tbody/tr[1]/td[1]/p/b/span
For the remaining months the values are stored in the font tag
/html/body/div[4]/div[2]/div/table/tbody/tr[35]/td[1]/table/tbody/tr[2]/td[1]/p/b/span/font
How to extract information for "Intimation For September 2017" ?

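Your tables use different @class values (MsoTableGrid and MsoNormalTable), so you need a selector that matches all of them, for example the shared width attribute:
for table in response.xpath('//table[@width="519"]'):
    # skip the first row of each table
    for row in table.xpath('./tr[position() > 1]'):
        for cell in row.xpath('./td'):
            # string(.) concatenates all descendant text, so it works whether
            # the value sits directly in <span> or in a nested <font>
            cell_value = cell.xpath('string(.)').extract_first()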
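The effect of taking the string value of each cell can be tried without Scrapy. The following is a minimal sketch using only the standard library; the SAMPLE markup is a hypothetical, well-formed stand-in for the real page (one value in a span, one in a span/font), and cell_values is an illustrative helper name, not part of Scrapy.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page's markup:
# one table keeps the value in <span>, the other in <span><font>.
SAMPLE = """
<html><body>
<table class="MsoTableGrid" width="519">
  <tr><td>Header</td></tr>
  <tr><td><p><b><span>12-34</span></b></p></td></tr>
</table>
<table class="MsoNormalTable" width="519">
  <tr><td>Header</td></tr>
  <tr><td><p><b><span><font>56-78</font></span></b></p></td></tr>
</table>
</body></html>
"""


def cell_values(markup):
    """Return the full text of every cell, skipping the first row of each table."""
    root = ET.fromstring(markup.strip())
    values = []
    # Match tables by the shared width attribute rather than by class.
    for table in root.iterfind(".//table[@width='519']"):
        for row in table.findall("tr")[1:]:
            for cell in row.findall("td"):
                # itertext() walks all descendant text nodes, which is the
                # ElementTree counterpart of XPath's string(.)
                values.append("".join(cell.itertext()).strip())
    return values


print(cell_values(SAMPLE))
```

Because the text is collected from all descendants of the cell, the extra font level around the other months' values makes no difference to the result.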


Accessing imported data in google appmaker

I have managed to create a data model and, in doing so, imported variables and values from my Google Sheets spreadsheet (by clicking on 'import data from sheet'). My table is of the form:
table = {"age": {15,22}, "name": {"ted", "sally"}, "surname":{"anderson","medina"}}
I would like to have a table that shows this data on a page, where I can click on each row to open another page and edit the contents.
I open a new page, drag a table in, and link it to my data model. However, it only shows the variable titles and not the actual data. How can I get the data to load into the table automatically?
I read a lot of the tutorials, but they all assume I want to start with only headers and then manually input the data to create a data entry table. However, my table updates automatically in Google Sheets, so I would like to import it into App Maker and then be able to click on each row and add notes/edits. Any help would be greatly appreciated! Thanks
At this time App Maker doesn't support spreadsheets as a data backend. You'll first need to import all the data into App Maker's models and then work with it in the deployed application. You can find all the pieces of the puzzle in the Vendor Ratings template:
https://developers.google.com/appmaker/templates/vendor-rating/
Your actual data won't come up in your editor view, only when you preview it.
It seems to me that you could make this using an embedded Google sheet in your page.
Ex:
-drag an html block on the page
-set the content to something like:
<iframe src="[link to your google sheet]" height="1380" width="1100"></iframe>
*get the embed link from 'Publish to the Web'
Hope that helps, it might not be what you're looking for.

Creating own Report in Odoo: t-field for date, Customer ID etc.?

First of all, I'm kinda new to Odoo and I'm trying to understand some basic logic. I created my own report based on the basic report of Odoo.
There are a lot of fields like t-field="o.date_invoice" or t-field="o.partner_id" etc. which work really fine, but where can I find all the available fields and functions? Is there any list?
For example, I need a field for the order date, one for the print date, and one for a customer ID.
With a t-field attribute you can access and print fields from the actual model or from a related model, for example with the following element you can print the content of the phone column (field) of the actual record:
<span t-if="o.phone"
t-field="o.phone" />
Explanation of t-field in the documentation:
The t-field directive can only be used when performing field access
(a.b) on a "smart" record (result of the browse method). It is able to
automatically format based on field type, and is integrated in the
website's rich text edition.
Check this link for further information if you want to build reports and this one, where you can read about some of the elements that you can use in Qweb
In addition, you can check here a list of some attributes that you can use in a Qweb template

How to call a web service in MS Excel 2013

Can anybody help me with the code to call a web service in MS Excel 2013, please? I was trying to use the WEBSERVICE function in Excel, but that is not working for me.
To learn how to use the Webservice function, we’ll do 2 things:
Use a =WEBSERVICE(url) function to get the data
Use the =FILTERXML(xml, xpath) function to extract a single piece of data from the XML string
Use a =WEBSERVICE(url) function to get the data
First, find a web service. For this example with weather updates, go to http://www.wunderground.com/weather/api to create your free account. Complete the form, then click Signup for API Key.
To set up your API Key, follow these steps:
Select either the Cumulus Plan or the Anvil Plan, whichever you prefer.
Choose whichever option you prefer for the History add-on. Either option will work for this example because we’re not using historical information.
Select Developer. Note: The other available options also will work for this example, but note that there is a fee associated with them.
Click Update Plan.
At the top of the page, click Documentation.
On the left navigation bar titled API Table of Contents, find the Data Features heading, then under that heading, click conditions. (You can also go to http://www.wunderground.com/weather/api/d/docs?d=data/conditions)
Scroll to the bottom of the page, then copy the URL shown in the box labeled Examples. (The URL format will look like this: http://api.wunderground.com/api/[APIKey]/conditions/q/CA/San_Francisco.json). The sample URL will include your unique API Key.
Now that you have a unique API Key, open your Excel spreadsheet and follow these steps to create the =WEBSERVICE(url) function for the current weather conditions:
In cell B5, enter =WEBSERVICE(url). Then replace url with the unique URL including your API Key that you copied a moment ago.
Add straight quotation marks to both sides of the URL. The format will look like this: "http://api.wunderground.com/api/[APIKey]/conditions/q/CA/San_Francisco.json"
Replace the state and city in the URL with a zip code, then change the .json ending to .xml. The formula in cell B5 should look like this: =WEBSERVICE("http://api.wunderground.com/api/[APIKey]/conditions/q/[ZipCode].xml") The [APIKey] will be your unique API Key, and the [ZipCode] will be for the location where you want weather updates.
Press Enter or Return. The formula will return an XML string from the web service.
You can also use cell references in the Webservice function to update URL parameters, such as your zip code. Here is how to set it up:
In cell B1, paste your API Key. In the Name Box, type APIkey to name the cell.
In cell B2, enter the zip code. In the Name Box, type ZipCode to name the cell.
Create your WEBSERVICE function with cell references. The formula should be in this format: =WEBSERVICE("http://api.wunderground.com/api/" & APIkey & "/conditions/q/" & ZipCode & ".xml")
Copy and paste the entire formula into cell B5.
Update your zip code and then you will see the update to your WEBSERVICE Function URL.
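The concatenation done by the formula with named cells can be mirrored outside Excel to check what URL it produces. This is only a sketch: build_conditions_url is a hypothetical helper name, and the key and zip code passed to it are placeholders, not working credentials.

```python
def build_conditions_url(api_key, zip_code):
    """Join the pieces the same way the named-cell WEBSERVICE formula does."""
    base = "http://api.wunderground.com/api/"
    return base + api_key + "/conditions/q/" + zip_code + ".xml"


# Placeholder values standing in for the APIkey and ZipCode cells.
print(build_conditions_url("MY_KEY", "94107"))
```

Changing either argument changes the URL in the same way that editing cell B1 or B2 changes the formula's request.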
Use the =FILTERXML(xml, xpath) function to extract single pieces of data from the XML string
Now that we have the information from the web service in the Excel spreadsheet, we need to extract the pieces of data we want out of the XML, including the name of the city and current temperature and current weather conditions. To extract the data, follow these steps:
In cell B8, enter the =FILTERXML(B5, "//full") function. This will give you the city name associated with the zip code.
In cell C8, enter =FILTERXML(B5, "//temp_f") to extract the current temperature in Fahrenheit.
In cell D8, enter =FILTERXML(B5, "//weather") to see the current weather condition, such as Light Rain.
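For readers who want to see what the three XPath lookups do, here is a rough Python equivalent using the standard library. The SAMPLE_XML string is a hypothetical response trimmed to the three elements used above (the real service returns far more), and filter_xml is an illustrative name, not an Excel or library function.

```python
import xml.etree.ElementTree as ET

# Hypothetical response, reduced to the elements the formulas read.
SAMPLE_XML = """<response>
  <current_observation>
    <display_location><full>San Francisco, CA</full></display_location>
    <weather>Light Rain</weather>
    <temp_f>58.3</temp_f>
  </current_observation>
</response>"""


def filter_xml(xml_text, tag):
    """Return the text of the first element named `tag` at any depth,
    roughly what FILTERXML(xml, "//tag") shows in a single cell."""
    node = ET.fromstring(xml_text).find(".//" + tag)
    return None if node is None else node.text


print(filter_xml(SAMPLE_XML, "full"))
print(filter_xml(SAMPLE_XML, "temp_f"))
print(filter_xml(SAMPLE_XML, "weather"))
```

Unlike Excel, this sketch always returns text, so a temperature would still need converting to a number before doing arithmetic on it.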
A note on refreshing data
Please note that WEBSERVICE Functions are “non-volatile”, which means they refresh only when:
A referenced cell is edited
The entire workbook is refreshed (Ctrl+Alt+F9)
Remember that you can use this functionality for many different web services over the internet that you can then analyze using Excel.

finding article by date code in google search appliance GSA

The Google Search Appliance goes through and finds out the date of each article when it crawls (last modified date is the default).
However, it doesn't turn up articles when you query by date code.
Is there any way to get the GSA to do this?
(We have a daily broadcast which people often search for by date code. Right now we have to manually put in the 4 most common date codes into the meta-keywords in order for them to be pulled up through a query)
Have you tried using inmeta:date as described in the Search Protocol Reference documentation?
Alternatively, if the date code is in the document content or the URL you could use entity recognition to extract it.
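As a rough illustration of the inmeta:date suggestion, the sketch below builds a search URL restricted to a date range. It assumes the inmeta:date:daterange:START..END query form from the Search Protocol Reference; the host name is hypothetical, build_date_query is an illustrative helper, and the collection and front end names are the appliance defaults, which may differ on your installation.

```python
from urllib.parse import urlencode


def build_date_query(host, start, end,
                     collection="default_collection",
                     frontend="default_frontend"):
    """Build a GSA search URL limited to documents dated start..end (YYYY-MM-DD).

    Assumes the inmeta:date:daterange syntax; check it against the
    Search Protocol Reference for your appliance version.
    """
    params = {
        "q": f"inmeta:date:daterange:{start}..{end}",
        "site": collection,
        "client": frontend,
        "output": "xml_no_dtd",  # raw XML results, no stylesheet applied
    }
    return f"http://{host}/search?{urlencode(params)}"


# gsa.example.com is a placeholder host.
print(build_date_query("gsa.example.com", "2017-09-01", "2017-09-30"))
```

A query like this only returns results if the appliance has actually recorded a date for each document, so it is worth confirming that first.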
One way to make sure GSA is collecting the document date is to check the search results in XML format and see if the FS tag (<FS NAME="date" VALUE="..."/>) has the date value. You can see the results in XML format by removing any proxystylesheet parameter in the URL.
If the value of that tag is empty then GSA is not getting the document dates.
You can configure the document dates under Crawl and Index > Document Dates (at least in GSA version 7). We are using a meta tag approach: we put a date meta tag in each document/page and tell GSA to use this meta tag to sort the documents. The full list of options is:
URL
Meta Tag
Title
Body
Last Modified
Here are some links that helped me to find answers when dealing with a similar problem:
https://support.google.com/gsa/answer/2675414?hl=en
https://developers.google.com/search-appliance/documentation/64/xml_reference#request_sort_by_date
https://groups.google.com/forum/#!searchin/google-search-appliance-help/sort$20by$20date$20not$20working

Can I copy all tables from a URL?

My task is to copy tables from a public domain and format them later in Word. I have created a program where I just have to enter two values and the table is displayed to me on a web page. Then I have to copy this table into Word.
I was wondering if there was an easier way to achieve this....
I would even like to know if it is possible to store all the values I type to a TXT file or Excel sheet and programmatically copy the displayed web pages to Word.
Please help me and don't down-vote.....
Okay here are the detailed steps:
Open a webpage
Fill in a form with 4 fields
A new webpage opens based on what input you provide
Copy 2 tables from that webpage
Paste the 2 tables in MS Word 2007
Open browser again and go back to previous page
Enter new values in the webpage
Repeat all the steps
P.S There are more than 700 tables to be copied each week
I'm not sure this is what you need...anyway...
If you download the page (programmatically of course) you can parse it as XML (I assume it's a well-formed XML file; otherwise you may have to use some dirty trick to find all the tables). Then you can put all the data into Word (by automation; you can even do all of this from a Word macro: just download the HTML file, "parse" it to find the tables, and paste that text as HTML).
I would provide some example but it can't really be language-agnostic.
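For what it's worth, here is one possible sketch of the "find all tables" step in Python, using only the standard library so it also copes with HTML that is not well-formed XML. TableExtractor and extract_tables are illustrative names, the SAMPLE string is made up, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Parse an HTML string and return all tables found in it."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up page content with two tables.
SAMPLE = (
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Bolt</td><td>4</td></tr></table>"
    "<table><tr><td>Total</td><td>4</td></tr></table>"
)
print(extract_tables(SAMPLE))
```

The page itself could be fetched with urllib.request.urlopen and the decoded text passed to extract_tables; writing the rows out as CSV or tab-separated text gives something Word or Excel can import in bulk.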
