IMPORTXML function in Google Sheets


Using the IMPORTXML function, is it possible to construct an XPATH query that pulls the Industry value for a given Wikipedia page?
For example, the value I want to pull from this page - https://en.wikipedia.org/wiki/Target_Corporation - is "Retail" whereas on this page - https://en.wikipedia.org/wiki/Boohoo.com - it would be "Fashion".

You want an xpath that retrieves the Industry value from a given Wikipedia page.
If my understanding is correct, how about the following formula? Please think of this as just one of several possible answers.
Sample formula:
=IMPORTXML(A1,"//th[text()='Industry']/following-sibling::td")
The xpath is //th[text()='Industry']/following-sibling::td.
In this case, the URL of https://en.wikipedia.org/wiki/Target_Corporation or https://en.wikipedia.org/wiki/Boohoo.com is put in the cell "A1".
Reference:
XPath Axes
Added:
From your reply, I understand that you want to add 2 more URLs. The full list of URLs is as follows.
https://en.wikipedia.org/wiki/Target_Corporation
https://en.wikipedia.org/wiki/Boohoo.com
https://en.wikipedia.org/wiki/Woot
https://en.wikipedia.org/wiki/TripAdvisor
Issue and workaround:
For the above URLs, the formula =IMPORTXML(A1,"//th[text()='Industry']/following-sibling::td") returns Retail, Fashion, Retail, and "Travel, services" (two values), respectively.
When the xpath is modified to //th[text()='Industry']/following-sibling::td/a, Retail, #N/A, #N/A, and Travel are returned.
This is caused by the following difference in the markup.
<tr>
<th scope="row">Industry</th>
<td class="category">Travel services</td>
</tr>
and
<tr>
<th scope="row" style="padding-right:0.5em;">Industry</th>
<td class="category" style="line-height:1.35em;">Retail</td>
</tr>
and
<tr>
<th scope="row" style="padding-right:0.5em;">Industry</th>
<td class="category" style="line-height:1.35em;">Fashion</td>
</tr>
Because of this, I think that, unfortunately, Travel, Retail and Fashion cannot all be retrieved directly with a single xpath. So I used a built-in function for this situation.
Workaround:
In this workaround, I used INDEX. Please think of this as just one of several answers.
=INDEX(IMPORTXML(A1,"//th[text()='Industry']/following-sibling::td"),1,1)
The xpath is //th[text()='Industry']/following-sibling::td. This is not modified.
In this case, the URL is put in the cell "A1".
When 2 values are retrieved, only the 1st one is wanted. This is why I used INDEX.
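The same selection logic can be checked outside Sheets. The following is a minimal Python sketch using only the standard library, not the way IMPORTXML itself works. `xml.etree.ElementTree` supports just a subset of XPath (it has no `following-sibling` axis), so the sketch uses the equivalent `.//tr[th='Industry']/td`. The helper names and the `ROWS` sample, including the link inside the cell, are hypothetical and only illustrate why taking the first value helps when a cell yields more than one piece of text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample row: the <a> element is an assumption, used to show a
# cell whose text is split into two pieces ("Travel" and "services").
ROWS = """
<table>
  <tr><th scope="row">Industry</th><td class="category"><a href="#">Travel</a> services</td></tr>
</table>
"""

def industry_values(markup):
    """Return every non-empty text piece of the Industry cell, in document order."""
    root = ET.fromstring(markup)
    # Equivalent of //th[text()='Industry']/following-sibling::td for this layout.
    td = root.find(".//tr[th='Industry']/td")
    if td is None:
        return []
    return [piece.strip() for piece in td.itertext() if piece.strip()]

def first_industry(markup):
    """Mimic INDEX(..., 1, 1): keep only the first value, or None if there is none."""
    values = industry_values(markup)
    return values[0] if values else None
```

For a cell that contains only plain text, such as `<td class="category">Retail</td>`, both helpers agree on the single value, so taking the first piece is harmless there.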

try:
=INDEX(IMPORTXML("https://en.wikipedia.org/wiki/Boohoo.com",
"//td[@class='category']"), 2, 1)
=INDEX(IMPORTXML("https://en.wikipedia.org/wiki/Target_Corporation",
"//td[@class='category']"), 2, 1)

Related

Append records to existing bootstrap table by wenzhixin

What I did
On page load we loaded 100 records (per page 10 records).
<table id="all" class="table table-hover table-condensed"
data-toggle="table"
data-toolbar="#toolbar"
data-url="/retrieveusers"
data-search="true"
data-multiple-search = "true"
data-trim-on-search="false"
data-show-export="true"
data-pagination="true"
data-maintain-selected="true"
data-method="post"
data-query-params="postQueryParams"
data-page-size="10"
data-page-list="[10,20,30,50,75,100,200,250,500]">
Actual Problem
When we reach the 10th page, i.e. the 100th record, we want to append the next 100 records to the existing table.
Please suggest a solution for this problem.

There is an 'append' method in the documentation. An example is here: http://issues.wenzhixin.net.cn/bootstrap-table/#methods/append.html You can catch the 'onPageChange' event, then load the next portion of your data and append it to the table.

How to click on table record by.cssselector with partial text present in code snippet

I have a table with the headers Task details and TaskTime. The Task details column contains text such as "Perform Operation roll no. 150"; clicking a cell runs a JavaScript operation, and a new page is displayed in the same window.
I want to click on a particular table cell (the 2nd, the 3rd, or any other) using the partial text roll no. with findElement(By.cssSelector(...)).
Below is the code snippet for my table, which is updated frequently as different roll numbers are added.
<tr class="Pointer" onmouseover="this.style.cursor='hand'" title="Perform" style="">
<td onclick="javascript:__doPostBack">Perform Operation roll no. 150</td>
<td onclick="javascript:__doPostBack">07 Jul 2015 05:26 PM</td>
<tr class="Pointer" onmouseover="this.style.cursor='hand'" title="Perform" style="">
<td onclick="javascript:__doPostBack">Perform Operation roll no. 161</td>
<td onclick="javascript:__doPostBack">07 Jul 2015 05:18 PM</td>
<tr class="Pointer" onmouseover="this.style.cursor='hand'" title="Perform" style="">
<td onclick="javascript:__doPostBack">Perform Operation roll no. 155</td>
<td onclick="javascript:__doPostBack">07 Jul 2015 05:13 PM</td>
Note that my xpath is not working properly, and the table above changes frequently.

Please try the below xpath:
//td[contains(text(),'roll no. 161')]
This will locate the td element whose text contains roll no. 161. You can replace 161 with any other roll number you want to locate.
OR
If you want to click on the 2nd, 3rd or any other td element whose text contains roll no., you can try these:
(//td[contains(text(),'roll no.')])[2]
This will locate the 2nd td element, in document order, whose text contains roll no..
(//td[contains(text(),'roll no.')])[3]
This will locate the 3rd td element, in document order, whose text contains roll no..
Similarly, you can replace the final index with any other number to locate the corresponding element on the page.
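To make the indexing concrete, here is a small Python sketch of what (//td[contains(text(),'roll no.')])[n] selects, written with the standard library only. `xml.etree.ElementTree` has no `contains()` function, so the substring filter is done in Python instead of in the xpath. `nth_cell_with_text` is a hypothetical helper, and `TABLE` is a simplified, well-formed version of the rows from the question.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the question's rows (event attributes dropped, rows closed).
TABLE = """
<table>
  <tr class="Pointer"><td>Perform Operation roll no. 150</td><td>07 Jul 2015 05:26 PM</td></tr>
  <tr class="Pointer"><td>Perform Operation roll no. 161</td><td>07 Jul 2015 05:18 PM</td></tr>
  <tr class="Pointer"><td>Perform Operation roll no. 155</td><td>07 Jul 2015 05:13 PM</td></tr>
</table>
"""

def nth_cell_with_text(markup, needle, n):
    """Return the n-th (1-based, document order) <td> whose text contains needle."""
    root = ET.fromstring(markup)
    # iter() walks the tree in document order, like the xpath node set.
    matches = [td for td in root.iter("td") if needle in (td.text or "")]
    # XPath positions start at 1; return None when n is out of range.
    return matches[n - 1] if 0 < n <= len(matches) else None
```

Because the index is applied to the whole filtered list, n counts matching cells across all rows, which is why the parentheses around the xpath matter.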

After digging deeper into CSS, I was able to locate the element in Selenium IDE:
css=tr.Pointer > td:contains(roll nos)
The roll nos above change frequently, and so does the element locator.

Ruby/Selenium Access property of cells in a table that frequently changes

I am trying to access a property of cells in a table.
<table id="m-103" class="m-row" cellspacing="0">
<a name="2"></a>
<table id="m-108" class="m-row " cellspacing="0">
<a name="3"></a>
<table id="m-191" class="m-row " cellspacing="0">
<tbody>
<tr>
<td class="m-st">
<td class="m-jk m-N">
</td>
</td>
</tr>
</tbody>
</table>
This is the xpath I have so far
.//*[@class='m-row']/tbody/tr/td[@class='m-jk']
but it will only access the cells in the first table.
I am interested in the m-N class value. Not every table has the m-N value, and I'm only interested in the ones that do. Is there a way to check only the tables that contain "m-N", or do I have to go through each one and check? If so, how do I do that? Right now I only know how to go to specific paths, so I have no clue how to iterate through each table.
How do I access the second class value "m-N"? Every css or xpath I've used does not work, and again they only target a predetermined table.
I saw an answer but the person was using jquery? Is this something I should learn and use as well? Can I if I'm using Ruby and Selenium?
How to get the second class name from element?
There are many more tables this is only 3 of them I'm showing for the example. Also the number of tables and cells changes frequently.

To get the td elements whose class attribute contains m-N, you can use the xpath function contains().
Try this:
"//td[contains(@class, 'm-N')]"
This gets a little more complex if there are also other classes that contain 'm-N', such as 'm-Nx'. Then you have to do something like this:
"//td[contains(concat( ' ', @class, ' '), ' m-N ' )]"

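The difference between the two xpath expressions in the answer above comes down to substring matching versus whole-token matching. Here is a minimal Python sketch of both checks; the function names are hypothetical, and the code only mirrors the string logic of contains() and concat(), not Selenium itself.

```python
def has_class_substring(class_attr, name):
    """Mirror contains(@class, name): true for any substring hit."""
    return name in class_attr

def has_class_token(class_attr, name):
    """Mirror contains(concat(' ', @class, ' '), ' name '): whole tokens only.

    Padding both strings with spaces means the name must be delimited by
    spaces (or by the start or end of the attribute) to match.
    """
    return f" {name} " in f" {class_attr} "
```

With the class value `m-jk m-Nx`, the substring check reports a match for `m-N` while the token check does not, which is the false positive the padded version avoids.
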
how to edit fields which are stored in textfile using classic asp

I have two fields held in two variables: the category name is stored in StrcategoryName and the URL is stored in StrcategoryUrl. Both values come from a text file and are bound to text inputs.
Now I want to edit these two values and save them back to the text file.
<tr>
<th align="left" valign="middle" class="text" scope="row"><b>Enter Category Name:</b></th>
<td align="left" valign="middle"><input type="text" name="txtcategoryname" value="<%=StrcategoryName%>" /></td>
</tr>
<tr>
<th align="left" valign="middle" class="text" scope="row"><b>Enter Url Of Category Image :</b></th>
<td align="left" valign="middle"><input type="text" name="txtName" value="<%=StrcategoryUrl%>"/></td>
</tr>
I have tried following code but I am getting an error.

You need to use the "Scripting.FileSystemObject" to read/write textfiles using ASP. Most of the examples you will find on the web show you the contents of these text files by using Response.Write to display the values.
One of the better sites for learning how to read/write text files is located here.
It sounds like you need to do 4 things:
1. Read the text file using the Scripting.FileSystemObject and store the data into ASP variables
2. Place these variables into an HTML form that posts to an update page
3. On the update page obtain the updated values using the Request object
4. Save these new values to the text file using the Scripting.FileSystemObject
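The read, edit and save flow in the steps above can be sketched as follows. This is written in Python purely to illustrate the logic; in Classic ASP the file access would go through Scripting.FileSystemObject and the new values would come from the Request object. The file layout (category name on the first line, image URL on the second) and the function names are assumptions, since the question does not show the actual file format.

```python
from pathlib import Path

def read_category(path):
    """Step 1: read the two fields from the text file.

    Assumed layout: line 1 is the category name, line 2 is the image URL.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    name = lines[0] if len(lines) > 0 else ""
    url = lines[1] if len(lines) > 1 else ""
    return name, url

def write_category(path, name, url):
    """Step 4: overwrite the file with the updated values, one per line."""
    Path(path).write_text(f"{name}\n{url}\n", encoding="utf-8")
```

Steps 2 and 3 sit between these two calls: the values returned by `read_category` fill the form inputs, and the posted values are passed to `write_category`.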

Text manipulation is not easy, because a text file is always treated as a collection of lines: we need to parse it line by line or know exactly which line holds what, and then someone comes along and changes the order... it's not pretty!
That is why we always use XML to store data in files: the framework gives us much more control over what is stored, and it's easy to edit, insert or delete something.
For Classic ASP, there are several tutorials all over the web that help you read and write XML files, and several questions here at SO.
Asp XML Parsing
ASP Classic - XML Dom
and one from my favorite site back in the day
Accessing XML Data using ASP

asp.net custom grid vs GridView/ListView

A few years ago, I decided to create my own DataGrid as I didn’t like the standard one provided by Microsoft. It’s a very simple function which takes a DataTable as an input parameter and which returns a string (the html code to display a table on a webpage).
It’s very flexible (there are some optional parameters to do the paging, sorting and to format each column the way I want) and fast (only the records which are used are retrieved from the database).
The function itself is very short (about 20ish lines of code). I’ve been using it for at least 4 years now.
Assuming you have a PlaceHolder on your webpage, this is how you would call the custom function:
MyPlaceHolder.Controls.Add(new LiteralControl(CreateCustomGrid(MyDataTable)))
CreateCustomGrid(MyDataTable) would return something like this (if MyDataTable has 2 columns and 2 rows):
<table class="MyClass" rules="all">
<tr>
<th>Column1</th>
<th>Column2</th>
</tr>
<tr>
<td align="center">Value1</td>
<td align="center">Value2</td>
</tr>
<tr>
<td align="center">Value3</td>
<td align="center"><a href='MyLink'>Value4</a></td>
</tr>
</table>
Internally the function knows how to format each column (this function is only being used on one website) but it’s also possible to change it for each individual column by using optional parameters. Same for the paging and sorting. All in all it’s very flexible and very easy to use.
Now things have changed and the DataGrid has been replaced by the GridView and/or ListView. I’ve looked at them, but I don’t see anything they do that my function doesn’t, so I would be tempted to carry on using my function. I might be overlooking something, though. At the same time, it looks a bit odd to carry on using a custom function to generate an html table. What are your views on this?

If your code works and is seasoned, I wouldn't change any existing code. For additional functionality, you might want to consider wrapping it in a custom data-bound WebControl. That way you can use datasources, etc.

I'd say that you should investigate the GridView/ListView to see what they can do. Ultimately, though, if you and your customers are happy with your own code, and it works for you and does everything you need, then there's no need to change just because there is something else out there.
