Hi everybody, this probably has a simple answer that I am overlooking (I am still learning).
I am trying to scrape data from a website. I am specifically after particular p elements that are nested inside other elements. This is what the nested selectors look like:
#ctl00_body_divSearchResult > div:nth-child(5) > div.expandable-box-content.expanded > p:nth-child(2)
#ctl00_body_divSearchResult > div:nth-child(16) > div.expandable-box-content.expanded > p:nth-child(2)
#ctl00_body_divSearchResult > div:nth-child(27) > div.expandable-box-content.expanded > p:nth-child(2)
#ctl00_body_divSearchResult > div:nth-child(38) > div.expandable-box-content.expanded > p:nth-child(2)
#ctl00_body_divSearchResult > div:nth-child(49) > div.expandable-box-content.expanded > p:nth-child(2)
Here are five examples from the same page. The first div:nth-child has a different number each time, but the rest is consistent. I am after the individual p:nth-child(2) elements.
Using this code I can get an individual p element:
numbers= agent.get(urlanzctr).css('#ctl00_body_divSearchResult > div:nth-child(5) > div.expandable-box-content > p:nth-child(2)').text
But I think it would be sloppy coding to repeat this for each individual instance.
Your agent.get(url).css(selector) approach is right: css returns an array-like set of every node that matches the selector.
Looking at your selectors, they're all of this structure:
#ctl00_body_divSearchResult >
div:nth-child(N) >
div.expandable-box-content.expanded >
p:nth-child(2)
The only variable is the N in div:nth-child.
You have the values 5, 16, 27, 38, 49,
which is 11x + 5 for x = 0, 1, 2, 3, 4.
So you could do something like this:
def make_selector(n)
  "#ctl00_body_divSearchResult > " \
    "div:nth-child(#{n}) > " \
    "div.expandable-box-content.expanded > " \
    "p:nth-child(2)"
end

# agent and url are the same objects as in your own code.
# Fetch the page once and reuse it for every selector.
page = agent.get(url)

idx = 5
all_matches = []
loop do
  current_matches = page.css(make_selector(idx))
  break if current_matches.empty?
  all_matches.concat(current_matches.to_a)
  idx += 11 # move on to the next result block
end
puts all_matches.length
You might also be able to skip the intermediary selectors,
i.e. maybe just .expandable-box-content.expanded > p would work; I have no idea what the page structure looks like.
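As a small illustration of the 11x + 5 pattern, the selectors can also be generated up front instead of being probed one at a time. This is only a sketch: selector_for and result_indices are hypothetical helper names, and the count of 5 is an assumption taken from the five examples in the question, not something the page guarantees.

```ruby
# Hypothetical helper: builds the selector for the n-th child block.
def selector_for(n)
  "#ctl00_body_divSearchResult > div:nth-child(#{n}) > " \
    "div.expandable-box-content.expanded > p:nth-child(2)"
end

# Child indices follow 11x + 5 for x = 0, 1, 2, ...
# count is an assumed number of result blocks on the page.
def result_indices(count, start: 5, step: 11)
  Array.new(count) { |x| start + step * x }
end

indices = result_indices(5)
selectors = indices.map { |n| selector_for(n) }

puts indices.inspect # prints [5, 16, 27, 38, 49]
puts selectors.first
```

Each generated selector could then be passed to css on a page that has already been fetched, so the page is only downloaded once.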
This loop goes over all the values of i in range(92:1000). At the first value of i for which the condition holds, it should store that value in c and break out of the loop. But when I run this code block in R, it gives me c=1000.
> c=0
> for (i in range(92:1000)){
+ if(dpois(i,94.32)<=dpois(5,94.32))
+ {c=i;
+ break;
+ }
+ }
> c
[1] 1000
But I expected it to give c=235, since the condition holds at i=235:
> dpois(235,94.32)
[1] 2.201473e-34
> dpois(5,94.32)
[1] 6.779258e-34
> dpois(235,94.32)<=dpois(5,94.32)
[1] TRUE
And it should break the first time the condition is true.
Where am I going wrong?
In R, range computes the range of the given data, i.e. the minimum and maximum, so your loop only runs for i=92 and i=1000, and the condition first holds at i=1000:
> range(92:1000)
[1] 92 1000
Also, using c as a variable name is very bad practice in R, since c is a built-in function used to define vectors.
The following gives the expected answer:
> c0=0
> for (i in 92:1000){
+ if(dpois(i,94.32)<=dpois(5,94.32))
+ {
+
+ c0=i
+ break
+
+ }
+ }
> c0
[1] 234
I'm trying to evaluate many conditions in R, but I'm getting these warnings:
Warning messages:
1: In if (closeV > openV) { :
the condition has length > 1 and only the first element will be used
2: In if ((highV - closeV) < Minimum) { :
the condition has length > 1 and only the first element will be used
3: In if ((openV - lowV) > Threshold) { :
the condition has length > 1 and only the first element will be used
4: In if (((openV - lowV) < Threshold)) { :
the condition has length > 1 and only the first element will be used
5: In if ((closeV - openV) < Threshold) { :
the condition has length > 1 and only the first element will be used
6: In if ((closeV - lowV) < (Threshold * 2)) { :
the condition has length > 1 and only the first element will be used
This is a huge nest of ifs. It is not optimized right now, but I can't get it to work because of that warning.
There are around 40 ifs in that function. Any idea what I need to do to get around this warning?
The code looks something like this:
if(closeV>openV)#1 First we check if we have a positive value
{
if((highV-closeV)<Minimum)
{
if((openV-lowV) >Threshold)
{
if((closeV-openV)<Threshold)
{
#3.1 This is a Hammer with positive movement
if((closeV-lowV)<(Threshold*2))
{
#3.1.1 not much movement
return(X*2)
}
else if((closeV-lowV)>(Threshold*2))
{
#3.1.2 a lot of movement
return(X*3)
}
}
else if((closeV-openV)>Threshold)
{
#3.2 Hammer but with a lot of movement
if((closeV-lowV)<(Threshold*2))
{
#3.2.1 not much movement
return(X)
}
else if((closeV-lowV)>(Threshold*2))
{
#3.2.2 a lot of movement
return(X*5)
}
}
}
else if(((openV-lowV)<Threshold)
and it keeps on going through a lot of possibilities.
The issue is not the nested if statements but the data you feed into them. if expects a single TRUE or FALSE, and the warning tells you that each condition is a vector with more than one element, so only its first element is used.
While
a = seq(1, 10, 1)
b = seq(0, 18, 2)
if (a>b){
print(a)
} else{
print(b)
}
throws the same warning messages you get,
a = seq(1, 10, 1)
b = seq(0, 18, 2)
for (i in 1:10) {
if (a[i]>b[i]){
print(a[i])
} else{
print(b[i])
}
}
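in contrast, runs without warnings.
Also, please note that although both pieces of code run, they give very different results: the first decides once, based only on a[1] and b[1], and prints a whole vector, while the loop compares the vectors element by element.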
I used the following tutorial to learn the R programming language:
R programming language
v <- c("Hello","loop")
cnt <- 2
repeat {
print(v)
cnt <- cnt+1
if(cnt > 5) {
break
}
}
But when I run this code, it gives me the following error:
> if(cnt > 5) {
+ break
+ }
Error: object 'cnt' not found
>
This example was taken from the tutorial itself. What is wrong with the given code?
I am using the tds data structure for data manipulation in Torch. I would like to know how I can select a subset of the values in this data structure.
E.g., in Python,
> a = [1,2,3,4,5]
> a[1:3]
[2,3]
In Lua/Torch,
> a = torch.Tensor({1,2,3,4,5})
> a[{{1,3}}]
1
2
3
What is the equivalent of the above operations in tds, if any?
Game: My game is a simple game that takes a list of words from a txt file and puts them onto a grid. Then the words are shuffled (there are 9 words displayed on a 3*3 grid and one is replaced by the spare word not used). The user then has to guess which word was replaced and which word replaced it. If the user is correct, they move on to a harder level, which is a 4*4 grid.
Issue: I have been trying to verify inputs from a user by checking the words in the list by the position of the word that's been shuffled, so I am trying to check which word is in the tenth position of the list as that is the word that has been replaced.
Code Scripts:
"Global_Variables" -
> globalvar WordCount;
> globalvar WordColumn;
> globalvar WordRow;
> globalvar WordList;
> globalvar GridList;
> globalvar LineGap;
> globalvar WildCard;
> globalvar BoxSize;
> globalvar BoxIndent;
> globalvar BoxHeader;
> globalvar TimerIndent;
"Readfile" -
> filop = file_text_open_read(working_directory + "Words.txt");
> wordgridlist = ds_list_create();
> gridcreate = ds_list_create();
> while(!file_text_eof(filop)){
>     line = string_upper(file_text_readln(filop));
>     ds_list_add(wordgridlist, line);
> }
> file_text_close(filop);
> wordgridlistshuffled = ds_list_shuffle(wordgridlist)
"Output" -
> draw_set_colour(c_red)
> draw_set_font(Text_Font)
> Text_Grid = 0
> for (X=0; X<3; X+=1){
>     for (Y=0; Y<3; Y+=1){
>         draw_text((X*710)+250, (Y*244)+300, ds_list_find_value(wordgridlist,Text_Grid));
>         Text_Grid +=1
>     }
> }
"Word_Question" -
> WordChangedEasy = get_string("What word changed?", "");
> WordChangedEasyAnswer = ds_list_shuffle(10);
> WordReplacedEasy = get_string("What word has been replaced?", "");
I've sourced this from the GameMaker: User Manual.
ds_list_find_value
Finds the value held at a given position in the list.
Syntax:
ds_list_find_value(id, pos);
id: The id of the list to use.
pos: The position to look at, where 0 corresponds to the very beginning of the list and the final position is ds_list_size(id)-1.
Since positions start at 0, you should use ds_list_find_value(wordgridlist,9) to find the tenth value.