Let's imagine I have this query in Graphite:
level1.level2.*.level4.count
and I want my alias to be:
level is: level3
but I simply can't find a way to do it. After reading the Graphite docs for a few hours, I still can't find a simple way.
Even more challenging would be a regex function, e.g.:
aliasByRegex(
query,
'level is: level$1',
'/level2.level[0-9].level4/')
Is it possible to do something like this in graphite?
You can simply use aliasByNode to extract the exact level, and then aliasSub:
aliasSub(aliasByNode(level1.level2.*.level4.count, 2), "(.*)", "level is \1")
Of course you could use aliasSub alone, but IMO that would make the regex way more complicated.
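For illustration only, here is a rough Python analogue (not Graphite code) of what the two functions combine to do, using a hypothetical series name that the wildcard query would match:

```python
import re

# Hypothetical series name matched by level1.level2.*.level4.count
metric = "level1.level2.level3.level4.count"

# aliasByNode(..., 2) keeps the node at index 2 (zero-based)
node = metric.split(".")[2]

# aliasSub(..., "(.*)", "level is \1") wraps that node in the template
alias = re.sub(r"(.*)", r"level is \1", node, count=1)
print(alias)  # level is level3
```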
On input, consider a DB dump (from DBeaver) having this format:
{
"select": [
{<row1>},
{<row2>}
],
"select": {}
}
Say I'm debugging a bigger script and just want to see the first few rows from the first statement. How can I do that efficiently on a rather huge file?
Template:
jq 'keys[0] as $k|.[$k]|limit(1;.[])' dump
isn't really great, as it needs to fetch all keys first. The template
jq '.[0]|limit(1;.[])' dump
sadly does not seem to be valid, and
jq 'first(.[])|limit(1;.[])' dump
does not seem to have any performance benefit.
What would be the best way to access just the first field in an object without testing its name or caring about the rest of the fields?
One strategy would be to use the --stream command-line option. It's a bit tricky to use, but if you want to use jq or gojq, it's the way to go for a space-time-efficient solution on a large input.
Far easier to use would be my jm script, which is intended precisely to achieve the kind of objective you describe. In particular, please note its --limit option. E.g. you could start with:
jm -s --limit 1
See
https://github.com/pkoppstein/jm
See also: How to read a 100+GB file with jq without running out of memory
Given that weird object with identical keys, you can use the --stream option to access all items before the JSON processor would eliminate the duplicates, fromstream and truncate_stream to dissect the input, and limit to reduce the output to just a few items:
jq --stream -cn 'limit(5; fromstream(2|truncate_stream(inputs)))' dump.json
{<row1>}
{<row2>}
{<row3>}
{<row4>}
{<row5>}
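As an aside, jq's --stream view is also why both duplicate "select" keys are visible at all; most JSON parsers silently keep only one of them. A small Python sketch (using a hypothetical, minimal stand-in for the dump) shows the same duplicate-key issue outside jq:

```python
import json

# Minimal stand-in for the DBeaver dump, with the duplicate "select" keys
doc = '{"select": [{"id": 1}, {"id": 2}], "select": {}}'

# Default parsing silently collapses duplicate keys
collapsed = json.loads(doc)
print(list(collapsed.keys()))  # ['select']

# object_pairs_hook exposes every (key, value) pair instead
pairs = json.loads(doc, object_pairs_hook=lambda p: p)
print(len(pairs))      # 2 -- both "select" entries survive
print(pairs[0][0])     # select
```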
When I type the string "cli", I get results like "client 1", "client 2", etc.
But when I type "lie", I don't get any results. It seems that wildcards are added only at the end.
How do I add this feature to my site?
It's not supported by ZCTextIndex, see http://docs.zope.org/zope2/zope2book/SearchingZCatalog.html#searching-zctextindexes
I fear that switching to Solr-based search (through collective.solr or other integrations) will not help either.
Products.TextIndexNG3 (https://pypi.python.org/pypi/Products.TextIndexNG3) supports wildcards at the beginning and much more.
It's even possible to define synonyms yourself, so a search for 'kitten' also returns documents containing 'cat'.
It works fine for Plone 4.x; I didn't try Plone 5. And as @keul mentioned, there is not much development going on for this add-on either, because the trend is to use specialized search services.
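The underlying reason leading wildcards are rarely supported is index shape: a sorted term index answers a prefix query ("cli*") cheaply, while an infix query ("*lie*") forces a scan of every term. A toy Python sketch (with a hypothetical term list) of that difference:

```python
import bisect

terms = sorted(["client 1", "client 2", "customer", "server"])

# Prefix search ("cli*"): binary search jumps straight to the right region
start = bisect.bisect_left(terms, "cli")
prefix_hits = []
for t in terms[start:]:
    if not t.startswith("cli"):
        break  # sorted order: once prefixes stop matching, we are done
    prefix_hits.append(t)
print(prefix_hits)  # ['client 1', 'client 2']

# Infix search ("*lie*"): no shortcut, every term must be examined
infix_hits = [t for t in terms if "lie" in t]
print(infix_hits)  # ['client 1', 'client 2']
```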
I am trying to retrieve data from Amazon. The URL is here:
http://www.amazon.com/Logitech-Wireless-Marathon-3-year-Battery/product-reviews/B003TG75EG/ref=cm_cr_dp_see_all_summary?ie=UTF8&showViewpoints=1&sortBy=byRankDescending
It is a product review page. I find that the data is between these two tags, as below:
<div style="margin-bottom:0.5em;">
395 of 405 people found the following review helpful
</div>
The problem is that other info is also contained between these two tags. Does anyone have a good idea for how to retrieve this data?
Thank you.
Your question is a bit unclear, but I would guess you actually want to get back the 395, not the whole text.
You can get back the element like so (which I think is a better solution, as markup and class names can easily change, but the ID revMHRL will likely stay):
//div[@id = "revMHRL"]/div/div/span[contains(@class, "a-size-small")][contains(@class, "a-color-secondary")]
and to extract the number you can do
tokenize(normalize-space(//div[@id = "revMHRL"]/div/div/span[contains(@class, "a-size-small")][contains(@class, "a-color-secondary")]/text()), "\s+")[1]
This removes leading and trailing whitespace first, then tokenizes the string on whitespace, returning only the first token.
I assume you want to extract from the first review.
Also, I assume you only have XPath 1.0 functions and not XPath 2.0, therefore no tokenize function is available.
First, the expressions suggested so far rely too much on the structure of the page, which Amazon changes frequently. That means they can fail within a few days. A better expression to select the node you want is
//*[@id='revMH']/h3/following::node()[contains(text(),'people found the following review helpful')][1]
because it's unlikely that Amazon will change the text shown to the user.
Once we have that, to extract the 395 you can use:
substring-before(//*[@id='revMH']/h3/following::node()[contains(text(),'people found the following review helpful')][1]," of")
In case you want 395 of 405 just use substring-before(.....,' people'), and then split the two numbers in your host language.
You can even use translate to get text like 395 / 405, with
translate(normalize-space(//div[@id = "revMHRL"]/div/div/span[contains(@class, "a-size-small")][contains(@class, "a-color-secondary")]/text()),"of",'/')
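The "split the two numbers in your host language" step might look like this in Python (the text value below is a hypothetical sample of what the XPath returns):

```python
import re

# Hypothetical node text returned by the XPath above
text = "  395 of 405 people found the following review helpful  "

# Pull out both numbers around "of" in one pass
m = re.search(r"(\d+)\s+of\s+(\d+)", text)
helpful, total = int(m.group(1)), int(m.group(2))
print(helpful, total)  # 395 405
```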
Please try this XPath:
//div[@class='a-section']/div[@class='a-row a-spacing-micro']/span[@class='a-size-small a-color-secondary']/text()
Which one is preferred in terms of performance?
a[href*="op.ExtSite.com/p"]
a[href*="shop.ExtSite.com/page"]
a[href^="http://shop.ExtSite.com/page"]
a[href^="http://shop.ExtSite.com/page"][href$=".html"]
Update
The last selector should have been written as follows:
a[href^="http://shop.E"][href$=".html"]
Also, regarding this multi-condition selector, I would like to know which condition is checked first: the left one or the right one?
My guess is either this one
a[href^="http://shop.ExtSite.com/page"]
or
a[href^="http://shop.ExtSite.com/page"][href$=".html"]
as it starts matching from the beginning of the string, so all links that do not start with "h" are rejected early.
UPDATE
If you need to check against the full pattern, then go with the one below:
a[href^="http://shop.ExtSite.com/page.html"]
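Regardless of which is faster, the three attribute operators just perform string checks; a hypothetical Python analogue (with a made-up link URL) makes their semantics explicit:

```python
# Hypothetical link URL used for illustration
href = "http://shop.ExtSite.com/page-42.html"

substring_match = "shop.ExtSite.com/page" in href  # [href*="..."]
prefix_match = href.startswith("http://shop.E")    # [href^="..."]
suffix_match = href.endswith(".html")              # [href$="..."]
print(substring_match, prefix_match, suffix_match)  # True True True
```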
Is there some kind of SQL statement that I can use to search one column in my table for something similar to what I am looking for?
For example, if I am looking for something to do with a "Car" but I spell it "Kar", it must still return "Car" items.
Or
If I am looking for "My Company" and I spell it "MyCompany", it must still return "My Company".
Select * from TableName where Column Like '%company%' will return "My Company", but I need more than that, as the user will not always know how to spell. So I need something more advanced, like some small Google-style app or something...
That feature is part of the full-text services, so if you build a full-text search index, you should be able to use it.
Have a look here:
http://msdn.microsoft.com/en-us/library/ms187384.aspx
http://msdn.microsoft.com/en-us/library/ms142571.aspx
This is quite an involved problem. The quick answer is to use the SQL Server SOUNDEX algorithm, but it's pretty hopeless. Try the suggestions in this SO answer. (Updated)
Read this blog post: http://googlesystem.blogspot.com/2007/04/simplified-version-of-googles-spell.html
This is something you could implement with SQL, but it's not a built-in feature.
Another way to help your users find what they are looking for is to implement type-ahead on the search field. If the user types "my", he will get "My Company" as a suggestion and will likely go with that.
You can easily implement type ahead using jquery or other javascript libraries. Here's a list of type ahead plugins for jQuery: http://plugins.jquery.com/plugin-tags/typeahead
No. A full-text index might get you closer, depending on your requirements (what exact features are you looking for?). One option would be to roll your own .NET assembly with the desired features, add it to SQL Server (CREATE ASSEMBLY), and use that to search.
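If you do roll your own matcher (in any host language, not just .NET), Python's standard library illustrates the kind of approximate matching involved; this sketch uses difflib, which is not something SQL Server provides, and a hypothetical list of column values:

```python
import difflib

# Hypothetical column values
names = ["Car", "My Company", "Bike Shop"]

# "Kar" is close enough to "Car" by similarity ratio
hits_kar = difflib.get_close_matches("Kar", names)
print(hits_kar)  # ['Car']

# "MyCompany" still finds "My Company" despite the missing space
hits_company = difflib.get_close_matches("MyCompany", names)
print(hits_company)  # ['My Company']
```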