JsonPath expression filtering based on "contains"? - jsonpath

I have a sample Json response as followed below
I know how to filter it using comparison actions like == or >
e.g. I can use $.books[?(#.pages > 460)] to retrieve books with more than 460 pages or
similarly $.books[?(#.pages != 352)] to retrieve books with not 352 pages.
But how can I filter this Json to retrieve books with title containing "Java" substring or published in 2014 year (actually substring too)?
The sample Json is:
{
"books": [
{
"isbn": "9781593275846",
"title": "Eloquent JavaScript, Second Edition",
"subtitle": "A Modern Introduction to Programming",
"author": "Marijn Haverbeke",
"published": "2014-12-14T00:00:00.000Z",
"publisher": "No Starch Press",
"pages": 472,
"description": "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications.",
"website": "https://eloquentjavascript.net/"
},
{
"isbn": "9781449331818",
"title": "Learning JavaScript Design Patterns",
"subtitle": "A JavaScript and jQuery Developer's Guide",
"author": "Addy Osmani",
"published": "2012-07-01T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 254,
"description": "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you.",
"website": "https://www.addyosmani.com/resources/essentialjsdesignpatterns/book/"
},
{
"isbn": "9781449365035",
"title": "Speaking JavaScript",
"subtitle": "An In-Depth Guide for Programmers",
"author": "Axel Rauschmayer",
"published": "2014-02-01T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 460,
"description": "Like it or not, JavaScript is everywhere these days-from browser to server to mobile-and now you, too, need to learn the language or dive deeper than you have. This concise book guides you into and through JavaScript, written by a veteran programmer who once found himself in the same position.",
"website": "https://speakingjs.com/"
},
{
"isbn": "9781491950296",
"title": "Programming JavaScript Applications",
"subtitle": "Robust Web Architecture with Node, HTML5, and Modern JS Libraries",
"author": "Eric Elliott",
"published": "2014-07-01T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 254,
"description": "Take advantage of JavaScript's power to build robust web-scale or enterprise applications that are easy to extend and maintain. By applying the design patterns outlined in this practical book, experienced JavaScript developers will learn how to write flexible and resilient code that's easier-yes, easier-to work with as your code base grows.",
"website": "https://chimera.labs.oreilly.com/books/1234000000262/index.html"
},
{
"isbn": "9781593277574",
"title": "Understanding ECMAScript 6",
"subtitle": "The Definitive Guide for JavaScript Developers",
"author": "Nicholas C. Zakas",
"published": "2016-09-03T00:00:00.000Z",
"publisher": "No Starch Press",
"pages": 352,
"description": "ECMAScript 6 represents the biggest update to the core of JavaScript in the history of the language. In Understanding ECMAScript 6, expert developer Nicholas C. Zakas provides a complete guide to the object types, syntax, and other exciting changes that ECMAScript 6 brings to JavaScript.",
"website": "https://leanpub.com/understandinges6/read"
},
{
"isbn": "9781491904244",
"title": "You Don't Know JS",
"subtitle": "ES6 & Beyond",
"author": "Kyle Simpson",
"published": "2015-12-27T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 278,
"description": "No matter how much experience you have with JavaScript, odds are you don’t fully understand the language. As part of the 'You Don’t Know JS' series, this compact guide focuses on new features available in ECMAScript 6 (ES6), the latest version of the standard upon which JavaScript is built.",
"website": "https://github.com/getify/You-Dont-Know-JS/tree/master/es6%20&%20beyond"
},
{
"isbn": "9781449325862",
"title": "Git Pocket Guide",
"subtitle": "A Working Introduction",
"author": "Richard E. Silverman",
"published": "2013-08-02T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 234,
"description": "This pocket guide is the perfect on-the-job companion to Git, the distributed version control system. It provides a compact, readable introduction to Git for new users, as well as a reference to common commands and procedures for those of you with Git experience.",
"website": "https://chimera.labs.oreilly.com/books/1230000000561/index.html"
},
{
"isbn": "9781449337711",
"title": "Designing Evolvable Web APIs with ASP.NET",
"subtitle": "Harnessing the Power of the Web",
"author": "Glenn Block, et al.",
"published": "2014-04-07T00:00:00.000Z",
"publisher": "O'Reilly Media",
"pages": 538,
"description": "Design and build Web APIs for a broad range of clients—including browsers and mobile devices—that can adapt to change over time. This practical, hands-on guide takes you through the theory and tools you need to build evolvable HTTP services with Microsoft’s ASP.NET Web API framework. In the process, you’ll learn how design and implement a real-world Web API.",
"website": "https://chimera.labs.oreilly.com/books/1234000001708/index.html"
}
]
}

Raw JSON Path doesn't do this, but depending on the implementation you're using, it might be supported.
We're currently in progress writing a formal specification. One of the open issues is the breadth of expression support we want. Feel free to add a comment. Once the spec is defined and published, I imagine most implementations will update to adhere.

As #gregsdennis said, this is totally implementation-dependent as of now.
In a given JSONPath implementation,
this functionality does not exist
there is a contains() function or similar,
e.g. in Jawways syntax: $..[?(#.title contains 'Java')]
there is a regular expression feature that allow substring matches,
e.g again Jayway's Jsonpath has a =~ regex filter: $..[?(#.title =~ /.*Java.*?/)]
Divert into script evaluation in a filter (JavaScript, absent in most other host languages),
e.g. in Goessner's and any JS implementation:
$..[?(#.title && #.title.includes('Java'))]
Note: includes() is an ES6 function, the actual JSONPath syntax is also a bit idiosyncratic.
Christoph Burgmer maintains a nice JSONPath comparison table.

If you want to extract the value using rest assured you can use this part of the code for getting the needed values.
.get("/endpooint")
.then()
.extract()
.path("books.find{it.title =~ 'Java'}");

Related

Get Twitter handle for an NFT contract address

I'm looking to get the official Twitter handle from an OS Verified project, programmatically.
I've tried calling the "collections" OS API, but the twitter_username field seems to rarely be populated with anything but "null" even for verified projects.
I've tried scraping the data manually with fetch but I got 1020 errors likely due to cloudflare protections.
Has anyone else used Moralis NFT or some other GraphHQ like service to obtain the Twitter handle of a given NFT project (starting out with a contract address)?
You can try Ubiquity API. They have an endpoint for collection metadata which also includes the Twitter account name
https://ubiquity.api.blockdaemon.com/v1/nft/ethereum/mainnet/collection?contract_address=0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D
This will give you the following response
{
"collection": {
"id": "4203aedd-7964-5fe1-b932-eb8c4fda7822",
"name": "Bored Ape Yacht Club",
"description": "The Bored Ape Yacht Club is a collection of 10,000 unique Bored Ape NFTs— unique digital collectibles living on the Ethereum blockchain. Your Bored Ape doubles as your Yacht Club membership card, and grants access to members-only benefits, the first of which is access to THE BATHROOM, a collaborative graffiti board. Future areas and perks can be unlocked by the community through roadmap activation. Visit www.BoredApeYachtClub.com for more details.",
"logo": "https://ubiquity.api.blockdaemon.com/v1/nft/media/ethereum/mainnet/collection/4203aedd-7964-5fe1-b932-eb8c4fda7822/logo.png",
"banner": "https://ubiquity.api.blockdaemon.com/v1/nft/media/ethereum/mainnet/collection/4203aedd-7964-5fe1-b932-eb8c4fda7822/banner.jpeg",
"verified": true,
"contracts": [
{
"address": "0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D",
"name": "BoredApeYachtClub",
"symbol": "BAYC",
"description": "",
"image_url": "https://ubiquity.api.blockdaemon.com/v1/nft/media/ethereum/mainnet/",
"type": "ERC721"
}
],
"meta": {
"discord_url": "https://discord.gg/3P5K3dzgdB",
"external_url": "http://www.boredapeyachtclub.com/",
"twitter_username": "BoredApeYC"
},
"sub_collection": []
}

How to reverse engineer POST request's body generation

I'm trying to scrape reviews from Google Play. Google Play loads reviews dynamically after page has been scrolled to the end. I intercepted post requests that browser sends for retrieving reviews and noticed that the only thing that changes per request is the request's body. What I'm struggling to understand is how the request's body is generated.
The first request's body looked like this:
f.req: [[["UsvDTd","[null,null,[2,null,[40,null,\"CpUBCpIBKm0KOfc7ms0D_z7jKJielp7Fz8_Pz8_Pms3OzpuZyJvMnMXOxYmSxc3MyczPz8vIycjMysbHxszPysb__hAoITbZQaENmbWoMU2VCwWZPGwZOdccwQD8MmXEUABaCwlwT4zmNQBa2BADYMm1lu0EMiEKHwodYW5kcm9pZF9oZWxwZnVsbmVzc19xc2NvcmVfdjI\"],null,[]],[\"com.feelingtouch.zf3d\",7]]",null,"generic"]]]
and this's is the second request:
f.req: [[["UsvDTd","[null,null,[2,null,[40,null,\"CpUBCpIBKm0KOfc7msyg_28-Rpielp7Fz8_Pz8_Pm56eypyZzcycm8XOxYmSxc3MyczPz8vIycjMysbHxszPysb__hB4ITbZQaENmbWoMZI5V7V-7g3BObnBkABfM2XEUABaCwli2aizD1W9ExADYMm1lu0EMiEKHwodYW5kcm9pZF9oZWxwZnVsbmVzc19xc2NvcmVfdjI\"],null,[]],[\"com.feelingtouch.zf3d\",7]]",null,"generic"]]]
Can I somehow reverse engineer how the request is generated?
I tried to use Selenium, but after scrolling down few dozens time RAM usage runs up and Selenium becomes unresponsive.
The thing that changes is the pagination token. But, there are a couple of other things as well.
Here is the full encoded request body, with the parameters wrapped in #{} (number_of_results, pagination_token, and product_id).
f.req=%5B%5B%5B%22UsvDTd%22%2C%22%5Bnull%2Cnull%2C%5B2%2Cnull%2C%5B#{number_of_results}%2Cnull%2C#{pagination_token}%5D%2Cnull%2C%5B%5D%5D%2C%5B%5C%22#{product_id}%5C%22%2C7%5D%5D%22%2Cnull%2C%22generic%22%5D%5D%5D
So each time you scroll the page the pagination_token would change. They use it to retrieve the next page results.
You don't need to reverse engineer the token itself. You can find the first one when inspecting the page source, and then, for each next time you make a request to retrieve the results, the next_page_toke will be included in there. So, you just keep replacing the token until you reach the last page, and retrieve all the reviews.
Alternatively, you could use a third-party solution like SerpApi. We handle proxies, solve captchas, and parse all rich structured data for you.
Example python code for retrieving YouTube reviews (available in other libraries also):
from serpapi import GoogleSearch
params = {
"api_key": "SECRET_API_KEY",
"engine": "google_play_product",
"store": "apps",
"gl": "us",
"product_id": "com.google.android.youtube",
"all_reviews": "true"
}
search = GoogleSearch(params)
results = search.get_dict()
Example JSON output:
"reviews": [
{
"title": "Qwerty Jones",
"avatar": "https://play-lh.googleusercontent.com/a/AATXAJwSQC_a0OIQqkAkzuw8nAxt4vrVBgvkmwoSiEZ3=mo",
"rating": 3,
"snippet": "Overall a great app. Lots of videos to see, look at shorts, learn hacks, etc. However, every time I want to go on the app, it says I need to update the game and that it's \"not the current version\". I've done it about 3 times now, and it's starting to get ridiculous. It could just be my device, but try to update me if you have any clue how to fix this. Thanks :)",
"likes": 586,
"date": "November 26, 2021"
},
{
"title": "matthew baxter",
"avatar": "https://play-lh.googleusercontent.com/a/AATXAJy9NbOSrGscHXhJu8wmwBvR4iD-BiApImKfD2RN=mo",
"rating": 1,
"snippet": "App is broken, every video shows no dislikes even after I hit the button. I've tested this with multiple videos and now my recommended is all messed up because of it. The ads are longer than the videos that I'm trying to watch and there is always a second ad after the first one. This app seriously sucks. I would not recommend this app to anyone.",
"likes": 352,
"date": "November 28, 2021"
},
{
"title": "Operation Blackout",
"avatar": "https://play-lh.googleusercontent.com/a-/AOh14GjMRxVZafTAmwYA5xtamcfQbp0-rUWFRx_JzQML",
"rating": 2,
"snippet": "YouTube used to be great, but now theyve made questionable and arguably stupid decisions that have effectively ruined the platform. For instance, you now have the grand chance of getting 30 seconds of unskipable ad time before the start of a video (or even in the middle of it)! This happens so frequently that its actually a feasible option to buy an ad blocker just for YouTube itself... In correlation with this, YouTube is so sensitive twords the public they decided to remove dislikes. Why????",
"likes": 370,
"date": "November 24, 2021"
},
...
],
"serpapi_pagination": {
"next": "https://serpapi.com/search.json?all_reviews=true&engine=google_play_product&gl=us&hl=en&next_page_token=CpEBCo4BKmgKR_8AwEEujFG0VLQA___-9zuazVT_jmsbmJ6WnsXPz8_Pz8_PxsfJx5vJns3Gxc7FiZLFxsrLysnHx8rIx87Mx8nNzsnLyv_-ECghlTCOpBLShpdQAFoLCZiJujt_EovhEANgmOjCATIiCiAKHmFuZHJvaWRfaGVscGZ1bG5lc3NfcXNjb3JlX3YyYQ&product_id=com.google.android.youtube&store=apps",
"next_page_token": "CpEBCo4BKmgKR_8AwEEujFG0VLQA___-9zuazVT_jmsbmJ6WnsXPz8_Pz8_PxsfJx5vJns3Gxc7FiZLFxsrLysnHx8rIx87Mx8nNzsnLyv_-ECghlTCOpBLShpdQAFoLCZiJujt_EovhEANgmOjCATIiCiAKHmFuZHJvaWRfaGVscGZ1bG5lc3NfcXNjb3JlX3YyYQ"
}
Check out the documentation for more details.
Test the search live on the playground.
Disclaimer: I work at SerpApi.

No large images in shares posted using LinkedIn API

During last couple of weeks any shares made using LinkedIn sharing API don't display large images, though we provide all required information for this, including image URL. The same happens when we use REST Console. Below you can see a sample request and how the share looks like.
{
"comment": "How Triggre achieves its simplicity",
"content": {
"title": "Triggre / Blog / Design Philosophy - Part 3",
"description": "In the previous two posts about our design philosophy you could read how we decided to build Triggre and why we chose simplicity as the core of our desi...",
"submitted-url": "https://www.triggre.com/en/blog/the-triggre-design-philosophy-part-3/",
"submitted-image-url": "https://www.triggre.com/media/1105/sagrada-familia.jpg?width=800"
},
"visibility": {
"code": "anyone"
}
}
A share without large image
What is happening and how could we workaround it?

To get more than 5 reviews from google places API

I am doing an application where I extract the google reviews using google places API.When I read the document related to it in "https://developers.google.com/maps/documentation/javascript/places",I found out that I could get only 5 top reviews.Is there any option to get more reviews.
In order to have access to more than 5 reviews with the Google API you have to purchase Premium data Access from Google.
That premium plan will grant you access to all sorts of additional data points you have to shell out a pretty penny.
If you are a Business owner wanting to retrieve all of your reviews, you can do so but first you have to get verified and could do this through the MyBusiness API more info here: https://developers.google.com/my-business/
There is a feature request for that: Issue 7630: Response to Include More Than 5 Reviews ─ I'd recommend you "star" it to receive updates.
Unfortunately there's no way to get more than 5 reviews in the places API unless you are the business owner after getting verified as Tekill said.
But it looks like there are some external services that can get all the reviews. My guess is that they scrape them from Google Maps directly:
Some of these services are Wextractor, ReviewShake and AllReviews
Alternatively, you can use a third party solution like SerpApi to scrape all the reviews of any place. It's a paid API with a free trial.
Each page fatches 10 results. To implement the pagination just use the start parameter which defines the result offset (e.g., 0 (default) is the first page of results, 10 is the 2nd page of results, 20 is the 3rd page of results, etc.)
Example python code (available in other libraries also):
from serpapi import GoogleSearch
params = {
"engine": "google_maps_reviews",
"place_id": "0x89c259a61c75684f:0x79d31adb123348d2",
"api_key": "SECRET_API_KEY"
}
search = GoogleSearch(params)
results = search.get_dict()
reviews = results['reviews']
Example output:
"reviews": [
{
"user": {
"name": "Waylon Bilbrey",
"link": "https://www.google.com/maps/contrib/107691056156160235121?hl=en-US&sa=X&ved=2ahUKEwiUituIlpTvAhVYCc0KHbvTCrgQvvQBegQIARAx",
"thumbnail": "https://lh3.googleusercontent.com/a-/AOh14GjOj6Wjfk1kSYjhvH7WIBNMdl4nPj6FvUhvYcR6=s40-c0x00000000-cc-rp",
"reviews": 1
},
"rating": 4,
"date": "a week ago",
"snippet": "I've been here multiple times. The coffee itself is just average to me. The service is good (the people working are nice). The aesthetic is obviously what brings the place some fame. A little overpriced (even for NY). A very small cup for $6 where I feel like the price comes from the top rainbow foam decor , when I'm going to cover it anyways. If it's for an insta pic then it may be worth it?"
},
{
"user": {
"name": "Amber Grace Sale",
"link": "https://www.google.com/maps/contrib/106390058588469541899?hl=en-US&sa=X&ved=2ahUKEwiUituIlpTvAhVYCc0KHbvTCrgQvvQBegQIARA7",
"thumbnail": "https://lh3.googleusercontent.com/a-/AOh14Gj84nHu_9V_0V4yRbZcr-8ZTYAHua6gUBP8fC7W=s40-c0x00000000-cc-rp-ba3",
"local_guide": true,
"reviews": 33,
"photos": 17
},
"rating": 5,
"date": "2 years ago",
"snippet": "They really take pride in their espresso roast here and the staff is extremely knowledgeable on the subject. It’s also a GREAT place to do work although a table is no guarantee; you might have to wait for a bit. My almond milk cappuccino was very acidic at the end which wasn’t expected but I could still tell the bean was high quality. Their larger lattés they put in a tall glass cup which looks really really cool. Would definitely go again.",
"likes": 2,
"images": [
"https://lh5.googleusercontent.com/p/AF1QipMup24_dHrWtNN4ZD70EPsiRMf_tykcUkPw6A1H=w100-h100-p-n-k-no"
]
},
{
"user": {
"name": "Kelvin Petar",
"link": "https://www.google.com/maps/contrib/100859090874785206875?hl=en-US&sa=X&ved=2ahUKEwiUituIlpTvAhVYCc0KHbvTCrgQvvQBegQIARBG",
"thumbnail": "https://lh3.googleusercontent.com/a-/AOh14GhdIvUDamzfPqbYIpwhnGJV2XWSi77iVXfEsiKS=s40-c0x00000000-cc-rp",
"reviews": 3
},
"rating": 4,
"date": "3 months ago",
"snippet": "Stumptown Cafe is the perfect place to work or catch up with friends. Never too loud, never too dead. Their lattes and deliciously addicting and the toasts are tasty as well. Wifi is always fast, which is a huge plus! The staff are the friendliest, I highly recommend this place!"
},
...
]
You can check out the documentation for more details.
Disclaimer: I work at SerpApi.
Adding to the answer of #miguev, there's at the moment no way to get more than 5 top reviews without using premium APIs (according to a Google Maps guy I had a talk with) and that's pricey.
We tried to sign for The Google Maps Platform Premium Plan to show them on pages like this but Google said it's no longer available for sign up or new customers. Right now we're limited to 5 reviews only.

Webmaster tool validation for JSON + ld structured data

I am using Structured Data Markup for a product page. And using json + ld and have put the script tags in the header section.
The script section is marked by two lines. And when I validate the JSON segment in http://json-ld.org/playground/index.html it gives a proper output. But when I validate it with Google webmaster tools it does not show me the content.
http://www.google.com/webmasters/tools/richsnippets
Kindly help me in identifying any issue with this. I think I have made this correctly.
For More Info , I am posting the JSON segment with this.
<script type="application/ld+json"> {
"#context": "http://schema.org",
"#type": "Product",
"productID": "mpn:20AV002YMB",
"description": "Whether you re an overachiever or an up-and-comer, ThinkPad laptops are the tools you need. From the thin-and-light wonder to the heavy-duty mobile workstation, they re built, tested and enhanced for reliability, durability and speed. Enjoy superior web conferencing on your L540, which includes a low-light sensitive 720p HD webcam with wide-angle viewing.",
"url": "http://www.abc.lk/lenovo/business-notebook/thinkpad-l540-core-i5-4200m-4gb-500gb-15-6in-1080p-dvdrw-win7-pro-win8-pro-azb/20av002ymb/product-details/m852r253.aspx",
"name": "ThinkPad L540 Core i5-4200m / 4GB 500GB 15.6in 1080p Dvdrw Win7 Pro / Win8 Pro Azb",
"image": "https://www.abc.lk/ProductImages/images/M852R253.jpg",
"model": "ThinkPad L540",
"manufacturer": {
"name": "Lenovo",
"url": "http://www.abc.lk/en/lenovo"
},
"offers": {
"#type": "Offer",
"availability": "http://schema.org/OutOfStock",
"price": "706.97",
"priceCurrency": "EUR"
}
}
</script>
Thank You
Google Webmaster Tools does not support JSON-LD (at time of writing) However Google Search does.
You can see this thread for more information:
https://productforums.google.com/forum/#!topic/webmasters/9lzzOw2W-dk
and use Google's beta markup tester for email:
https://www.google.com/webmasters/markup-tester/ (which will also work for web HTML) - Your JSON validates here
I know this is an old thread, but I thought it might be useful for anyone searching for answers to this who lands here.
Now Google has created a utility that will detect RDFa and JSON-LD (maybe Microdata too, Haven't tried.)
https://developers.google.com/webmasters/structured-data/testing-tool/

Resources