I am trying to request data using the httr package, following this format:
args <- list(metrics = c(list(name = "Jobs.2018", as = "Jobs 2018")),
             constraints = list(dimensionName = "Area",
                                map = list("Latah County ID" = c(16057))))
test <- POST(url = "https://agnitio.emsicloud.com/emsi.us.demographics/2018.3",
             add_headers(`authorization` = paste("bearer", token)),
             add_headers(`content-type` = "application/json"),
             body = toJSON(args, auto_unbox = TRUE),
             verbose())
I keep getting a 400 Bad Request error despite everything I have looked up and tried. Do I need to add something to the arguments that I am just not finding?
P.S. I am sorry that this isn't a reproducible example.
We'll assume (a bad thing but necessary for an answer) that you obtained token by issuing a prior POST request as indicated on the linked API page and then properly decoded the JSON web token into token.
If you did that properly, then the next likely possibility is malformed body data in the POST request.
When I look at a sample API call:
curl --request POST \
  --url https://agnitio.emsicloud.com/emsi.us.industry/2018.3 \
  --header 'authorization: bearer <access_token>' \
  --header 'content-type: application/json' \
  --data '{ "metrics": [ { "name": "Jobs.2017", "as":"2017 Jobs" }, { "name": "Establishments.2017" } ], "constraints": [ { "dimensionName": "Area", "map": { "Latah County, ID": ["16057"] } }, { "dimensionName": "Industry", "map": { "Full Service Restaurants": ["722511"] } } ] }'
that sample JSON looks like this pretty-printed:
{
  "metrics": [
    {
      "name": "Jobs.2017",
      "as": "2017 Jobs"
    },
    {
      "name": "Establishments.2017"
    }
  ],
  "constraints": [
    {
      "dimensionName": "Area",
      "map": {
        "Latah County, ID": [
          "16057"
        ]
      }
    },
    {
      "dimensionName": "Industry",
      "map": {
        "Full Service Restaurants": [
          "722511"
        ]
      }
    }
  ]
}
Yours looks like:
{
  "metrics": {
    "name": "Jobs.2018",
    "as": "Jobs 2018"
  },
  "constraints": {
    "dimensionName": "Area",
    "map": {
      "Latah County ID": 16057
    }
  }
}
when it needs to look more like this:
{
  "metrics": [
    {
      "name": "Jobs.2018",
      "as": "Jobs 2018"
    }
  ],
  "constraints": [
    {
      "dimensionName": "Area",
      "map": {
        "Latah County ID": [
          "16057"
        ]
      }
    }
  ]
}
To do that, we need to use this list structure:
list(
  metrics = list(
    list(
      name = jsonlite::unbox("Jobs.2018"),
      as = jsonlite::unbox("Jobs 2018")
    )
  ),
  constraints = list(
    list(
      dimensionName = jsonlite::unbox("Area"),
      map = list("Latah County ID" = c("16057"))
    )
  )
) -> args
Note especially that the API expects the map ID JSON data element to be character and not integer/numeric.
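If it helps to see what that difference looks like on the wire, here is a quick illustration (plain Python, used here only to show the serialized JSON; the R solution continues below):

import json

# The FIPS code serialized as a number vs. as a string; per the note above,
# the API reportedly only accepts the string form.
print(json.dumps({"Latah County ID": [16057]}))    # {"Latah County ID": [16057]}
print(json.dumps({"Latah County ID": ["16057"]}))  # {"Latah County ID": ["16057"]}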
Now, we can make the POST request like this (spaced out for answer readability as it has embedded comments):
httr::POST(
  url = "https://agnitio.emsicloud.com/emsi.us.demographics/2018.3",
  httr::add_headers(
    `authorization` = sprintf("bearer %s", token)
  ),
  encode = "json",            # this lets httr do the work for you
  httr::content_type_json(),  # easier than making a header yourself
  body = args,
  httr::verbose()
) -> res
That should work, but because it's a closed API without free registration, I cannot test it.
I would like to include synonyms in Elasticsearch using the R package elastic, preferably at search time only. I can't get this working. Hope someone can help me out. Thanks!
Here I give one example assuming that brain, mind, and smart are synonyms.
My code in R...
library(elastic)
connection <- connect()
#index_delete(connection,"test")
index_create(connection, "test")
properties <-
  '{
    "properties": {
      "sentence": {
        "type": "text",
        "position_increment_gap": 100
      }
    }
  }'
mapping_create(connection, "test", body = properties)
sentences <- data.frame(sentence = c("This is a brain", "This is a mind", "This is fun", "This is smart"))
document <- cbind(1,sentences)
colnames(document)[1] <- "document"
docs_bulk(connection,document,"test")
emptyBody <-
  '{
    "query": {
      "match_phrase": {
        "sentence": {
          "query": "this mind",
          "slop": 100
        }
      }
    }
  }'
Search(connection, "test", body = emptyBody)
... returns...
"This a mind"
But I want...
"This is a brain"
"This is a mind"
"This is smart"
Settings?...
Based on the documentation of the R package elastic and some general searches, I experimented with the following code block, putting it before the 'properties' code block, but it did not have any effect. :(
settings <- '{
  "analysis": {
    "analyzer": {
      "synonym_analyzer": {
        "tokenizer": "standard",
        "filter": ["lowercase", "synonym_filter"]
      }
    },
    "filter": {
      "synonym_filter": {
        "type": "synonym_graph",
        "synonyms": [
          "brain, mind, smart"
        ]
      }
    }
  }
}'
index_analyze(connection, "test", body = settings)
Are you using the synonym analyzer in the mapping?
"mappings": {
"properties": {
"name": {
"type": "text",
"search_analyzer": "synonym_analyzer"
}
}
}
I found the solution.
I had to create the index with particular settings (instead of using the index_analyze function).
settings <- '
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_graph_synonyms": {
            "type": "synonym_graph",
            "synonyms": [
              "mind, brain",
              "brain storm, brainstorm, envisage"
            ]
          }
        },
        "analyzer": {
          "my_index_time_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "stemmer"
            ]
          },
          "my_search_time_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "stemmer",
              "my_graph_synonyms"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "sentence": {
        "type": "text",
        "analyzer": "my_index_time_analyzer",
        "search_analyzer": "my_search_time_analyzer"
      }
    }
  }
}'
index_create(connection, "test", body = settings)
Using the example shared by Alexander Marquardt.
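If you want to sanity-check the synonym expansion outside R, one option is the _analyze API; below is a minimal sketch using the Python elasticsearch client (a 7.x client and a local cluster on port 9200 are assumptions, not part of the original answer):

from elasticsearch import Elasticsearch

es = Elasticsearch([{"host": "localhost", "port": 9200}])  # assumed local cluster

# Run the search-time analyzer directly to confirm the synonym expansion.
resp = es.indices.analyze(
    index="test",
    body={"analyzer": "my_search_time_analyzer", "text": "this mind"},
)
print([t["token"] for t in resp["tokens"]])  # "brain" should appear alongside "mind"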
I need to create records in an airtable base and have the following code in scrapy:
url = "https://api.airtable.com/v0/appuhKmlhLIIEszLm/Table%201"
payload = json.dumps({
    "records": [
        {
            "fields": {
                "Name": "Temporada 6"
            }
        },
        {
            "fields": {
                "Name": "Temporada 2"
            }
        }
    ]
})
headers = {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
}
yield scrapy.Request(method="POST", url=url, headers=headers, body=payload)
I checked the code with python requests and it works; however, when I use this scrapy code, no info is uploaded. Do you know why this is?
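For comparison, here is a minimal sketch of the plain requests call referenced above (same URL, headers, and payload; YOUR_API_KEY remains a placeholder):

import json
import requests

url = "https://api.airtable.com/v0/appuhKmlhLIIEszLm/Table%201"
payload = json.dumps({
    "records": [
        {"fields": {"Name": "Temporada 6"}},
        {"fields": {"Name": "Temporada 2"}}
    ]
})
headers = {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
}

# The plain-requests version that works, for comparison with the scrapy request.
response = requests.post(url, headers=headers, data=payload)
print(response.status_code, response.text)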
I am new to jq and have been stuck with this problem for a while. Any help is appreciated.
I have two json files,
In file1.json:
{
  "version": 4,
  "group1": [
    {
      "name": "olditem1",
      "content": "old content"
    }
  ],
  "group2": [
    {
      "name": "olditem2"
    }
  ]
}
And in file2.json:
{
  "group1": [
    {
      "name": "newitem1"
    },
    {
      "name": "olditem1",
      "content": "new content"
    }
  ],
  "group2": [
    {
      "name": "newitem2"
    }
  ]
}
Expected result is:
{
  "version": 4,
  "group1": [
    {
      "name": "olditem1",
      "content": "old content"
    },
    {
      "name": "newitem1"
    }
  ],
  "group2": [
    {
      "name": "olditem2"
    },
    {
      "name": "newitem2"
    }
  ]
}
Criteria for merge:
Has to merge only group1 and group2
Match only by name
I have tried
jq -S '.group1+=.group1|.group1|unique_by(.name)' file1.json file2.json
but this filters group1 and all other info is lost.
This approach uses INDEX to create a dictionary of unique elements based on their .name field, reduce to iterate over the group fields to be considered, and an initial state created by combining the slurped (-s) input files using add, after removing the group fields (processed separately) using del.
jq -s '
  [ "group1", "group2" ] as $gs | . as $in | reduce $gs[] as $g (
    map(del(.[$gs[]])) | add; .[$g] = [INDEX($in[][$g][]; .name)[]]
  )
' file1.json file2.json
{
  "version": 4,
  "group1": [
    {
      "name": "olditem1",
      "content": "new content"
    },
    {
      "name": "newitem1"
    }
  ],
  "group2": [
    {
      "name": "olditem2"
    },
    {
      "name": "newitem2"
    }
  ]
}
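If it helps to follow what the filter does, here is a rough Python equivalent of the same merge (illustration only; the file names and group list mirror the question):

import json

groups = ["group1", "group2"]

with open("file1.json") as f1, open("file2.json") as f2:
    docs = [json.load(f1), json.load(f2)]

# Non-group fields are combined as-is (like map(del(...)) | add).
merged = {k: v for doc in docs for k, v in doc.items() if k not in groups}

# For each group, build a dict keyed by .name (like INDEX); later files win.
for g in groups:
    index = {}
    for doc in docs:
        for item in doc.get(g, []):
            index[item["name"]] = item
    merged[g] = list(index.values())

print(json.dumps(merged, indent=2))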
I'm trying to get the repo names under a project using the Bitbucket API. The documentation says to use
curl -u username:pwd http://${bitbucket-url}/rest/api/1.0/projects/${projectkey}/repos/
Response:
{
  "size": 1,
  "limit": 25,
  "isLastPage": true,
  "values": [
    {
      "slug": "my-repo",
      "id": 1,
      "name": "My repo",
      "scmId": "git",
      "state": "AVAILABLE",
      "statusMessage": "Available",
      "forkable": true,
      "project": {
        "key": "PRJ",
        "id": 1,
        "name": "My Cool Project",
        "description": "The description for my cool project.",
        "public": true,
        "type": "NORMAL",
        "links": {
          "self": [
            {
              "href": "http://link/to/project"
            }
          ]
        }
      },
      "public": true,
      "links": {
        "clone": [
          {
            "href": "ssh://git#/PRJ/my-repo.git",
            "name": "ssh"
          },
          {
            "href": "https:///scm/PRJ/my-repo.git",
            "name": "http"
          }
        ],
        "self": [
          {
            "href": "http://link/to/repository"
          }
        ]
      }
    }
  ],
  "start": 0
}
But I only need the repo name from the response
import subprocess
import json
import os

base_dir = os.getcwd()
DETACHED_PROCESS = 0x00000008

# Set these to your Bitbucket base URL and credentials before running.
bb_url = "https://bitbucket.example.com/rest/api/1.0/projects/PRJ/repos"
bb_user = "username"
bb_pwd = "password"

page = 1
while True:
    cmd = ('curl --url "' + bb_url + '?pagelen=100&page=' + str(page) +
           '" --user ' + bb_user + ':' + bb_pwd +
           ' --request GET --header "Accept: application/json"')
    output = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                              creationflags=DETACHED_PROCESS).communicate()
    datastore = json.loads(output[0].decode("utf-8"))
    values = datastore.get("values")
    if len(values) == 0:
        break
    # "slug" holds the repository name in the API response.
    for repo in values:
        with open(base_dir + "\\repositoryList.txt", "a+") as f_initial:
            f_initial.write(repo["slug"])
            f_initial.write("\n")
    page = page + 1
This script will get the list of all the repositories in your project and write them to the file repositoryList.txt.
With a bash command:
repoNamesJson=$(curl -D- -X GET -H "Authorization: Basic <encoded user pasword here>" -H "Content-Type: application/json" https://yourstash/rest/api/1.0/projects/ad/repos?limit=100000)
repoNames=$(echo $repoNamesJson | awk -v RS=',' '/{"slug":/ {print}' | sed -e 's/{"slug":/''/g' | sed -e 's/"/''/g')
echo $repoNames
With the stashy Python library:
import stashy

bitbucket = stashy.connect("host", "username", "password")
projects = bitbucket.projects.list()
repos = bitbucket.repos.list()

for project in projects:
    for repo in bitbucket.projects["%s" % (project["key"])].repos.list():
        print(repo["name"])
        print(repo["project"]['key'])
You can use BitBucket API partial responses in order to limit the fields returned by the API.
Taking excerpts from the doc page:
[...] use the fields query parameter.
The fields parameter supports 3 modes of operation:
Removal of select fields (e.g. -links)
Pulling in additional fields not normally returned by an endpoint, while still getting all the default fields (e.g. +reviewers)
Omitting all fields, except those specified (e.g. owner.display_name)
The fields parameter can contain a list of multiple comma-separated field names (e.g. fields=owner.display_name,uuid,links.self.href). The parameter itself is not repeated.
So in your case it would be something like:
curl -u username:pwd \
  http://${bitbucket-url}/rest/api/1.0/projects/${projectkey}/repos?fields=values.slug
Though I must say that the JSON output is not flat; it will still retain its original structure:
{
  "values": [
    {
      "slug": "your repo slug #1"
    },
    ...
So, if you actually want only a list with each repo slug on its own line, there's still some leg work to do.
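For example, a short Python sketch that does that flattening (the host, project key, and credentials are hypothetical placeholders; the fields parameter is the one from the curl call above):

import requests

# Hypothetical values; substitute your Bitbucket URL, project key and credentials.
url = "http://bitbucket.example.com/rest/api/1.0/projects/PRJ/repos?fields=values.slug"
resp = requests.get(url, auth=("username", "pwd"))
resp.raise_for_status()

# Flatten {"values": [{"slug": ...}, ...]} into one slug per line.
for value in resp.json()["values"]:
    print(value["slug"])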
I have a JSON structure as given below:
{"DocumentName":"es","DocumentId":"2","Content": [{"PageNo":1,"Text": "The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."},{"PageNo":2,"Text": "The query string is processed using the same analyzer that was applied to the field during indexing."}]}
I need to get stemmed, analyzed results for the Content.Text field. For that I've created a mapping while creating the index. It is given below:
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d"{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stemmer"]
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  }
}, {
  "mappings": {
    "properties": {
      "DocumentName": {
        "type": "text"
      },
      "DocumentId": {
        "type": "keyword"
      },
      "Content": {
        "properties": {
          "PageNo": {
            "type": "integer"
          },
          "Text": "_all": {
            "type": "text",
            "analyzer": "my_analyzer",
            "search_analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}
}"
I checked the analyzer created:
curl -X GET "localhost:9200/myindex/_analyze?pretty" -H "Content-Type: application/json" -d"{\"analyzer\":\"my_analyzer\",\"text\":\"indexing\"}"
and it gave the result:
{
  "tokens" : [
    {
      "token" : "index",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
  ]
}
But after uploading the JSON into the index, when I tried searching for "index" it returned 0 results.
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
Any help would be much appreciated. Thank you in advance.
The stemmer is working. Try the following:
Mapping:
curl -X DELETE "localhost:9200/myindex"
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d'
{
  "settings":{
    "analysis":{
      "analyzer":{
        "english_exact":{
          "tokenizer":"standard",
          "filter":[
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings":{
    "properties":{
      "DocumentName":{
        "type":"text"
      },
      "DocumentId":{
        "type":"keyword"
      },
      "Content":{
        "properties":{
          "PageNo":{
            "type":"integer"
          },
          "Text":{
            "type":"text",
            "analyzer":"english",
            "fields":{
              "exact":{
                "type":"text",
                "analyzer":"english_exact"
              }
            }
          }
        }
      }
    }
  }
}'
Data:
curl -XPOST "localhost:9200/myindex/_doc/1" -H "Content-Type: application/json" -d'
{
  "DocumentName":"es",
  "DocumentId":"2",
  "Content":[
    {
      "PageNo":1,
      "Text":"The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."
    },
    {
      "PageNo":2,
      "Text":"The query string is processed using the same analyzer that was applied to the field during indexing."
    }
  ]
}'
Query:
curl -XGET 'localhost:9200/myindex/_search?pretty' -H "Content-Type: application/json" -d '
{
  "query":{
    "simple_query_string":{
      "fields":[
        "Content.Text"
      ],
      "query":"index"
    }
  }
}'
Exactly one document is returned - as expected. I've also tested the following stems, they all worked correctly with the proposed mapping: apply (applied), texts (text), use (using).
Python example:
import requests
from elasticsearch import Elasticsearch
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
print(res)
Tested on Elasticsearch 7.4.