Avoid unboxing for vectors with 1 value

For an API I wish to push data to, I need to prevent specific values from being unboxed.
Consider the following input:
library(jsonlite)
library(rlist)  # provides list.append(), used below

lsA <- list(propertyName = "listA",
            Values = c("x"))
lsB <- list(propertyName = "listB",
            Values = c("a", "b", "c"))
lsC <- list(propertyName = "listC",
            min = 1,
            max = 3)
I want my output to be like this:
[
{
"propertyName": "listA",
"Values": ["x"]
},
{
"propertyName": "listB",
"Values": ["a", "b", "c"]
},
{
"propertyName": "listC",
"min": 1,
"max": 3
}
]
However, when I do this:
lsTest <- list()
lsTest <- list.append(lsTest,I(lsA),lsB,lsC)
jsonTest <- jsonlite::toJSON(lsTest,auto_unbox = TRUE, pretty = TRUE)
jsonTest
I'm getting this (notice the unboxed value for listA):
[
{
"propertyName": "listA",
"Values": "x"
},
{
"propertyName": "listB",
"Values": ["a", "b", "c"]
},
{
"propertyName": "listC",
"min": 1,
"max": 3
}
]
How can I prevent specific one-element vectors from being unboxed during the toJSON conversion?
EDIT: cwthom kindly resolved it: just change c("x") to list("x"). This also works for lists with multiple items; it only adds some extra line breaks to the pretty-printed output, which appear to be cosmetic and had no negative impact on the end result on my end.
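A minimal sketch of two ways to keep a one-element vector boxed under auto_unbox = TRUE, assuming a reasonably recent jsonlite. The first is the list("x") change from the edit above; the second wraps the vector in I(), which the jsonlite documentation describes as exempt from automatic unboxing. The variable names lsA_list and lsA_asis are illustrative only.

```r
library(jsonlite)

# Option 1: a one-element list is serialised as an array.
lsA_list <- list(propertyName = "listA", Values = list("x"))

# Option 2: I() marks the vector "as is", so auto_unbox leaves it boxed.
lsA_asis <- list(propertyName = "listA", Values = I("x"))

lsB <- list(propertyName = "listB", Values = c("a", "b", "c"))

json_list <- toJSON(list(lsA_list, lsB), auto_unbox = TRUE)
json_asis <- toJSON(list(lsA_asis, lsB), auto_unbox = TRUE)

json_list
# [{"propertyName":"listA","Values":["x"]},{"propertyName":"listB","Values":["a","b","c"]}]
```

Note that I() has to wrap the vector itself. Wrapping the whole list, as in I(lsA) in the question, does not protect the elements inside it.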

Related

How can I spread array elements in parent object by jq?

I want to transform the following type of JSON object
[
{
"A": "a",
"Tags": [
{ "key":"x", "value":0},
{ "key":"y", "value":1}
]
},
{...}
]
into this, with the entries of the Tags list merged into the top level of each object:
[
{
"A": "a",
"x": 0,
"y": 1
},
{...}
]
I tried to use jq, but without success.

map(. + (.Tags | from_entries) | del(.Tags))
This will map() over all the objects in the array and, for each one:
Convert .Tags to an object using from_entries
Add that object to the original object (. + (...))
Delete the original .Tags
Output:
[
{
"A": "a",
"x": 0,
"y": 1
}
]
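For readers who want the same reshaping in R rather than jq, here is a rough equivalent using jsonlite. It is only a sketch: the helper name spread_tags is hypothetical, and it assumes every object has a Tags array whose entries carry "key" and "value" fields, as in the question.

```r
library(jsonlite)

input <- '[{"A":"a","Tags":[{"key":"x","value":0},{"key":"y","value":1}]}]'

# Parse without simplification so each object stays a plain nested list.
objs <- fromJSON(input, simplifyVector = FALSE)

# Hypothetical helper: merge the key/value pairs of Tags into the parent object.
spread_tags <- function(obj) {
  tags <- obj$Tags
  flat <- setNames(
    lapply(tags, function(tag) tag$value),
    vapply(tags, function(tag) tag$key, character(1))
  )
  obj$Tags <- NULL  # drop the original array
  c(obj, flat)      # append the flattened entries
}

out <- toJSON(lapply(objs, spread_tags), auto_unbox = TRUE)
out
```

With the sample input this should produce a single object with the keys A, x and y, matching the jq output above.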

using toJSON: How to get an array nested in a data.frame into a JSON object

I am trying to convert a data.frame into JSON text. How can I include an array in this data.frame? That is, the variable characteristics needs to contain multiple elements. I have not managed to do this.
Here is my code:
library(dplyr)  # %>%, group_by(), mutate()
library(tidyr)  # nest()

data <- data.frame(sequenceNbr = c(1:3),
                   requestTypeCode = "MR001",
                   managementStartingDate = format(Sys.Date(), "%Y-%m-%d"),
                   TopicCode = "TH02",
                   Nbr = c("1760448", "6580364", "1391363"),
                   TypeCode = "R003008",
                   char1 = "CK001001", char2 = "CK001002",
                   stringsAsFactors = FALSE) %>%
  group_by(sequenceNbr) %>%
  nest(Assignment = c(TopicCode),
       entity = c(Nbr),
       ToControl = c(TypeCode),
       characteristics = c(char1, char2))

data <- data %>%
  mutate(Assignment = purrr::map(Assignment, as.list),
         entity = purrr::map(entity, as.list))

json <- jsonlite::toJSON(list(`DefList` = data),
                         auto_unbox = TRUE, pretty = TRUE)
This gives as output
{
"DefList": [
{
"sequenceNbr": 1,
"requestTypeCode": "MR001",
"managementStartingDate": "2020-06-16",
"Assignment": {
"TopicCode": "TH02"
},
"entity": {
"Nbr": "1760448"
},
"ToControl": [
{
"TypeCode": "R003008"
}
],
"characteristics": [
{
"char1": "CK001001",
"char2": "CK001002"
}
]
},
{
"sequenceNbr": 2,
"requestTypeCode": "MR001",
"managementStartingDate": "2020-06-16",
"Assignment": {
"TopicCode": "TH02"
},
"entity": {
"Nbr": "6580364"
},
"ToControl": [
{
"TypeCode": "R003008"
}
],
"characteristics": [
{
"char1": "CK001001",
"char2": "CK001002"
}
]
}
]
}
What I want as output is this (there is a difference in the characteristics-element):
{
"DefList": [
{
"sequenceNbr": 1,
"requestTypeCode": "MR001",
"managementStartingDate": "2020-06-16",
"Assignment": {
"TopicCode": "TH02"
},
"entity": {
"Nbr": "1760448"
},
"ToControl": [
{
"TypeCode": "R003008"
}
],
"characteristics": [
"CK001001",
"CK001002"
]
},
{
"sequenceNbr": 2,
"requestTypeCode": "MR001",
"managementStartingDate": "2020-06-16",
"Assignment": {
"TopicCode": "TH02"
},
"entity": {
"Nbr": "6580364"
},
"ToControl": [
{
"TypeCode": "R003008"
}
],
"characteristics": [
"CK001001",
"CK001002"
]
}
]
}
It should also work when characteristics contains only one value. Any ideas? I think the data.frame structure is the limiting factor, since it cannot contain an array. I see in this post: https://blog.exploratory.io/working-with-json-data-in-very-simple-way-ad7ebcc0bb89 that this is correct JSON syntax, but I have not managed to produce it. Thanks in advance for your help.
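One possible direction, offered as a sketch rather than a confirmed answer: a data.frame can hold a list column, and if each cell of that column is an unnamed list of strings, jsonlite should write it as a JSON array even when it has a single element (the same list("x") idea as in the first question). The example below is a reduced, hand-built stand-in for the data in the question, using base R and jsonlite only.

```r
library(jsonlite)

# Reduced stand-in for the question's data: two rows, one list column.
df <- data.frame(sequenceNbr = 1:2, stringsAsFactors = FALSE)

# Each cell is an unnamed list, so it is serialised as an array.
# The second row deliberately holds a single value.
df$characteristics <- list(
  list("CK001001", "CK001002"),
  list("CK001001")
)

json <- toJSON(list(DefList = df), auto_unbox = TRUE, pretty = TRUE)
json
```

In the nested pipeline from the question, the equivalent step would be to map each nested characteristics tibble to an unnamed list of its values, for example with as.list(unlist(x, use.names = FALSE)) inside purrr::map(); that adaptation is untested here.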

toJSON without outer square brackets

I am converting a nested list of a specific structure (required by an API) to JSON using toJSON() in jsonlite. However, I need the final JSON to not contain the outer square brackets (also required by an API).
test_list <- list(
list(formName = "test_title", user_id = "test_userid", rows = list(list(row = 0))),
list(formName = "test_title2", user_id = "test_userid2", rows = list(list(row = 0)))
)
jsonlite::toJSON(test_list, pretty = TRUE, auto_unbox = TRUE)
Which gives:
[
{
"formName": "test_title",
"user_id": "test_userid",
"rows": [
{
"row": 0
}
]
},
{
"formName": "test_title2",
"user_id": "test_userid2",
"rows": [
{
"row": 0
}
]
}
]
But I need to remove the first and last square bracket.
I can use purrr::flatten() to remove the top level of the list and thus the square brackets in the JSON, but then toJSON() doesn't seem to like that the list has duplicate names, and renames them to name.1, name.2, name.3 etc (which also isn't allowed by the API).
That is:
jsonlite::toJSON(test_list %>% purrr::flatten(), pretty = TRUE, auto_unbox = TRUE)
Which removes the outer square brackets, but converts the names in the second element to formName.1, user_id.1, rows.1, like so:
{
"formName": "test_title",
"user_id": "test_userid",
"rows": [
{
"row": 0
}
],
"formName.1": "test_title2",
"user_id.1": "test_userid2",
"rows.1": [
{
"row": 0
}
]
}
But this is what I need:
{
"formName": "test_title",
"user_id": "test_userid",
"rows": [
{
"row": 0
}
],
"formName": "test_title2",
"user_id": "test_userid2",
"rows": [
{
"row": 0
}
]
}
That is, formName, user_id and rows in the second JSON element do not have ".1" appended.
Any advice would be greatly appreciated!

This seems to work (ugly but simple):
df = data.frame(a=1,b=2)
js = toJSON(unbox(fromJSON(toJSON(df))))
> js
{"a":1,"b":2}
For some reason, auto_unbox = TRUE does not work:
> toJSON(df,auto_unbox = T)
[{"a":1,"b":2}]
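A shorter variant of the same idea, as a sketch: jsonlite's unbox() is documented to accept a data frame with exactly one row, so the round trip through fromJSON() should not be needed. auto_unbox only affects atomic vectors of length 1, while a data frame is always written as an array of rows, which is why the brackets remain in the call above.

```r
library(jsonlite)

df <- data.frame(a = 1, b = 2)

# A one-row data frame can be marked as a singleton directly.
js <- toJSON(unbox(df))
js
# {"a":1,"b":2}
```

This only applies to one-row data frames; unbox() raises an error for data frames with more rows.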

Just edit the JSON. You can do it with gsub, or with stringr. If you use stringr functions, the result will lose its "json" class, but you can put it back:
> x = jsonlite::toJSON(test_list %>% purrr::flatten(), pretty = TRUE, auto_unbox = TRUE)
> gsub("user_id\\.1", "user_id", x)
{
"formName": "test_title",
"user_id": "test_userid",
"rows": [
{
"row": 0
}
],
"formName.1": "test_title2",
"user_id": "test_userid2",
"rows.1": [
{
"row": 0
}
]
}
> y = stringr::str_replace_all(x, "user_id\\.1", "user_id")
> class(y) = "json"
> y
{
"formName": "test_title",
"user_id": "test_userid",
"rows": [
{
"row": 0
}
],
"formName.1": "test_title2",
"user_id": "test_userid2",
"rows.1": [
{
"row": 0
}
]
}
I'll leave it to you to write appropriate regex to make the substitutions you want.
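As one possible starting point for that regex, the sketch below strips a trailing ".<digits>" from every key in one pass. It works on a short hand-written JSON string rather than real toJSON() output, the helper name strip_suffix is hypothetical, and it assumes no legitimate key ends in a dot followed by digits.

```r
# Hand-written sample in the shape produced by the flatten() approach.
x <- '{"formName": "test_title", "formName.1": "test_title2", "rows.1": [{"row": 0}]}'

# Hypothetical helper: remove a ".<digits>" suffix from any quoted key,
# i.e. a quoted name that is immediately followed by a colon.
strip_suffix <- function(json_text) {
  gsub('"([^"]+)\\.[0-9]+":', '"\\1":', json_text)
}

y <- strip_suffix(x)
class(y) <- "json"  # restore the class that the string functions drop
y
```

Because the pattern requires the closing quote to be followed by a colon, string values such as "test_title2" are left alone.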

Elastic : make a light count query (vs search query)

I am accessing bulk data in Elasticsearch through R. For analytics purposes I need to query data over a relatively long period (say a month). The data for a month is approximately 4.5 million rows, and R runs out of memory.
Sample data is below (for 1 day):
library(elastic)  # connect(), Search()

dt <- as.Date("2015-09-01", "%Y-%m-%d")
frmdt <- strftime(dt, "%Y-%m-%d")
todt <- as.Date(dt + 1)
todt <- strftime(todt, "%Y-%m-%d")
connect(es_base = "http://xx.yy.zzz.kk")
start_date <- as.integer(as.POSIXct(frmdt)) * 1000
end_date <- as.integer(as.POSIXct(todt)) * 1000
query <- sprintf('{"query":{"range":{"time":{"gte":"%s","lte":"%s"}}}}', start_date, end_date)
s_list <- elastic::Search(index = "organised_2015_09", type = "PROPERTY_SEARCH", body = query,
                          fields = c("trackId", "time"), size = 1000000)$hits$hits
length(s_list)
[1] 144612
This result for 1 day has 144k records and is 222 MB. Sample list item below:
> s_list[[1]]
$`_index`
[1] "organised_2015_09"
$`_type`
[1] "PROPERTY_SEARCH"
$`_id`
[1] "1441122918941"
$`_version`
[1] 1
$`_score`
[1] 1
$fields
$fields$time
$fields$time[[1]]
[1] 1441122918941
$fields$trackId
$fields$trackId[[1]]
[1] "fd4b4ce88101e58623ba9e6e31971d1f"
Actually, a summary count of the number of items by "trackId" and "time" (summarized per day) would suffice for analytics purposes. Hence I tried to transform this into a count query with aggregations, and constructed the query below:
query <- '{"size" : 0,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"range": {
"time": {
"gte": 1441045800000,
"lte": 1443551400000
}
}
}
}
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "time",
"interval": "day",
"time_zone": "+05:30"
},
"aggs": {
"group_by_state": {
"terms": {
"field": "trackId",
"size": 0
}
}
}
}
}
}'
response <- elastic::Search(index="organised_recent",type="PROPERTY_SEARCH",body=query, search_type="count")
However, I did not gain in speed or document size. I think I am missing something, but I am not sure what.
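To illustrate the per-day summary the aggregation is meant to deliver, here is a sketch that flattens nested buckets into one row per day and trackId. The response object is a hand-written stand-in, not real output: its field names (aggregations, buckets, key_as_string, key, doc_count) follow the usual Elasticsearch aggregation response layout, and the values and the helper name flatten_day are made up for illustration. It does not address the speed question.

```r
# Hypothetical, hand-written stand-in for the parsed list that
# elastic::Search() would return for the aggregation query.
response <- list(
  aggregations = list(
    articles_over_time = list(
      buckets = list(
        list(key_as_string = "2015-09-01", doc_count = 3,
             group_by_state = list(buckets = list(
               list(key = "track_a", doc_count = 2),
               list(key = "track_b", doc_count = 1)
             ))),
        list(key_as_string = "2015-09-02", doc_count = 1,
             group_by_state = list(buckets = list(
               list(key = "track_a", doc_count = 1)
             )))
      )
    )
  )
)

# Hypothetical helper: one data.frame row per trackId within a day bucket.
flatten_day <- function(day) {
  terms <- day$group_by_state$buckets
  data.frame(
    day     = rep(day$key_as_string, length(terms)),
    trackId = vapply(terms, function(b) b$key, character(1)),
    count   = vapply(terms, function(b) b$doc_count, numeric(1)),
    stringsAsFactors = FALSE
  )
}

day_buckets <- response$aggregations$articles_over_time$buckets
counts <- do.call(rbind, lapply(day_buckets, flatten_day))
counts
```

The resulting data.frame holds only the counts, so its size depends on the number of distinct day and trackId pairs rather than on the number of matching documents.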

quoting unquoted members in nested list

Using R, I generate a list that contains certain unquoted elements. See the sample at the bottom: it is invalid JavaScript code.
R code (does not work)
outq <- lapply (out, function (el){
el <- if( is.factor(el$ann) ){
el$ann <- apply(el$ann, 1, function(e){ e <- paste('"', e, '"', sep="") })
}
})
In R, how can I quote the members of the list$x$ann factor?
When I try to parse this JSON, json2.js fails.
Sample Data (Invalid JSON)
results = JSON.parse(
{
"result": {
"tot": {
"molal": [ 0.00071243, 0.00071243, 4, 4 ],
"ann": [ , , , ]
},
"desc": {
"val": [ 8.3486, 4, 0.8531, 4.0025, 0.99999, 0.00072541, 0.00071243, 100, -1.2983e-05, -0.00016223, 17, 111.02, 55.511 ],
"ann": [ Charge balance, Adjusted to redox eq, , , , , , , , , , , ]
},
"species": {
"molal": [ 55.508, 0.00029101, 2.3071e-09, 0.00042017, 0.00028731, 4.4532e-06, 4.9292e-07, 0.00069149, 1.0274e-05, 6.2142e-06, 4.9139e-12, 4, 0, 4.1166e-27, 4, 8.5144e-21 ],
"act": [ 0.8531, 0.00010921, 4.4812e-09, 1.4857e-06, 7.7889e-05, 4.4532e-06, 9.6777e-07, 0.00024834, 3.3916e-06, 0.00028204, 4.9139e-12, 2.2702, 0, 4.1166e-27, 3.7925, 1.8453e-20 ]
},
"master": {
"molal": [ 0.00071243, 0.00071243, 4, 8.2332e-27, 4, 1.7029e-20 ]
},
"pphases": {
"moles": 9.9993,
"delta": -0.00071243
},
"ListInfo": {
"n": 1,
"format": true
}
}
}
);

Is there a reason you cannot use the rjson / RJSONIO packages for R?

I have advanced a step forward using this R code:
outq <- lapply(out, function(el) {
  el <- if (is.factor(el$ann)) {
    el$ann <- lapply(el$ann, function(e) {
      e <- paste('"', e, '"', sep = "")
    })
  } else {
    el
  }
})
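Following the suggestion above to use a JSON package instead of quoting by hand, here is a sketch with jsonlite (a different package from the rjson / RJSONIO ones named in the answer). By default jsonlite writes factors as quoted strings, so no manual paste() step should be needed. The out list below is a hypothetical miniature of the real data.

```r
library(jsonlite)

# Hypothetical miniature of the `out` list from the question:
# a numeric vector plus a factor of annotations, some of them empty.
out <- list(
  result = list(
    desc = list(
      val = c(8.3486, 4),
      ann = factor(c("Charge balance", ""))
    )
  )
)

# Factors are serialised as strings, including the empty ones.
json <- toJSON(out)

# Round trip to check that the annotations survive as character data.
parsed <- fromJSON(json)
parsed$result$desc$ann
```

The JSON text in json can then be passed to JSON.parse on the JavaScript side as a string.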
