Why doesn't Json.Decode.Pipeline.hardcoded require a field name?

Take the example from here: https://tech.chefclub.tv/en/how-to-decode-complex-json-with-elm
type alias Record =
    { uid : String
    , age : Maybe Int
    , version : Float
    }

customDecoder : JD.Decoder Record
customDecoder =
    decode Record
        |> required "uid" JD.string
        |> optional "age" (JD.maybe JD.int) Nothing
        |> hardcoded 1.0
I'm wondering why field names sometimes appear in the decoder and sometimes don't.
If the field name is there for readability, why isn't the usage here
customDecoder =
    decode Record
        |> required "uid" JD.string
        |> optional "age" (JD.maybe JD.int) Nothing
        |> hardcoded "version" 1.0
If parameter order alone is enough, then why isn't the usage here
customDecoder =
    decode Record
        |> required JD.string
        |> optional (JD.maybe JD.int) Nothing
        |> hardcoded 1.0
I'm wondering if there's an API design inconsistency in the JSON decoder here...?
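One way to think about the asymmetry: required and optional take a field name because they read that named field out of the JSON, whereas hardcoded never inspects the JSON at all; it just supplies a constant for the next constructor argument, so there is no field to name. A rough Python analogy (hypothetical helpers, not the Elm library):

import json

# A sketch of the idea behind the pipeline: each step supplies the next
# constructor argument. Steps that read the JSON need a field name;
# a hardcoded constant does not.
def required(name, data):
    return data[name]               # reads a named field, so it needs the name

def optional(name, data, default):
    return data.get(name, default)  # reads a named field, with a fallback

def hardcoded(value, data):
    return value                    # ignores the JSON entirely, nothing to name

data = json.loads('{"uid": "abc-123", "age": 42}')
record = {
    "uid": required("uid", data),
    "age": optional("age", data, None),
    "version": hardcoded(1.0, data),
}
print(record)  # {'uid': 'abc-123', 'age': 42, 'version': 1.0}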

Related

R json mongodb query $in operator syntax error due to double quotes?

I'm building a json query to pass to a mongodb database in R.
In one scenario, I have a vector of dates and I want to query the database to return all records which have a date in the relevant field that matches a date in my vector of dates.
The second scenario is the same as the first, but this time I have a vector of character strings (IDs) and need to return all the records with matching IDs.
I understood the correct way to do this in a json query is to use the $in operator, and then put my vector in an array.
However, when I pass the query to my mongodb database, the exportLogId returns NULL. I'm quite sure that the problem is something to do with how I am representing the $in operator in the final query, since I have very similarly structured queries without the $in operator and they are all working. If I look for just one of my target dates or character strings, I get the desired result.
I followed the mongodb manual here to construct my query, and the only issue I can see is that the $in operator in the output of jsonlite::toJSON() is enclosed in double quotes; whereas I think it might need to be in single quotes (or no quotes at all, but I don't know how to write the syntax for that).
I'm creating my query in two steps:
1. Create the query as a series of nested lists.
2. Convert the list object to JSON with jsonlite::toJSON().
Here is my code:
# Load libraries:
library(jsonlite)

# Create list of example dates to query in mongodb format:
sampledates <- c("2022-08-11T00:00:00.000Z",
                 "2022-08-15T00:00:00.000Z",
                 "2022-08-16T00:00:00.000Z",
                 "2022-08-17T00:00:00.000Z",
                 "2022-08-19T00:00:00.000Z")

# Create query as a list object:
query_list_l <- list(filter =
                       # Add where clause:
                       list(where =
                              # Filter results by list of sample dates:
                              list(dateSampleTaken = list('$in' = sampledates),
                                   # Define format of column names and values:
                                   useDbColumns = "true",
                                   dontTranslateValues = "true",
                                   jsonReplaceUndefinedWithNull = "true"),
                            # Define columns to return:
                            fields = c("id",
                                       "updatedAt",
                                       "person.visualId",
                                       "labName",
                                       "sampleIdentifier",
                                       "dateSampleTaken",
                                       "sequence.hasSequence")))

# Convert list object to JSON:
query_json <- jsonlite::toJSON(x = query_list_l,
                               pretty = TRUE,
                               auto_unbox = TRUE)
The JSON query now looks like this:
> query_json
{
  "filter": {
    "where": {
      "dateSampleTaken": {
        "$in": ["2022-08-11T00:00:00.000Z", "2022-08-15T00:00:00.000Z", "2022-08-16T00:00:00.000Z", "2022-08-17T00:00:00.000Z", "2022-08-19T00:00:00.000Z"]
      },
      "useDbColumns": "true",
      "dontTranslateValues": "true",
      "jsonReplaceUndefinedWithNull": "true"
    },
    "fields": ["id", "updatedAt", "person.visualId", "labName", "sampleIdentifier", "dateSampleTaken", "sequence.hasSequence"]
  }
}
As you can see, $in is now enclosed in double quotes, even though I put it in single quotes when I created the query as a list object. I have tried replacing the quotes with sprintf(), but that just adds a lot of backslashes to my query. I also tried:
query_fixed <- gsub(pattern = "\\"\\$\\in\\"",
                    replacement = "\\'$in\\'",
                    x = query_json)
... but this fails with an error.
I would be very grateful to know:
1. Is the syntax problem that is preventing $in from working actually the double quotes?
2. If the double quotes are the problem, how do I replace them with single quotes without messing up the JSON format?
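One sanity check on question 1: the JSON grammar only permits double-quoted keys, so toJSON() emitting "$in" in double quotes is valid JSON, not a syntax error. A short Python sketch illustrating this:

import json

# The same $in filter built natively; json.dumps always emits double-quoted
# keys because the JSON grammar allows no other quoting.
query = {"dateSampleTaken": {"$in": ["2022-08-11T00:00:00.000Z",
                                     "2022-08-15T00:00:00.000Z"]}}
print(json.dumps(query))

# A single-quoted key is rejected outright; it is not valid JSON at all:
try:
    json.loads("{'$in': []}")
except json.JSONDecodeError as e:
    print("invalid JSON:", e)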
UPDATE:
The issue seems to occur when R is passing the query to the database, but I still can't work out exactly why.
If I try the query out in the loopback explorer in the database, it works, and using the export log ID it produces, I can then fetch the results with httr::GET() in R. Example query results are shown below (sorry for the hashes; the main point is that you can see the format of the returned values):
[1] "[{\"_id\":\"e59953b6-a106-4b69-9e25-1c54eef5264a\",\"updatedAt\":\"2022-09-12T20:08:39.554Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0044-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0002\"}},{\"_id\":\"af5cd9cc-4813-4194-b60b-7d130bae47bc\",\"updatedAt\":\"2022-09-12T20:11:07.467Z\",\"dateSampleTaken\":\"2022-08-17T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0061-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0003\"}},{\"_id\":\"b5930079-8d57-43a8-85c0-c95f7e0338d9\",\"updatedAt\":\"2022-09-12T20:13:54.378Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0043-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0004\"}}]"

Invalid type for parameter error when using put_item dynamodb

I want to write the data in a dataframe to a DynamoDB table:
item = {}
for row in datasource_archived_df_join_repartition.rdd.collect():
    item['x'] = row.x
    item['y'] = row.y
    client.put_item(TableName='tryfail',
                    Item=item)
but I'm getting this error:
Invalid type for parameter Item.x, value: 478.2, type: <type 'float'>, valid types: <type 'dict'>
Invalid type for parameter Item.y, value: 696- 18C 12, type: <type 'unicode'>, valid types: <type 'dict'>
Old question, but it still comes up high in search results and hasn't been answered properly, so here we go.
When putting an item into a DynamoDB table, it must be a dictionary in a particular nested form that indicates to the database engine the data type of the value for each attribute. The form looks like the example below. The way to think of this is that an AttributeValue is not a bare value but a combination of that value and its type. For example, an AttributeValue for the AlbumTitle attribute below is the dict {'S': 'Somewhat Famous'}, where the 'S' indicates a string type.
response = client.put_item(
    TableName='Music',
    Item={
        'AlbumTitle': {             # <-- Attribute
            'S': 'Somewhat Famous'  # <-- AttributeValue with type string ('S')
        },
        'Artist': {
            'S': 'No One You Know'
        },
        'SongTitle': {
            'S': 'Call Me Today'
        },
        'Year': {
            'N': '2021'             # <-- Note that numeric values are supplied as strings
        }
    }
)
In your case (assuming x and y are numbers) you might want something like this:
for row in datasource_archived_df_join_repartition.rdd.collect():
    item = {
        'x': {'N': str(row.x)},
        'y': {'N': str(row.y)}
    }
    client.put_item(TableName='tryfail', Item=item)
Two things to note here: first, each item corresponds to a row, so if you are putting items in a loop you must instantiate a new one with each iteration. Second, regarding the conversion of the numeric x and y into strings, the DynamoDB docs explain that the reason the AttributeValue dict requires this is "to maximize compatibility across languages and libraries. However, DynamoDB treats them as number type attributes for mathematical operations." For fuller documentation on the type system for DynamoDB take a look at this or read the Boto3 doc here since you are using Python.
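As an aside, the higher-level resource interface does this marshalling for you; a minimal sketch (same hypothetical table name as the question, and note that boto3 requires floats to be passed as Decimal):

from decimal import Decimal
import boto3

# The Table resource converts native Python types to AttributeValues itself,
# so no {'N': ...} / {'S': ...} wrappers are needed. Floats must be Decimal.
table = boto3.resource('dynamodb').Table('tryfail')
table.put_item(Item={'x': Decimal('478.2'), 'y': '696- 18C 12'})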
The error message is indicating you are using the wrong type: you need to supply a dictionary when assigning values to item['x'] and item['y'], with the DynamoDB type descriptor as the key, e.g.
item['x'] = {'N': str(row.x)}
item['y'] = {'S': row.y}

Passing list of search string in contains in FilterExpression

Is there any way to pass a list of search strings in the contains() method of FilterExpression in DynamoDb?
Something like below:
search_str = ['value-1', 'value-2', 'value-3']
result = kb_table.scan(
    FilterExpression="contains (title, :titleVal)",
    ExpressionAttributeValues={":titleVal": search_str}
)
For now I can only think of looping through the list and scanning the table multiple times (as in the code below), but I think it will be resource heavy.
for item in search_str:
    result += kb_table.scan(
        FilterExpression="contains (title, :titleVal)",
        ExpressionAttributeValues={":titleVal": item}
    )
Any suggestions?
For the above scenario, CONTAINS should be used with an OR condition. When you give an array as input to CONTAINS, DynamoDB checks it against set attributes ("SS", "NS", or "BS"); it does not look for a substring within a string attribute.
If the target attribute of the comparison is of type String, then the operator checks for a substring match. If the target attribute of the comparison is of type Binary, then the operator looks for a subsequence of the target that matches the input. If the target attribute of the comparison is a set ("SS", "NS", or "BS"), then the operator evaluates to true if it finds an exact match with any member of the set.
Example:
from boto3.dynamodb.conditions import Attr

movies1 = "MyMovie"
movies2 = "Big New"
fe1 = Attr('title').contains(movies1)
fe2 = Attr('title').contains(movies2)
response = table.scan(
    FilterExpression=fe1 | fe2  # use '|' to OR conditions; Python's 'or' would just return fe1
)
A little bit late, but to help people find a solution, here is my method.
Let's assume that in your DB you have an attribute called 'EMAIL' and you want to filter your scan on this EMAIL with a list of values. You can proceed as follows.
import boto3
from boto3.dynamodb.conditions import Attr

list_of_elem = ['mail1@mail.com', 'mail2@mail.com', 'mail3@mail.com']

# Set an empty string to build up your query:
stringquery = ""
# Loop over each element in your list:
for index, value in enumerate(list_of_elem):
    # Add a contains() clause for this mail value:
    stringquery = stringquery + f"Attr('EMAIL').contains('{value}')"
    # While the value is not the last element in the list, add the OR operator:
    if index < len(list_of_elem) - 1:
        stringquery = stringquery + ' | '

dynamodb = boto3.resource('dynamodb')
tableUser = dynamodb.Table('mytable')
# Use eval() on the query string to parse it as a filter expression:
tableUser.scan(
    FilterExpression=eval(stringquery)
)
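If you would rather avoid eval(), the same OR chain can be composed directly from condition objects; a short sketch under the same assumptions (table named 'mytable', attribute 'EMAIL'):

from functools import reduce
from operator import or_

import boto3
from boto3.dynamodb.conditions import Attr

list_of_elem = ['mail1@mail.com', 'mail2@mail.com', 'mail3@mail.com']

# Build one contains() condition per value, then OR them all together;
# or_(a, b) is just a | b, which boto3 condition objects support.
condition = reduce(or_, (Attr('EMAIL').contains(v) for v in list_of_elem))

tableUser = boto3.resource('dynamodb').Table('mytable')
response = tableUser.scan(FilterExpression=condition)

This builds the same composite filter without parsing strings at runtime, so the values never need to be escaped into source code.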

using dictionaries in swi-prolog

I'm working on a simple web service in Prolog and wanted to respond to my users with data formatted as JSON. A nice facility is reply_json_dict/1, which takes a dictionary and converts it into an HTTP response with a well-formatted JSON body.
My trouble is that building the response dictionary itself seems a little cumbersome. For example, when I return some data, I have a data id but may or may not have data properties (possibly an unbound variable). At the moment I do the following:
OutDict0 = _{ id : DataId },
( nonvar(Props) -> OutDict1 = OutDict0.put(_{ attributes : Props }) ; OutDict1 = OutDict0 ),
reply_json_dict(OutDict1)
Which works fine, so the output is { "id" : "III" } or { "id" : "III", "attributes" : "AAA" } depending on whether or not Props is bound, but... I'm looking for an easier approach, primarily because if I need to add more optional key/value pairs, I end up with multiple implications like:
OutDict0 = _{ id : DataId },
( nonvar(Props) -> OutDict1 = OutDict0.put(_{ attributes : Props }) ; OutDict1 = OutDict0 ),
( nonvar(Time) -> OutDict2 = OutDict1.put(_{ time : Time }) ; OutDict2 = OutDict1 ),
( nonvar(UserName) -> OutDict3 = OutDict2.put(_{ userName : UserName }) ; OutDict3 = OutDict2 ),
reply_json_dict(OutDict3)
And that seems just wrong. Is there a simpler way?
Cheers,
Jacek
Instead of messing with dictionaries, my recommendation in this case is to use a different predicate to emit JSON.
For example, consider json_write/2, which lets you emit JSON, also on current output as the HTTP libraries require.
Suppose your representation of data fields is the common Name(Value) notation that is used throughout the HTTP libraries for option processing:
Fields0 = [attributes(Props),time(Time),userName(UserName)],
Using the meta-predicate include/3, your whole example becomes:
main :-
    Fields0 = [id(DataId),attributes(Props),time(Time),userName(UserName)],
    include(ground, Fields0, Fields),
    json_write(current_output, json(Fields)).
You can try it out yourself, by plugging in suitable values for the individual elements that are singleton variables in the snippet above.
For example, we can (arbitrarily) use:
Fields0 = [id(i9),attributes(_),time('12:00'),userName(_)],
yielding:
?- main.
{"id":"i9", "time":"12:00"}
true.
You only need to emit the suitable Content-Type header, and you have the same output that reply_json_dict/1 would have given you.
You can do it in one step if you use a list to represent all values that need to go into the dict.
?- Props = [a,b,c], get_time(Time),
   D0 = _{id:001},
   include(ground, [props:Props,time:Time,user:UserName], Fs),
   D = D0.put(Fs).
D0 = _17726{id:1},
Fs = [props:[a, b, c], time:1477557597.205908],
D = _17726{id:1, props:[a, b, c], time:1477557597.205908}.
This borrows the idea in mat's answer to use include(ground).
Many thanks mat and Boris for suggestions! I ended up with a combination of your ideas:
dict_filter_vars(DictIn, DictOut) :-
    findall(Key=Value, (get_dict(Key, DictIn, Value), nonvar(Value)), Pairs),
    dict_create(DictOut, _, Pairs).
Which I can then use as simply as this:
DictWithVars = _{ id : DataId, attributes : Props, time : Time, userName : UserName },
dict_filter_vars(DictWithVars, DictOut),
reply_json_dict(DictOut)
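For readers coming from other languages, the same "drop the unbound fields, then serialize" pattern is easy to mirror; a minimal Python sketch (hypothetical field values, with None standing in for an unbound variable):

import json

def reply_json(fields: dict) -> str:
    # Keep only the keys whose values are actually bound (non-None here),
    # mirroring include(ground, ...) / the nonvar(Value) filter above.
    return json.dumps({k: v for k, v in fields.items() if v is not None})

print(reply_json({"id": "i9", "attributes": None, "time": "12:00"}))
# {"id": "i9", "time": "12:00"}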

How do you remove the padding from Modifed Base 64 for URL?

This Wiki article on Base64 URL says
"For this reason, a modified Base64 for URL variant exists, where no padding '=' will be used, and the '+' and '/' characters of standard Base64 are respectively replaced by '-' and '_', so that using URL encoders/decoders is no longer necessary and has no impact on the length of the encoded value, leaving the same encoded form intact for use in relational databases, web forms, and object identifiers in general."
When I try and remove the padding using ASP.NET, I get an error when I get my query strings back. How can I account for the missing padding?
string encoded = GetBase64FromQueryString();
encoded = encoded.PadRight(NextMultiple(encoded.Length, 4), '=');
...

static int NextMultiple(int value, int multiple)
{
    int r = value % multiple;
    return value + (r != 0 ? multiple - r : 0);
}
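For comparison, the same round-up-to-a-multiple-of-4 idea in a short Python sketch using only the standard library ('-len(s) % 4' computes how many '=' characters were stripped):

import base64

def b64url_decode(s: str) -> bytes:
    # Restore the stripped '=' padding so the length is a multiple of 4,
    # then decode with the URL-safe alphabet ('-' and '_').
    return base64.urlsafe_b64decode(s + '=' * (-len(s) % 4))

print(b64url_decode('aGVsbG8'))  # b'hello' (unpadded input, one '=' restored)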
