Invalid type for parameter error when using put_item dynamodb - amazon-dynamodb

I want to write the data in a DataFrame to a DynamoDB table:
    item = {}
    for row in datasource_archived_df_join_repartition.rdd.collect():
        item['x'] = row.x
        item['y'] = row.y
        client.put_item(TableName='tryfail',
                        Item=item)
but I'm getting this error:
Invalid type for parameter Item.x, value: 478.2, type: <type 'float'>, valid types: <type 'dict'>
Invalid type for parameter Item.y, value: 696- 18C 12, type: <type 'unicode'>, valid types: <type 'dict'>

Old question, but it still comes up high in a search and hasn't been answered properly, so here we go.
When putting an item in a DynamoDB table it must be a dictionary in a particular nested form that indicates to the database engine the data type of the value for each attribute. The form looks like below. The way to think of this is that an AttributeValue is not a bare variable value but a combination of that value and its type. For example, an AttributeValue for the AlbumTitle attribute below is the dict {'S': 'Somewhat Famous'} where the 'S' indicates a string type.
    response = client.put_item(
        TableName='Music',
        Item={
            'AlbumTitle': {  # <-------------- Attribute
                'S': 'Somewhat Famous',  # <-- Attribute Value with type string ('S')
            },
            'Artist': {
                'S': 'No One You Know',
            },
            'SongTitle': {
                'S': 'Call Me Today',
            },
            'Year': {
                'N': '2021'  # <----------- Note that numeric values are supplied as strings
            }
        }
    )
In your case (assuming x and y are numbers) you might want something like this:
    for row in datasource_archived_df_join_repartition.rdd.collect():
        item = {
            'x': {'N': str(row.x)},
            'y': {'N': str(row.y)}
        }
        client.put_item(TableName='tryfail', Item=item)
Two things to note here. First, each item corresponds to a row, so if you are putting items in a loop you must instantiate a new one on each iteration. Second, regarding the conversion of the numeric x and y into strings, the DynamoDB docs explain that the AttributeValue dict requires this "to maximize compatibility across languages and libraries. However, DynamoDB treats them as number type attributes for mathematical operations." For fuller documentation on the DynamoDB type system, see the DynamoDB documentation on data types, or the Boto3 documentation since you are using Python.
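To make the "value plus type" idea concrete, here is a minimal sketch of a converter that wraps plain Python values in AttributeValue dicts. The helper names to_attribute_value and to_item are hypothetical and only a handful of types are covered. Boto3 also ships its own TypeSerializer in boto3.dynamodb.types, which is likely the better choice in real code; this sketch only illustrates the shape of the data.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a plain Python value in a DynamoDB AttributeValue dict (hypothetical helper)."""
    if isinstance(value, bool):
        # bool must be checked before int, because bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        # numbers are sent as strings under the 'N' type descriptor
        return {"N": str(value)}
    if isinstance(value, str):
        return {"S": value}
    if value is None:
        return {"NULL": True}
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def to_item(row):
    """Convert a mapping of attribute name -> plain value into a put_item Item."""
    return {name: to_attribute_value(value) for name, value in row.items()}


item = to_item({"x": 478.2, "y": "696- 18C 12"})
print(item)  # {'x': {'N': '478.2'}, 'y': {'S': '696- 18C 12'}}
```

Note that y from the error message in the question is a string, so under this sketch it ends up as an 'S' value rather than an 'N' value.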

The error message indicates that you are passing the wrong type: each value assigned to item['x'] and item['y'] must be a dictionary keyed by a DynamoDB type descriptor, e.g.
    item['x'] = {'N': str(row.x)}
    item['y'] = {'S': row.y}

Does Boto3 DynamoDB have reserved attribute names for update_item with conditions expressions? Unexpected attribute SET behavior

I've implemented a simple object versioning scheme that allows the calling code to supply a current_version integer that will set the ConditionExpression. I've also implemented a simple timestamping scheme to set an attribute named auto_timestamp to the current unix timestamp.
When the ConditionExpression is supplied with the object's current version integer, the update occurs, but also sets auto_timestamp to the current version value, rather than the value supplied in ExpressionAttributeValues. This only occurs if the attribute names are #a0, #a1 ... and values are :v0, :v1 ...
For example, this runs as expected without the condition, and auto_timestamp is set to 1643476414 in the table. The if_not_exists is used to start the object version at 0 if the item does not yet exist or did not previously have an auto_object_version attribute.
    update_kwargs = {
        "Key": {"user_id": user_id},
        "UpdateExpression": 'SET #a0 = :v0, #a1 = if_not_exists(#a1, :zero) + :v1',
        "ExpressionAttributeNames": {"#a0": "auto_timestamp", "#a1": "auto_object_version"},
        "ExpressionAttributeValues": {":v0": 1643476414, ":v1": 1, ":zero": 0}
    }
    table.update_item(**update_kwargs)
However, this example runs without exception, but auto_timestamp is set to 1. This behavior continues for each subsequent increment of current_version for additional calls to update_item:
    from boto3.dynamodb.conditions import Attr
    update_kwargs = {
        "Key": {"user_id": user_id},
        "UpdateExpression": 'SET #a0 = :v0, #a1 = if_not_exists(#a1, :zero) + :v1',
        "ExpressionAttributeNames": {"#a0": "auto_timestamp", "#a1": "auto_object_version"},
        "ExpressionAttributeValues": {":v0": 1643476414, ":v1": 1, ":zero": 0},
        "ConditionExpression": Attr("auto_object_version").eq(1)
    }
    table.update_item(**update_kwargs)
While debugging, I changed the scheme by which I am labeling the attribute names and values to use #att instead of #a and :val instead of :v and the following works as desired and auto_timestamp is set to 1643476414:
    from boto3.dynamodb.conditions import Attr
    update_kwargs = {
        "Key": {"user_id": user_id},
        "UpdateExpression": 'SET #att0 = :val0, #att1 = if_not_exists(#att1, :zero) + :val1',
        "ExpressionAttributeNames": {"#att0": "auto_timestamp", "#att1": "auto_object_version"},
        "ExpressionAttributeValues": {":val0": 1643476414, ":val1": 1, ":zero": 0},
        "ConditionExpression": Attr("auto_object_version").eq(1)
    }
    table.update_item(**update_kwargs)
I couldn't find any documentation on reserved attribute names or values that shouldn't be used for keys in ExpressionAttributeNames or ExpressionAttributeValues.
Has anyone seen this behavior before? It is easily worked around by switching the string formatting used to generate the keys, but it was very unexpected.
There are no reserved attribute or value names, and I routinely use names like :v1 and #a1 in my own tests, and they seem to work fine.
Assuming you correctly copied-pasted your code into the question, it seems to me you simply have a syntax error in your code - you are missing a double-quote after the "auto_timestamp. What I don't understand, though, is how this compiles or why changing a to att changed anything. Please be more careful in pasting a self-contained code snippet that works or doesn't work.
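One possible explanation, offered here as an assumption rather than something the answer above confirms: when a condition object such as Attr(...).eq(1) is passed, boto3's resource layer appears to generate its own placeholders of the form #n0 and :v0 and merge them into the request, so a caller-supplied :v0 could be overwritten by the condition's value. A minimal sketch that sidesteps any such collision is to write the condition as a plain string with caller-chosen placeholders. The helper build_update_kwargs and the placeholder names #ts, #ver, :ts, :one and :expected are hypothetical.

```python
def build_update_kwargs(user_id, timestamp, expected_version):
    """Build update_item arguments using only caller-chosen placeholders."""
    names = {"#ts": "auto_timestamp", "#ver": "auto_object_version"}
    values = {
        ":ts": timestamp,
        ":one": 1,
        ":zero": 0,
        ":expected": expected_version,
    }
    return {
        "Key": {"user_id": user_id},
        "UpdateExpression": "SET #ts = :ts, #ver = if_not_exists(#ver, :zero) + :one",
        # A string condition reuses the placeholders above instead of generated ones.
        "ConditionExpression": "#ver = :expected",
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


kwargs = build_update_kwargs("user-123", 1643476414, 1)
print(kwargs["ConditionExpression"])  # #ver = :expected
# table.update_item(**kwargs)  # `table` would be a boto3 Table resource, as in the question
```

Because every placeholder is named by the caller, nothing in the request depends on how a library might number its own.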

Using Element and Split Gets First Item Rather than Last Item in Terraform

We're trying to apply a dynamic name to a firewall rule for opening 8089 and 8843 in GCP using terraform based on the list of instance group urls. Instead of taking that result and giving us the last item in the url, it gives us https:
tf:
#This is to resolve an error when deploying to nginx
resource "google_compute_firewall" "ingress" {
  for_each      = toset(google_container_cluster.standard-cluster.instance_group_urls)
  description   = "Allow traffic on ports 8843, 8089 for nginx ingress"
  direction     = "INGRESS"
  name          = element(split("/", each.key), length(each.key))
  network       = "https://www.googleapis.com/compute/v1/projects/${local.ws_vars["project-id"]}/global/networks/${local.ws_vars["environment"]}"
  priority      = 1000
  source_ranges = google_container_cluster.standard-cluster.private_cluster_config.*.master_ipv4_cidr_block
  target_tags = [
    element(split("/", each.key), length(each.key))
  ]
  allow {
    ports = [
      "8089",
    ]
    protocol = "tcp"
  }
  allow {
    ports = [
      "8443",
    ]
    protocol = "tcp"
  }
}
Result:
Error: "name" ("https:") doesn't match regexp "^(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)$"
on main.tf line 133, in resource "google_compute_firewall" "ingress":
133: name = element(split("/", each.key), length(each.key))
What is the solution here? Why is it not giving the last item in the array? Is there a better way?
Like many languages, Terraform/HCL uses zero-based indexing, so if you want the last element in an array you need to subtract one from the length, like this:
locals {
  list = ["foo", "bar", "baz"]
}
output "last_element" {
  value = element(local.list, length(local.list) - 1)
}
The element function is causing this confusion because instead of getting an out of bounds/range error when you attempt to access beyond the length of the list it wraps around and so you are getting the first element:
The index is zero-based. This function produces an error if used with
an empty list. The index must be a non-negative integer.
Use the built-in index syntax list[index] in most cases. Use this
function only for the special additional "wrap-around" behavior
described below.
To get the last element from the list use length to find the size of
the list (minus 1 as the list is zero-based) and then pick the last
element:
> element(["a", "b", "c"], length(["a", "b", "c"])-1)
c
Unfortunately, at the time of writing, Terraform doesn't currently support negative indexes in the built-in index syntax:
locals {
  list = ["foo", "bar", "baz"]
}
output "last_element" {
  value = local.list[-1]
}
throws the following error:
Error: Invalid index
on main.tf line 6, in output "last_element":
6: value = local.list[-1]
|----------------
| local.list is tuple with 3 elements
The given key does not identify an element in this collection value.
As suggested in the comments, a better approach here would be to first reverse the list and then take the first element from the reversed list using the reverse function:
output "last_element" {
  value = reverse(local.list)[0]
}
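The wrap-around behaviour of element described above can be mimicked in a few lines of Python. This is only an illustration of the indexing arithmetic, not Terraform code, and the instance group URL is a made-up example. Note also that in the question length(each.key) measures the URL string rather than the split list, so the corresponding fix there would presumably be length(split("/", each.key)) - 1.

```python
def element(items, index):
    """Mimic Terraform's element(): the index wraps around the list length."""
    if not items:
        raise ValueError("element() cannot be used with an empty list")
    return items[index % len(items)]


# Hypothetical instance group URL, for illustration only.
url = "https://www.googleapis.com/compute/v1/projects/p/zones/z/instanceGroups/gke-grp"
parts = url.split("/")

print(element(parts, len(parts)))      # https:   (index wraps around to 0)
print(element(parts, len(parts) - 1))  # gke-grp  (the last segment)
```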

How to extract a string value from an array using scripted field in kibana?

Is there a way to extract a string value from an array using an if statement in a scripted field in Kibana? I tried the code below, but I am unable to filter out the correct and incorrect values in the Discover tab. This might be because the remark field is an array.
    def result_string = "";
    if (doc['nac.keyword'].value == "existing_intent" && doc['remark.keyword'].value != "acceptable") {
        result_string = "incorrect";
    }
    if (doc['nac.keyword'].value == "existing_intent" && doc['remark.keyword'].value == "acceptable") {
        result_string = "correct";
    }
    return result_string;
You can use the contains method defined on Array to check for element membership:
!doc['remark.keyword'].value.contains("acceptable") //does not contain
For this, you might want to ensure first that doc['remark.keyword'].value is indeed an Array.

Extract values from web service JSON response with JSONPath

I have a JSON response from web service that looks something like this :
[
  {
    "id": 4,
    "sourceID": null,
    "subject": "SomeSubjectOne",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "completed"
  },
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
What I need to do is extract the subjects from all of the entities by using JSONPATH query.
How can I get these results :
Subject from the first item - SomeSubjectOne
Filter on specific subject value from all entities (SomeSubjectTwo for example)
Get Subjects from all entities
Goessner's original JSONPath article is a good reference point and all implementations more or less stick to the suggested query syntax. However, implementations like Jayway JsonPath/Java, JSONPath-Plus/JavaScript, flow-jsonpath/PHP may behave a little differently in some areas. That's why it can be important to know which implementation you are actually using.
Subject from the first item
Just use an index to select the desired array element.
$.[0].subject
Returns:
SomeSubjectOne
Specific subject value
First, descend into all elements with .., filter those that have a subject with [?(@.subject)], and compare the value with == '...'.
$..[?(@.subject == 'SomeSubjectTwo')]
Returns
[
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
Get all subjects
$.[*].subject
or simply
$..subject
Returns
[ "SomeSubjectOne", "SomeSubjectTwo" ]
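For readers who want to check what the three queries should return without committing to a particular JSONPath library, the same selections can be written as plain Python over the parsed response. This is an equivalent sketch using only the standard json module, not a JSONPath implementation, and the variable names are illustrative.

```python
import json

response_text = """
[
  {"id": 4, "sourceID": null, "subject": "SomeSubjectOne",
   "category": "SomeCategoryTwo", "impact": null, "status": "completed"},
  {"id": 12, "sourceID": null, "subject": "SomeSubjectTwo",
   "category": "SomeCategoryTwo", "impact": null, "status": "assigned"}
]
"""
entities = json.loads(response_text)

# Counterpart of $.[0].subject
first_subject = entities[0]["subject"]

# Counterpart of $..[?(@.subject == 'SomeSubjectTwo')]
matches = [e for e in entities if e.get("subject") == "SomeSubjectTwo"]

# Counterpart of $.[*].subject
all_subjects = [e["subject"] for e in entities]

print(first_subject)  # SomeSubjectOne
print(all_subjects)   # ['SomeSubjectOne', 'SomeSubjectTwo']
```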

Passing list of search string in contains in FilterExpression

Is there any way to pass a list of search strings in the contains() method of FilterExpression in DynamoDb?
Something like below:
    search_str = ['value-1', 'value-2', 'value-3']
    result = kb_table.scan(
        FilterExpression="contains (title, :titleVal)",
        ExpressionAttributeValues={ ":titleVal": search_str }
    )
For now I can only think of looping through the list and scanning the table multiple times (as in below code), but I think it will be resource heavy.
    for item in search_str:
        result += kb_table.scan(
            FilterExpression="contains (title, :titleVal)",
            ExpressionAttributeValues={ ":titleVal": item }
        )
Any suggestions?
For the above scenario, CONTAINS should be used with an OR condition. When you give an array as input to CONTAINS, DynamoDB checks against a set attribute ("SS", "NS", or "BS"). It doesn't look for a substring in a string attribute.
If the target attribute of the comparison is of type String, then the
operator checks for a substring match. If the target attribute of the
comparison is of type Binary, then the operator looks for a
subsequence of the target that matches the input. If the target
attribute of the comparison is a set ("SS", "NS", or "BS"), then the
operator evaluates to true if it finds an exact match with any member
of the set.
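Example:
    from boto3.dynamodb.conditions import Attr

    movies1 = "MyMovie"
    movies2 = "Big New"
    fe1 = Attr('title').contains(movies1)
    fe2 = Attr('title').contains(movies2)
    # Combine the conditions with |; Python's `or` would not build a DynamoDB OR condition
    response = table.scan(
        FilterExpression=fe1 | fe2
    )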
A little late, but to help others find a solution, here is my method.
Assume your table has an attribute called 'EMAIL' and you want to filter your scan on it with a list of values. You can proceed as follows.
    import boto3
    from boto3.dynamodb.conditions import Attr

    list_of_elem = ['mail1@mail.com', 'mail2@mail.com', 'mail3@mail.com']
    # start with an empty string to build the query
    stringquery = ""
    # loop over each element in the list
    for index, value in enumerate(list_of_elem):
        # add a contains condition for this mail value
        stringquery = stringquery + f"Attr('EMAIL').contains('{value}')"
        # unless this is the last element, append the OR operator
        if index < len(list_of_elem) - 1:
            stringquery = stringquery + ' | '
    dynamodb = boto3.resource('dynamodb')
    tableUser = dynamodb.Table('mytable')
    # eval turns the query string into a filter expression
    tableUser.scan(
        FilterExpression=eval(stringquery)
    )
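As an alternative that avoids eval, the OR-ed contains clauses can be generated as a plain expression string together with matching placeholders, in the same style as the string FilterExpression used in the question. This is a minimal sketch: the helper build_contains_any and the placeholder names #attr and :s0, :s1, ... are hypothetical, and the scan call itself is left as a comment because it needs a real table.

```python
def build_contains_any(attribute, search_strings):
    """Build scan arguments matching items whose attribute contains any search string."""
    if not search_strings:
        raise ValueError("at least one search string is required")
    # One value placeholder per search string: :s0, :s1, ...
    values = {f":s{i}": s for i, s in enumerate(search_strings)}
    clauses = [f"contains(#attr, {placeholder})" for placeholder in values]
    return {
        "FilterExpression": " OR ".join(clauses),
        # A name placeholder keeps the expression valid even for reserved words.
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": values,
    }


scan_kwargs = build_contains_any("title", ["value-1", "value-2", "value-3"])
print(scan_kwargs["FilterExpression"])
# contains(#attr, :s0) OR contains(#attr, :s1) OR contains(#attr, :s2)
# result = kb_table.scan(**scan_kwargs)  # kb_table as in the question
```

This still performs a single scan, so the table is read once rather than once per search string.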
