I am new to DynamoDB, and I came across sparse indexes. I think they fit what I need, but I am not completely sure how to implement them. I have a table with a Post entity that has the following fields:
post_id <string> | user_id <string> | tags <string[]> | public <boolean>| other post data attributes...
The queries that I need to do are:
Get all posts that are marked public
Get posts filtered by tags
Get all posts filtered by user, both public and not public
Get a single post
For the case of getting all public posts, I think a sparse index could work: I could set the public attribute only on the entities that are marked public. But what does the query look like then?
I am not even sure I have set up the DB correctly. I am using the Serverless Framework and this is what I have come up with, but I'm not sure it is good.
PostsDynamoDBTable:
  Type: 'AWS::DynamoDB::Table'
  Properties:
    AttributeDefinitions:
      - AttributeName: postId
        AttributeType: S
      - AttributeName: userId
        AttributeType: S
      - AttributeName: createdAt
        AttributeType: S
    KeySchema:
      - AttributeName: postId
        KeyType: HASH
      - AttributeName: userId
        KeyType: RANGE
    BillingMode: PAY_PER_REQUEST
    TableName: ${self:provider.environment.POSTS_TABLE}
    GlobalSecondaryIndexes:
      - IndexName: ${self:provider.environment.USER_ID_INDEX}
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
          - AttributeName: public
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
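A sparse index could work. If you only want a subset of attributes, use ProjectionType: INCLUDE to project specific non-key attributes into the index. It's important to note that the only attributes you can read from a query on that index are the key attributes plus the ones you explicitly include.
Firstly, every attribute used in a key schema (table or index) must be declared in AttributeDefinitions, and key attributes can only be of type S, N, or B. Your index uses public as its sort key, so public has to be declared, and it has to be stored as a string or number because a boolean cannot be a key attribute. Attributes that no key schema uses must not be declared, so createdAt should be removed from your definitions.
Say, for example, that one of the attributes you want to project is userName. It is a non-key attribute, so it goes only under NonKeyAttributes in the index projection, not in AttributeDefinitions.
You'd want to add:
- AttributeName: public
  AttributeType: S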
Then, in the GlobalSecondaryIndexes block:
GlobalSecondaryIndexes:
  - IndexName: byUserIdAndPublic # You can name this whatever you'd like
    KeySchema:
      - AttributeName: userId
        KeyType: HASH
      - AttributeName: public
        KeyType: RANGE
    Projection:
      NonKeyAttributes:
        - userName
      ProjectionType: INCLUDE
Then you simply query that index specifically - in the response, you'll get back the userName (no special query changes are required, except to specify using this index).
If you need all attributes for each post, use ProjectionType: ALL instead and remove the NonKeyAttributes list.
Here's an example Node.js method that filters a user's posts by public. The value passed in must match the type declared for the public key attribute (key attributes can only be strings, numbers, or binary):
// `public` is a reserved word in strict mode, so the parameter is named isPublic.
const listByUserIdAndPublic = async (userId, isPublic) => {
  const params = {
    TableName: tableName,
    IndexName: 'byUserIdAndPublic',
    KeyConditionExpression: '#public = :public AND #userId = :userId',
    ExpressionAttributeNames: {
      '#userId': 'userId',
      '#public': 'public'
    },
    ExpressionAttributeValues: {
      ':public': isPublic,
      ':userId': userId
    },
  };
  const response = await documentClient.query(params).promise();
  return response.Items;
};
Based on your question:
Get all posts that are marked public
I think you'd want to use a full index (ProjectionType: ALL), because I'd imagine you want all attributes of a post to be indexed.
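For the "all public posts" query specifically, one possible shape is a separate GSI whose partition key is the public attribute itself, written only on public posts so that private posts never enter the index. The sketch below is an assumption, not part of the table definition above: the index name byPublic and the marker value 'true' are hypothetical, and it only builds the parameter object you would hand to documentClient.query.

```javascript
// Hypothetical names: the index 'byPublic' and the marker value 'true'
// are assumptions for illustration only.
const PUBLIC_INDEX = 'byPublic';
const PUBLIC_MARKER = 'true';

// Write the `public` attribute only on public posts. Items that lack the
// index key attribute are simply not copied into the index (sparse index).
const toStoredPost = (post, isPublic) => {
  const { public: _ignored, ...rest } = post;
  return isPublic ? { ...rest, public: PUBLIC_MARKER } : rest;
};

// Build DocumentClient query params that read every item in the sparse index.
// Pass the previous response's LastEvaluatedKey to continue paging.
const buildPublicPostsQuery = (tableName, exclusiveStartKey) => {
  const params = {
    TableName: tableName,
    IndexName: PUBLIC_INDEX,
    KeyConditionExpression: '#public = :public',
    ExpressionAttributeNames: { '#public': 'public' },
    ExpressionAttributeValues: { ':public': PUBLIC_MARKER },
  };
  if (exclusiveStartKey) {
    params.ExclusiveStartKey = exclusiveStartKey;
  }
  return params;
};

console.log(toStoredPost({ postId: 'p1', userId: 'u1' }, false));
console.log(buildPublicPostsQuery('posts'));
```

Because every public post shares the same partition key value in this layout, it is a trade-off that may not suit tables with a very large number of public posts.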
I'm trying to get the related/child items with a DataStore query. I have no problems getting them with the GraphQL API, but it returns a bunch of deleted items which I am unable to filter server side (and keep pagination working nicely).
I'm using React/Next.js/JavaScript.
I have the following models in my schema:
type TestResultData @model @auth(rules: [{allow: public}, {allow: owner, operations: [create, read, update]}, {allow: groups, groups: ["admin"], operations: [read, create, update, delete]}]) {
  id: ID!
  name: String
  value: String
  unit: String
  testresultsID: ID! @index(name: "byTestResults")
  TestResultAnalyses: [TestResultAnalysis] @hasMany(indexName: "byTestResultData", fields: ["id"])
  tests: [Test] @manyToMany(relationName: "TestTestResultData")
}
and
type TestResults @model @auth(rules: [{allow: public}, {allow: owner, operations: [create, read, update]}, {allow: groups, groups: ["admin"], operations: [read, create, update, delete]}]) {
  id: ID!
  CustomerID: ID! @index(name: "byCustomer")
  lab: String
  fasting: Boolean
  dateReported: AWSDateTime
  dateCollected: AWSDateTime
  dateTested: AWSDateTime
  type: [TestType]
  note: String
  UploadedFiles: [UploadedFiles] @hasMany(indexName: "byTestResults", fields: ["id"])
  TestResultData: [TestResultData] @hasMany(indexName: "byTestResults", fields: ["id"])
}
I would like to query my TestResults model and have it return the nested TestResultData. However, DataStore does not seem to return the related items. (If I run the query through the GraphQL API it works perfectly, except that the result contains all my deleted items, which I cannot filter.)
This command gets me the TestResults without child items:
const data = await DataStore.query(TestResults);
I've also tried "querying relations" as per:
https://docs.amplify.aws/lib/datastore/relational/q/platform/js/#updated-schema
but it doesn't work. I've also upgraded to the latest version of amplify.
I'm trying to use Redux-Toolkit's createEntityAdapter in an entity that has compound keys. For example, my territories entity has a type (municipality, state, biome, etc) and a geocode number. The pair {type, geocode} is the compound key of this entity.
I want to be able to use selectById and other selectors. My first thought was to create an id field that concatenates type, ";" and geocode, but I'm sure there's a better way.
import { createEntityAdapter } from '@reduxjs/toolkit'
const adapter = createEntityAdapter({
  // selectId: (item) => ???,
})
const APIresponse = {
  data: [
    { type: 'state', geocode: 1, value: 123 },
    { type: 'state', geocode: 2, value: 66 },
    { type: 'municipality', geocode: 1, value: 77 },
    { type: 'municipality', geocode: 2, value: 88 },
    { type: 'municipality', geocode: 3, value: 99 }
  ]
}
I'm a Redux maintainer and the person who implemented createEntityAdapter for RTK.
createEntityAdapter does assume that you have some kind of unique ID field in your data. If you don't have a unique ID field from the original data, you've got three options I can think of:
Generate a unique ID for each item when you are processing the API response (but before any "loaded" action is dispatched)
Concatenate together some combination of fields to create a synthesized id field when you are processing the API response
Implement selectId so that it returns a combination of fields each time
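As a rough illustration of the third option, the sketch below derives a string id from the {type, geocode} pair. The names selectTerritoryId and normalize are hypothetical, and normalize is a hand-rolled stand-in that only mimics the { ids, entities } shape so the example runs without Redux Toolkit installed.

```javascript
// Derive a stable string id from the compound key {type, geocode}.
// The ';' separator is safe here as long as `type` never contains ';'.
const selectTerritoryId = (item) => `${item.type};${item.geocode}`;

// Hand-rolled stand-in for the adapter's normalized state, for illustration.
const normalize = (items, selectId) => {
  const state = { ids: [], entities: {} };
  for (const item of items) {
    const id = selectId(item);
    if (!(id in state.entities)) {
      state.ids.push(id);
    }
    state.entities[id] = item;
  }
  return state;
};

const apiResponse = {
  data: [
    { type: 'state', geocode: 1, value: 123 },
    { type: 'state', geocode: 2, value: 66 },
    { type: 'municipality', geocode: 1, value: 77 },
  ],
};

const territories = normalize(apiResponse.data, selectTerritoryId);
console.log(territories.ids);
console.log(territories.entities['municipality;1']);
```

With Redux Toolkit itself, the same function would be passed as createEntityAdapter({ selectId: selectTerritoryId }), and a lookup through selectById would then use an id such as 'municipality;1'.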
I'm starting a new project using AWS Amplify, and I'm having trouble correctly defining my schema for a use case with nested objects.
I have the following schema
type Company
  @model
{
  id: ID!
  name: String!
  teams: [Team] @connection (name: "CompanyTeams", sortField: "name")
}
type Team
  @model
{
  id: ID!
  name: String!
  users: [User] @connection (name: "TeamUsers", sortField: "createdAt")
  company: Company @connection (name: "CompanyTeams", sortField: "name")
  teamCompanyId: ID!
}
type User
  @model
{
  id: ID!
  createdAt: String
  name: String!
  email: String!
  team: Team @connection (name: "TeamUsers")
}
I would like to be able to query a list of teams based on a list of companies and a user name.
For example, get all teams in companies A/B/C with user name starts with "David".
Is my current schema fine for that?
I can easily retrieve a list of teams based on the company with this kind of query
query listTeams {
  listTeams(filter: {
    teamCompanyId: {
      eq: "A"
    }
  }) {
    items {
      name
    }
  }
}
But I'm not sure how to include a search on the User model. Should I override the filter and add a new custom resolver that includes the new filters?
Also, is using filters the best solution? I believe that with DynamoDB a filter is only applied after the query or scan has read its results. Because each request reads at most 1 MB before filtering, it might take a lot of reads to retrieve only a few matching results?
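The concern about filters can be made concrete with a small in-memory simulation. This is only a sketch under stated assumptions: readPage and readAll are hypothetical helpers, and a fixed number of items per page stands in for DynamoDB's 1 MB read limit. It shows that the filter runs after each page is read, so the number of items read can be much larger than the number returned.

```javascript
// Simulate one paged read: take a page first, then apply the filter to it.
const readPage = (items, startIndex, pageSize, predicate) => {
  const page = items.slice(startIndex, startIndex + pageSize);
  const next = startIndex + page.length;
  return {
    Items: page.filter(predicate),
    ScannedCount: page.length,
    // Undefined once the end of the data is reached, like LastEvaluatedKey.
    nextIndex: next < items.length ? next : undefined,
  };
};

// Keep requesting pages until there is nothing left to read.
const readAll = (items, pageSize, predicate) => {
  const matches = [];
  let scanned = 0;
  let requests = 0;
  let start = 0;
  while (start !== undefined) {
    const page = readPage(items, start, pageSize, predicate);
    matches.push(...page.Items);
    scanned += page.ScannedCount;
    requests += 1;
    start = page.nextIndex;
  }
  return { matches, scanned, requests };
};

// Ten users, only two of whom match the filter.
const users = [
  'David A', 'Erin', 'Farid', 'Gus', 'Hana',
  'Ines', 'Jon', 'Kim', 'Lea', 'David B',
].map((name, i) => ({ id: String(i), name }));

const summary = readAll(users, 3, (u) => u.name.startsWith('David'));
console.log(summary.matches.length, summary.scanned, summary.requests);
```

In this toy run every item is read across several requests even though only two match, which is why a key or index that narrows the read is usually preferable to a filter alone.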
Happy to get any insights as I'm relatively new with AppSync / GraphQL and DynamoDB.
Thanks.
I am using the Serverless Framework to set up the table below:
provider:
  name: aws
  runtime: nodejs6.10
  stage: dev
  region: ap-southeast-1
  environment:
    DYNAMODB_TABLE: "Itinerary-${self:provider.stage}"
    DYNAMODB_COUNTRY_INDEX: country_index
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:Query
        - dynamodb:Scan
        - dynamodb:GetItem
        - dynamodb:PutItem
        - dynamodb:UpdateItem
        - dynamodb:DeleteItem
      Resource: "arn:aws:dynamodb:${opt:region, self:provider.region}:*:table/${self:provider.environment.DYNAMODB_TABLE}"
    - Effect: Allow
      Action:
        - dynamodb:Query
        - dynamodb:Scan
        - dynamodb:GetItem
        - dynamodb:PutItem
        - dynamodb:UpdateItem
        - dynamodb:DeleteItem
      Resource: "arn:aws:dynamodb:${opt:region, self:provider.region}:*:table/${self:provider.environment.DYNAMODB_TABLE}/index/${self:provider.environment.DYNAMODB_COUNTRY_INDEX}"
resources:
  Resources:
    ItineraryTable:
      Type: 'AWS::DynamoDB::Table'
      DeletionPolicy: Retain
      Properties:
        AttributeDefinitions:
          -
            AttributeName: ID
            AttributeType: S
          -
            AttributeName: Country
            AttributeType: S
        KeySchema:
          -
            AttributeName: ID
            KeyType: HASH
        GlobalSecondaryIndexes:
          -
            IndexName: ${self:provider.environment.DYNAMODB_COUNTRY_INDEX}
            KeySchema:
              -
                AttributeName: Country
                KeyType: HASH
            Projection:
              ProjectionType: ALL
            ProvisionedThroughput:
              ReadCapacityUnits: 1
              WriteCapacityUnits: 1
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:provider.environment.DYNAMODB_TABLE}
Now, I am able to get to do a "get" to retrieve items from the DB by ID. However, if I try to do a "query" on the GSI, it returns nothing. There is no failure, but I never get data back.
The below is what my query looks like:
const dynamoDb = new AWS.DynamoDB.DocumentClient();
var params = {
  TableName: process.env.DYNAMODB_TABLE, // maps back to the serverless config variable above
  IndexName: process.env.DYNAMODB_COUNTRY_INDEX, // maps back to the serverless config variable above
  KeyConditionExpression: "Country=:con",
  ExpressionAttributeValues: { ":con" : "Portugal" }
};
dynamoDb.query(params, (error, result) => {
  // handle potential errors
  if (error) {
    console.error(error);
    callback(null, {
      statusCode: error.statusCode || 501,
      headers: { 'Content-Type': 'text/plain' },
      body: 'Couldn\'t fetch the Itineraries'+error,
    });
    return;
  }
  var json_result = JSON.stringify(result.Item);
});
I should add here that I can't get data if I do a filterless "scan" as well. If I try to search for items by index (or otherwise) on the AWS Dynamo web portal, I get results though.
I cannot figure out what it is that's going wrong here. Any light that someone can shed would be much appreciated.
OK, so I figured out the problem. There was a key statement missing from the question that I posted (because I thought it wasn't relevant), but it turned out to be the problem.
I was just stringifying the results from the query using:
var json_result = JSON.stringify(result.Item);
The above works for a "get", but for a query, it needs to be result.Items:
var json_result = JSON.stringify(result.Items);
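The reason there was no error is that reading a missing property yields undefined, and JSON.stringify(undefined) returns undefined instead of throwing. The sketch below uses hand-written objects that mimic the two response shapes; itemsOf is a hypothetical helper, not part of the AWS SDK.

```javascript
// Mimic the two DocumentClient response shapes with plain objects.
const getResponse = { Item: { ID: '1', Country: 'Portugal' } };
const queryResponse = {
  Items: [{ ID: '1', Country: 'Portugal' }],
  Count: 1,
  ScannedCount: 1,
};

// Hypothetical helper: always return an array, whichever shape came back.
const itemsOf = (response) => {
  if (Array.isArray(response.Items)) {
    return response.Items;
  }
  return response.Item ? [response.Item] : [];
};

// Reading `.Item` from a query response silently produces undefined.
const silentlyEmpty = JSON.stringify(queryResponse.Item);
console.log(silentlyEmpty);

console.log(JSON.stringify(itemsOf(getResponse)));
console.log(JSON.stringify(itemsOf(queryResponse)));
```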
Silly on my part. Thank you for the help!
P.S. I have added the statement to the original question to be clear
I am new to DynamoDB. I have two tables:
country
city
I wish to join both tables via the country_id primary key and foreign key. So can I do this in DynamoDB?
Amazon DynamoDB is a NoSQL database. This means that traditional relational database concepts such as JOINs and Foreign Keys are not available.
Your application would be responsible for "joining" tables. That is, you would need to read values from both tables and determine a relationship between them within your application. DynamoDB is not able to do this for you.
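An application-side "join" might look like the sketch below. It works on plain arrays that stand in for items already read from the two tables; joinCitiesWithCountries and the sample rows are hypothetical, and the attribute name country_id follows the question.

```javascript
// Attach the matching country to each city, using a Map as a lookup table.
const joinCitiesWithCountries = (cities, countries) => {
  const countriesById = new Map(countries.map((c) => [c.country_id, c]));
  return cities.map((city) => ({
    ...city,
    // null when no country with that id was read
    country: countriesById.get(city.country_id) || null,
  }));
};

// Sample rows standing in for items fetched from DynamoDB.
const countries = [{ country_id: 'pt', name: 'Portugal' }];
const cities = [
  { city_id: 'lis', country_id: 'pt', name: 'Lisbon' },
  { city_id: 'xyz', country_id: 'zz', name: 'Nowhere' },
];

const joined = joinCitiesWithCountries(cities, countries);
console.log(JSON.stringify(joined, null, 2));
```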
Alternatively, you could use a system such as Amazon EMR, which provides Hadoop. You could use Hadoop and Hive to access a DynamoDB table by using HiveQL (which is similar to SQL). However, this is a very slow operation and might be too complex for your particular requirement.
As @John said, Amazon DynamoDB is a NoSQL database, so it's impossible to do this in one query. But, of course, you can get the data through multiple queries if you want. Below is the DynamoDB table schema (it's also the table creation code).
import { CreateTableInput } from 'aws-sdk/clients/dynamodb';
import AWS from 'aws-sdk'
const paramCountryTable: CreateTableInput = {
  BillingMode: 'PAY_PER_REQUEST', // just for the example
  TableName: 'Country',
  AttributeDefinitions: [
    { AttributeName: 'country_id', AttributeType: 'S' },
  ],
  KeySchema: [
    { AttributeName: 'country_id', KeyType: 'HASH' },
  ],
}
const paramCityTable: CreateTableInput = {
  BillingMode: 'PAY_PER_REQUEST', // just for the example
  TableName: 'City',
  AttributeDefinitions: [
    { AttributeName: 'city_id', AttributeType: 'S' },
    { AttributeName: 'country_id', AttributeType: 'S' },
  ],
  KeySchema: [
    { AttributeName: 'city_id', KeyType: 'HASH' },
  ],
  // GSI so that cities can be queried by country_id
  GlobalSecondaryIndexes: [{
    Projection: { ProjectionType: 'ALL' },
    IndexName: 'IndexCountryId',
    KeySchema: [
      { AttributeName: 'country_id', KeyType: 'HASH' },
    ],
  }],
}
const run = async () => {
  const dd = new AWS.DynamoDB({ apiVersion: '2012-08-10' })
  await dd.createTable(paramCountryTable).promise()
  await dd.createTable(paramCityTable).promise()
}
if (require.main === module) run()
Then, you can implement what you want through several queries as below.
const ddc = new AWS.DynamoDB.DocumentClient({ apiVersion: '2012-08-10' })
const getCountryByCityId = async (city_id: string) => {
  const resultCity = await ddc.get({ TableName: 'City', Key: { city_id } }).promise()
  if (!resultCity.Item) return null
  const { country_id } = resultCity.Item
  const resultCountry = await ddc.get({ TableName: 'Country', Key: { country_id } }).promise()
  return resultCountry.Item
}
const getCityArrByCountryId = async (country_id: string) => {
  const result = await ddc.query({
    TableName: 'City',
    IndexName: 'IndexCountryId',
    ExpressionAttributeValues: { ":country_id": country_id },
    KeyConditionExpression: "country_id = :country_id",
  }).promise()
  return result.Items
}
Note that this is not a recommended use case for NoSQL.