Why does my CloudTrail key have a random ID? - amazon-cloudtrail

For some reason the S3 bucket for my CloudTrail logs stores them under the following key format:
/AWSLogs/${SOME_RANDOM_ID}/${ACCOUNT_ID}/CloudTrail/${REGION}
There is this SOME_RANDOM_ID segment, which does not show up in the CREATE statement when I go to create my Athena table for my CloudTrail logs. This caused all my queries to return 0 records until I figured out what was going on.
I can't find anything in the documentation about the conditions under which this happens. Is this done to partition writes to S3 and avoid rate limits on individual key prefixes? Why is this happening to me? Is anybody else experiencing this issue?

@Dunedan was right: the ${SOME_RANDOM_ID} is equal to the output of aws organizations describe-organization --query "Organization.Id".
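Since the extra segment is the AWS Organizations ID (which is what CloudTrail uses for organization trails), the Athena table LOCATION has to include it. Here is a minimal boto3 sketch to reconstruct the prefix; the bucket name and region are placeholders, and it assumes credentials allowed to call organizations:DescribeOrganization and sts:GetCallerIdentity:

import boto3

# The "random" segment is the organization ID; the next segment is the account ID.
org_id = boto3.client("organizations").describe_organization()["Organization"]["Id"]
account_id = boto3.client("sts").get_caller_identity()["Account"]

bucket = "my-cloudtrail-bucket"  # placeholder
region = "us-east-1"             # placeholder

prefix = f"AWSLogs/{org_id}/{account_id}/CloudTrail/{region}/"
print(f"s3://{bucket}/{prefix}")

Pointing the Athena table's LOCATION at s3://<bucket>/AWSLogs/<organization-id>/ (or at the full prefix above) is what makes the queries return rows again.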

Related

exporting query result in Kusto to a storage account

I get the error below when running my Kusto query (which I am still learning) in Azure Data Explorer, and the reason is that the result is more than 64 MB. When I read the MS docs they mention something called Export, which can be used to export the query result to a storage blob. However, I couldn't find any example of how to do it. Has anyone tried it? Can they provide a sample of how to export to a storage account when my result set is more than 64 MB?
Here is what I have for now. I have data in an external table which I would like to query, so for example if my external table name is TestKubeLogs, I am querying
external_table("TestKubeLogs")
| where category == 'xyz'
and then I get this error:
The Kusto DataEngine has failed to execute a query: 'Query result set has exceeded the internal data size limit 67108864 (E_QUERY_RESULT_SET_TOO_LARGE).'
So now I am trying to export this data. How should I do it? I started writing this, but how do I specify the category and table name?
.export
async compressed
to json (
h#"https://azdevstoreforlogs.blob.core.windows.net/exportinglogs;mykey==",
)
You are not headed in the right direction.
Regardless of the source of your data, whether you're querying an external table or a managed table, it does not make sense to pull a large data volume down to your client.
What would you do with millions of records presented on your screen?
This is true of all data systems.
Please read the relevant documentation:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/concepts/resulttruncation

DynamoDB - batch-get-item pagination

I'm trying to run a batch-get-item from CLI and getting an error that I cannot pass more than 100 keys:
failed to satisfy constraint: Member must have length less than or equal to 100
This is the command I use
aws dynamodb batch-get-item \
--request-items file://request-items.json \
--return-consumed-capacity TOTAL > user_table_output.txt
I would like to know if there's a way I can add pagination to my query, or is there another way I can run it?
I have ~4000 keys which I need to query.
Thanks in advance.
You'll have to break your keys down into batches of no more than 100. Also, keep in mind that the response may not include all the items you requested if the size of the items being returned exceeds 16MB. If that happens the response will include UnprocessedKeys, which can be used to request the keys that were not retrieved.
The BatchGetItem API reference has more information, or you can view the AWS CLI v2 batch-get-item documentation.
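A minimal boto3 sketch of that batching approach; the table name, key schema, and region below are placeholders rather than anything from the question:

import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")
table_name = "user_table"                              # placeholder
keys = [{"id": {"S": str(i)}} for i in range(4000)]    # placeholder key schema

items = []
for start in range(0, len(keys), 100):  # BatchGetItem accepts at most 100 keys per call
    request = {table_name: {"Keys": keys[start:start + 100]}}
    while request:
        response = dynamodb.batch_get_item(RequestItems=request)
        items.extend(response["Responses"].get(table_name, []))
        # Anything that didn't fit in the 16 MB response (or was throttled) comes back here.
        request = response.get("UnprocessedKeys") or None

print(f"Retrieved {len(items)} items")

In practice you would also add exponential backoff before re-requesting UnprocessedKeys, as AWS recommends.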

How to test an AWS DynamoDB table immediately after restore from backup

I am writing a Python script to automate an AWS DynamoDB table backup restore test. Once the table is restored from backup, I cannot check (test) the table size or item count in the restored table immediately. As per AWS, "Storage size and item count are not updated in real-time. They are updated periodically, roughly every six hours."
I also tried using "scan" on the restored table to list sample items, but that doesn't seem to be working either.
Does anybody know what the workaround could be here? Suggestions would be appreciated.
Thanks !!
I was able to achieve it by using a table scan:
import boto3

dynamodb = boto3.resource('dynamodb', 'us-east-1')
table = dynamodb.Table('ddb_test_table')
response = table.scan(Limit=XX)  # XX = number of sample items to read
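As a hedged follow-up on that scan (using the same table object as above), checking that the sample actually came back is what tells you the restore worked, since the item-count metadata can lag by up to six hours:

items = response.get("Items", [])
assert items, "Restored table returned no items from the sample scan"
print(f"Sampled {len(items)} items from the restored table")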

AWS DynamoDB - Why are User and System errors not populated for each table instead of as account metrics?

How can I get the error count for each table, so I can dig deeper into it?
Errors (User & System errors) occur on individual DynamoDB tables, so why are they categorized as 'Account metrics' in CloudWatch? As a result the value is the aggregate of all tables' errors in that particular region, instead of being reported per table.
Thanks in advance.
Regards,
Garry.
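For what it's worth, only UserErrors is published solely at the account/region level; SystemErrors carries TableName and Operation dimensions, so it can be read per table. A boto3 sketch with placeholder table name, operation, and region:

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="SystemErrors",
    Dimensions=[
        {"Name": "TableName", "Value": "my-table"},  # placeholder
        {"Name": "Operation", "Value": "GetItem"},   # placeholder
    ],
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=300,
    Statistics=["Sum"],
)
print(response["Datapoints"])

UserErrors (HTTP 400-level failures such as validation or authentication errors) has no TableName dimension, which is why it only shows up aggregated for the account and region.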

Teradata SET table

I have a SET table in Teradata. When I load duplicate records through Informatica, the session fails because it tries to push duplicate records into the SET table.
I want Informatica to reject duplicate records whenever they are loaded, using a TPT or relational connection.
Can anyone help me with the properties I need to set?
Do you really need to keep track of what records are rejected due to duplication in the TPT logs? It seems like you are open to suggestions about TPT or relational connections, so I assume you don't really care about TPT level logs.
If this assumption is correct then you can simply put an Aggregator Transformation in the mapping and mark every field as Group By. As expected, this will add a group by clause in the generated query and eliminate duplicates in the source data.
Please try the following things:
1. If you use FastLoad or TPT Fast Load, the utility will implicitly remove the duplicates, but it can only be used for loading into empty tables.
2. If you are trying to load data into a non-empty table, place a Sorter transformation and de-dupe your data in Informatica.
3. Also try changing the 'Stop on error' flag to 0 and the 'Error limit' flag in the target to -1.
Please share your results with us.
