I am investigating Apache Ignite as a way to pull data from Teradata and cache it for display in a UI. We currently do this with Cassandra, but we want to move away from it. A few templates showing how this can be achieved would be helpful, as I am not finding relevant code samples or docs to read through.
Here is the documentation page about loading the data into Apache Ignite: https://apacheignite.readme.io/docs/data-loading
I'd like to write a Python script that manages my Google Cloud Data Fusion pipelines and instances (create, delete, start, etc.). For that purpose I use Airflow installed as a library. I've read some tutorials and documentation, but I still can't make the script connect to the Data Fusion instance. I've tried the following connection string:
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&extra__google_cloud_platform__project=airflow&extra__google_cloud_platform__num_retries=5'
with my JSON key file and project ID, but it still doesn't work. Can you give me an example of creating that connection?
You can find an example Python script here:
https://airflow.readthedocs.io/en/latest/_modules/airflow/providers/google/cloud/example_dags/example_datafusion.html
This page provides a breakdown for each Data Fusion Operator if you would like to learn more about them:
https://airflow.readthedocs.io/en/latest/howto/operator/gcp/datafusion.html
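For reference, the connection string from the question can also be assembled in Python rather than typed by hand, which avoids URL-encoding mistakes. This is only a minimal sketch: the helper name `build_gcp_conn_uri` is hypothetical, and the key path and project ID are the placeholder values from the question. It shows how the URI is built and does not by itself explain why the connection fails.

```python
import os
from urllib.parse import quote, urlencode


def build_gcp_conn_uri(key_path, project,
                       scope="https://www.googleapis.com/auth/cloud-platform",
                       num_retries=5):
    """Build a google-cloud-platform connection URI (hypothetical helper)."""
    # Extra field names are copied from the connection string in the question.
    extras = {
        "extra__google_cloud_platform__key_path": key_path,
        "extra__google_cloud_platform__scope": scope,
        "extra__google_cloud_platform__project": project,
        "extra__google_cloud_platform__num_retries": num_retries,
    }
    # safe="" percent-encodes "/" and ":" as well, as in the original string.
    return "google-cloud-platform://?" + urlencode(extras, quote_via=quote, safe="")


# Placeholder values from the question; replace them with your own.
uri = build_gcp_conn_uri("/keys/key.json", "airflow")

# Airflow reads AIRFLOW_CONN_* variables from the environment of the process
# it runs in, so the variable has to be set before the connection is used.
os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"] = uri
```

With the placeholder values, `uri` reproduces the string from the question, so the encoding itself is probably not the problem. The key file path and project ID are more likely candidates to check.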
I am working on migrating Teradata databases to an open-source database (which one is still under discussion). I came across the Apache Drill engine. Can Drill be used to load data from Teradata? If so, can it also be used as a schema conversion tool?
In theory, yes, it can load data from Teradata: Teradata has a JDBC driver, so you can configure it as a source. For an example of how to configure a JDBC data source in Drill see the docs here.
Drill has a CTAS statement. I know it can be used to write Parquet, CSV, and JSON files, but I'm not sure what other data sources it supports.
To get more information about what Drill can do, and to request features, please get in touch with the Drill team on the user list.
Our application has a number of tables containing reference data. We have been using the traditional Flyway approach of creating a delta file for each change in the data, but with frequent changes this is hard to manage. It would be easier to have a single script with a truncate followed by inserts that reloads the table from scratch; when the data changes, the developer would edit this file as needed.
Is there a clean way to accomplish this in Flyway without generating checksum errors? Hopefully without creating a new version of the load script each time a change is needed.
Have you tried adding your reference data as a beforeMigrate callback script?
"Using the default settings, Flyway looks in its default locations (/sql for the Command-line tool) for SQL files like beforeMigrate.sql, beforeEachMigrate.sql, afterEachMigrate.sql"
I am creating a web service in Zend Framework that uses DynamoDB, so I installed DynamoDB locally. But it's not easy to use: even to insert or update data for testing purposes, I have to write a script.
Is there a DynamoDB client for Mac that lets me insert, update, and delete data from a UI?
EDIT
Doubts
1) Do I have to run SQL to see table data? I thought there would be a GUI for this.
2) I am not able to use a WHERE clause in SQL. What if I want to see only one or two records out of all of them? Is there a way to use conditions?
3) Not all fields of a row are visible, and I am not able to scroll horizontally.
YES! I've finally found a solution after struggling with this myself:
1. Run your local DynamoDB jar with the following command: java -jar DynamoDBLocal.jar -dbPath . [this will create a file in whatever directory the Dynamo jar is located in].
2. Download SQLite Database Browser and extract/install it.
3. Start SQLite Database Browser.
4. Navigate to "Open Database" from the File menu.
5. Navigate to the directory from step 1 and select the file [in this case, ****_us-east-1].
You should then see the database contents!!
Hope this helps - it's been frustrating me no end!
!! EDIT !! - in response to original question edit.
Doubts
1) Do I have to run SQL to see table data? I thought there would be a GUI for this.
2) I am not able to use a WHERE clause in SQL. What if I want to see only one or two records out of all of them? Is there a way to use conditions?
3) Not all fields of a row are visible, and I am not able to scroll horizontally.
Yes - you can run a simple SELECT statement. For instance, in my example, "SELECT * FROM tweet_item" returns all rows of the table.
Seemingly inadvertently - whilst I couldn't get a direct SELECT * FROM XX WHERE XX to work, the LIKE operator does. For instance, SELECT * FROM tweet_item WHERE tweet_item.hashKey LIKE "%425665354447462400%" returns the tweet with tweet_id [my hashKey] of 425665354447462400.
Strange - I seem to be able to scroll quite happily [although it is Windows not Mac]. It also automatically tries to re-size the outer frame, too.
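If you would rather script the lookup than use the GUI, the same LIKE query can be run with Python's built-in sqlite3 module. This is a rough sketch under assumptions: `find_items` is a hypothetical helper, and the table name `tweet_item` and column `hashKey` are taken from my example above, so your file may use different names or store keys in a different form.

```python
import sqlite3


def find_items(db_path, table, column, pattern):
    """Return rows of `table` whose `column` matches a SQL LIKE pattern."""
    # Table and column names cannot be bound as query parameters,
    # so only plain identifiers are accepted here.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    query = f'SELECT * FROM "{table}" WHERE "{column}" LIKE ?'
    conn = sqlite3.connect(db_path)
    try:
        # The pattern itself is passed as a bound parameter.
        return conn.execute(query, (pattern,)).fetchall()
    finally:
        conn.close()
```

You would call it as find_items(path_to_db_file, "tweet_item", "hashKey", "%425665354447462400%"), where path_to_db_file is the file created by the -dbPath option.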
I ran into this problem and found a relatively new solution: https://github.com/aaronshaf/dynamodb-admin
It has provision for GET/POST/PUT/DELETE.
Although it's a paid product, which is a bummer, RazorSQL now supports DynamoDB as well, and it lets you change the AWS endpoint to point to a local installation.
The mac version (with a free trial) is available here:
http://razorsql.com/download_mac.html
Here is a very useful UI tool: https://github.com/YoyaTeam/dynamodb-manager. It supports almost all data operations.
For Eclipse users:
Amazon provides the AWS Toolkit for Eclipse IDE. It can view both local and cloud databases, and if you use several regions, it lets you choose between them.
You can create attributes, add keys, etc.
For installation follow this link: http://docs.aws.amazon.com/toolkit-for-eclipse/v1/user-guide/getting-started.html
Dynobase is a new DynamoDB GUI client that also lets you browse and manipulate local DynamoDB instances: https://dynobase.dev/dynamodb-local-admin-gui/
Unfortunately, it's paid, but there's a free 7-day trial. It works on Mac, Windows and Linux: https://dynobase.dev/
I have a client who has set up a testing environment in some AI language. It runs predefined test cases and stores the results as log files (comma-separated txt files). My job is to identify and suggest a reporting system, and I have these options in mind:
1. Import the logs into MSSQL and use its reporting tool (SSRS).
2. Import the logs into MySQL and use PHP to develop custom reporting.
I think option 2 is better, because the logs are inconsistent and contain unexpected special characters that databases normally don't accept, so I can write PHP scripts to clean them before loading them into the database.
If this were your problem, what would you suggest?
It depends on how fancy you need to be. If the data is in CSV files, you could go as simple as loading it into Excel (or their favorite spreadsheet tool) and using spreadsheet macros to analyze it.
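Whichever tool ends up doing the reporting, the inconsistent logs will probably need a cleaning pass first. Here is a minimal sketch of one, written in Python rather than PHP. Everything specific in it is an assumption: the helper names are hypothetical, "unexpected characters" is taken to mean non-printable characters, and a row is treated as malformed when it does not have the expected number of fields.

```python
import csv


def clean_line(line):
    """Drop non-printable characters (control codes, stray newlines)."""
    return "".join(ch for ch in line if ch.isprintable())


def clean_rows(lines, expected_fields):
    """Yield cleaned CSV rows, skipping rows with the wrong field count."""
    reader = csv.reader(clean_line(line) for line in lines)
    for row in reader:
        row = [field.strip() for field in row]
        if len(row) == expected_fields:
            yield row


# Hypothetical three-column log layout: test case, result, duration.
sample = ["case1,pass,0.5\n", "case\x072,fail,1.2\n", "broken line\n"]
for row in clean_rows(sample, expected_fields=3):
    print(row)
```

The cleaned rows can then be written back to a CSV file for a spreadsheet or bulk-loaded into a database. Rows that are skipped here may be worth logging separately so that nothing is lost silently.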