How do I feed data from an object-relational PostgreSQL database into Graphite? Is there a plugin available, or are there steps to convert the PostgreSQL data into Graphite? I am new to Graphite, so please tell me if I am wrong.
Try https://collectd.org/ - it has a PostgreSQL plugin (https://collectd.org/wiki/index.php/Plugin:PostgreSQL) that collects some useful system statistics by default and is very easy to extend with your own custom queries.
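For illustration, here is a minimal collectd.conf sketch that reads one custom metric from PostgreSQL and forwards everything to Graphite's Carbon plaintext listener via the write_graphite plugin. The database name, credentials, query, and Graphite host below are placeholders to adapt to your setup.

LoadPlugin postgresql
LoadPlugin write_graphite

<Plugin postgresql>
  <Query row_count>
    # Placeholder query: count rows in one of your ORM-managed tables
    Statement "SELECT COUNT(*) AS rows FROM my_table;"
    <Result>
      Type gauge
      InstancePrefix "my_table_rows"
      ValuesFrom "rows"
    </Result>
  </Query>
  <Database mydb>
    Host "localhost"
    User "collectd"
    Password "secret"
    Query row_count
  </Database>
</Plugin>

<Plugin write_graphite>
  <Node "graphite">
    Host "graphite.example.com"
    Port "2003"       # Carbon's default plaintext port
    Protocol "tcp"
    Prefix "collectd."
  </Node>
</Plugin>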
I tried using a scheduler for a Databricks notebook, but it causes unnecessary data loading. The data in MariaDB changes at random times, not on a fixed schedule, so with a scheduled pipeline I can't trigger on a change in the data and transfer it from one database to another.
Please help me with any pipeline ideas, Azure Data Factory ideas, or Python code so that I can transfer tables when there are changes in MariaDB.
One way to trigger the pipeline is to use an event-based trigger.
Create an event-based trigger in Azure Data Factory: create the trigger and select Type as Custom events. A sketch of publishing such a custom event follows below.
Refer to https://www.mssqltips.com/sqlservertip/6063/create-event-based-trigger-in-azure-data-factory/
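As an illustration of the custom-events approach, here is a minimal Python sketch, assuming the pymysql and azure-eventgrid packages and an Event Grid custom topic that your Data Factory custom-event trigger subscribes to. It polls a MariaDB table checksum and publishes an event when the checksum changes; all host names, credentials, and the table name are placeholders.

import time
import pymysql
from azure.core.credentials import AzureKeyCredential
from azure.eventgrid import EventGridPublisherClient, EventGridEvent

# Placeholder connection details
conn = pymysql.connect(host="mariadb-host", user="user",
                       password="password", database="mydb")
client = EventGridPublisherClient(
    "https://<your-topic>.<region>-1.eventgrid.azure.net/api/events",
    AzureKeyCredential("<topic-access-key>"))

last_checksum = None
while True:
    with conn.cursor() as cur:
        # CHECKSUM TABLE returns a (table_name, checksum) row
        cur.execute("CHECKSUM TABLE orders")
        checksum = cur.fetchone()[1]
    if last_checksum is not None and checksum != last_checksum:
        # Publishing this event fires the ADF custom-event trigger
        client.send(EventGridEvent(
            subject="mariadb/orders",
            event_type="MariaDB.TableChanged",
            data={"table": "orders"},
            data_version="1.0"))
    last_checksum = checksum
    time.sleep(60)  # poll every minute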
A second way is to use a Logic App. This is the best approach for your scenario.
Refer to this answer by Trent Tamura.
We have an Azure data warehouse (Azure Synapse) that will need to be consumed by read-only users around the world, and we would like to replicate the needed objects from the data warehouse, potentially to a Cosmos DB. Is this possible, and if so, what are the available options (transactional, merge, etc.)?
Synapse is mainly about bringing your data together for analysis. I don't think it has a direct export option of the kind you have described above.
However, what you can do is use Azure Stream Analytics, and then you should be able to integrate/stream whatever you want to any destination you need, like an app or a database and so on.
More details here - https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics
I think you can also pull the data into Power BI and perhaps set up some kind of automatic export from there.
More details here - https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi
We need to connect to an on-premises Teradata instance from Azure Databricks.
Is that possible at all?
If yes, please let me know how.
I was looking for this information as well, and I recently was able to access our Teradata instance from Databricks. Here is how I did it.
Step 1. Check your cloud connectivity.
%sh nc -vz 'jdbcHostname' 'jdbcPort'
- 'jdbcHostname' is your Teradata server.
- 'jdbcPort' is your Teradata server's listening port. By default, Teradata listens on TCP port 1025.
Also check out Databricks' best practices on connecting to other infrastructure.
Step 2. Install Teradata JDBC driver.
Teradata Downloads page provides JDBC drivers by version and archive type. You can also check the Teradata JDBC Driver Supported Platforms page to make sure you pick the right version of the driver.
Databricks offers multiple ways to install a JDBC library JAR for databases whose drivers are not available in Databricks. Please refer to the Databricks Libraries documentation to learn more and pick the approach that is right for you.
Once installed, you should see it listed in the Cluster details page under the Libraries tab.
terajdbc4.jar    dbfs:/workspace/libs/terajdbc4.jar
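As one hypothetical example, if the driver JAR is already on the driver node's local disk, you can copy it into DBFS from a notebook and then attach it to the cluster via the Libraries tab. Both paths below are placeholders.

# Copy the Teradata JDBC driver from local driver storage into DBFS
dbutils.fs.cp("file:/tmp/terajdbc4.jar", "dbfs:/workspace/libs/terajdbc4.jar")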
Step 3. Connect to Teradata from Databricks.
You can define some variables to create these connections programmatically. Since my instance required LDAP, I added LOGMECH=LDAP to the URL. Without LOGMECH=LDAP it returned a "username or password invalid" error message.
(Replace the placeholder values with the values from your environment.)
driver = "com.teradata.jdbc.TeraDriver"
url = "jdbc:teradata://Teradata_database_server/Database=Teradata_database_name,LOGMECH=LDAP"
table = "Teradata_schema.Teradata_tablename_or_viewname"
user = "your_username"
password = "your_password"
Now that the connection variables are specified, you can create a DataFrame. You can also explicitly set this to a particular schema if you already have one. Please refer to the Spark SQL Guide for more information.
Now, let’s create a DataFrame in Python.
My_remote_table = spark.read.format("jdbc") \
    .option("driver", driver) \
    .option("url", url) \
    .option("dbtable", table) \
    .option("user", user) \
    .option("password", password) \
    .load()
Now that the DataFrame is created, it can be queried. For instance, you can select particular columns and display them within Databricks.
display(My_remote_table.select("EXAMPLE_COLUMN"))
Step 4. Create a temporary view or a permanent table.
My_remote_table.createOrReplaceTempView("YOUR_TEMP_VIEW_NAME")
or
My_remote_table.write.format("parquet").saveAsTable("MY_PERMANENT_TABLE_NAME")
Steps 3 and 4 can also be combined if the intention is simply to create a table in Databricks from Teradata, as shown below. Check out the Databricks documentation SQL Databases Using JDBC for other options.
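For example, reading from Teradata and persisting the result can be chained into one expression, reusing the connection variables defined above (the target table name is a placeholder):

spark.read.format("jdbc") \
    .option("driver", driver) \
    .option("url", url) \
    .option("dbtable", table) \
    .option("user", user) \
    .option("password", password) \
    .load() \
    .write.format("parquet").saveAsTable("MY_PERMANENT_TABLE_NAME")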
Here is a link to the write-up I published on this topic.
Accessing Teradata from Databricks for Rapid Experimentation in Data Science and Analytics Projects
If you create a virtual network that can connect to on-prem, then you can deploy your Databricks instance into that VNet. See https://docs.azuredatabricks.net/administration-guide/cloud-configurations/azure/vnet-inject.html.
I assume that there is a Spark connector for Teradata. I haven't used it myself, but I'm sure one exists.
You can't. If you run Azure Databricks, all the data needs to be stored in Azure. But you can call the data using a REST API from Teradata and then save the data in Azure.
I would like to know if there's a way to connect to, or see, the database behind Alfresco Community.
Is there any way to access the Alfresco database?
The information necessary to connect to the database is stored in the file
tomcat/shared/classes/alfresco-global.properties
in your Alfresco installation directory. The relevant configuration looks like this (for PostgreSQL):
### database connection properties ###
db.driver=org.postgresql.Driver
db.username=alfresco
db.password=secret*password
db.name=alfresco
db.url=jdbc:postgresql://localhost/alfresco
This allows you to connect to the PostgreSQL database using a client of your choice.
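For example, with the configuration values above, a command-line connection might look like this (assuming the psql client is installed and the database accepts connections from your host):

psql -h localhost -U alfresco alfresco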
Edit: The Alfresco database is highly normalized. Some tables to start with are:
alf_store
alf_node
alf_node_properties
There are many foreign key constraints that reference other columns, for example:
alf_node.store_id references alf_store.id
alf_node_properties.node_id references alf_node.id
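To illustrate how these tables relate, here is a small sample query following the first foreign key above (a sketch only; the column selection is arbitrary):

SELECT n.id, n.uuid, s.protocol, s.identifier
FROM alf_node n
JOIN alf_store s ON n.store_id = s.id
LIMIT 10;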
If you want to get specific data from the database and don't know how, please ask a new question.
My source data is in Oracle and my target data is in Teradata. Can you please suggest an easy and quick way to validate the data? There are 900 tables. If possible, can you provide syntax too?
There is a product known as the Teradata Gateway that works with Oracle and allows you to access Teradata in a "heterogeneous" manner. This may not be the most effective way to compare the data, though.
Ultimately, your requirements sound more process-driven; to be done effectively, they would require the source data to be compared/validated as stage tables in the Teradata environment after your ETL/ELT process has completed.
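As a rough starting point, a quick (if coarse) check across many tables is a row-count comparison. Here is a minimal Python sketch, assuming the cx_Oracle and teradatasql driver packages and identical table names on both sides; the connection details and table list are placeholders, and for real validation you would extend the check (checksums, column aggregates).

import cx_Oracle
import teradatasql

# Placeholder table list - extend to cover all 900 tables
TABLES = ["SCHEMA1.TABLE1", "SCHEMA1.TABLE2"]

# Placeholder connection details
ora = cx_Oracle.connect("user/password@oracle-host/service_name")
td = teradatasql.connect(host="teradata-host", user="user", password="password")

ora_cur = ora.cursor()
td_cur = td.cursor()
for table in TABLES:
    # Table names come from a trusted list, so string concatenation is acceptable here
    ora_cur.execute("SELECT COUNT(*) FROM " + table)
    td_cur.execute("SELECT COUNT(*) FROM " + table)
    ora_count = ora_cur.fetchone()[0]
    td_count = td_cur.fetchone()[0]
    status = "OK" if ora_count == td_count else "MISMATCH"
    print(f"{table}: oracle={ora_count} teradata={td_count} {status}")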