How to incrementally refresh Azure Analysis Services with an Azure Synapse pool as the source. Does the "Default" processing option support incremental refresh? - azure-analysis-services

Cube: Tabular (Azure Analysis Services)
Source DB: Azure Synapse pool
Data operation: inserts and updates are happening on the source system
Issue:
Unable to get the latest data from the Synapse pool into the AAS cube using the "Default" processing option.
Clarification:
Which processing option (other than Full) should be used so that the tables in the cube have the latest data?
Does the "Default" processing option support incremental refresh on tables in the cube?

This question was asked/answered over on the Microsoft Q&A forums: https://learn.microsoft.com/en-us/answers/questions/251394/how-to-incrementally-refresh-azure-analysis-servic.html
To summarize:
Default processing does not refresh data that has already been loaded; it only processes tables and partitions that have not yet been processed.
Incremental processing isn't currently a supported feature in Azure Analysis Services. Instead, use partitions to segment the data so that you only need to reload the most recent changes, as in the sketch below.
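A minimal sketch of that partition-based approach, using the Azure Analysis Services asynchronous refresh REST API; the server, model, table, and partition names below are placeholders, and it assumes the fact table is already partitioned by date and that you have acquired an Azure AD bearer token:

import requests

server = "westeurope.asazure.windows.net/servers/myaasserver"  # placeholder region/server
model = "SalesModel"  # placeholder model name
token = "<azure-ad-access-token>"

body = {
    "Type": "Full",  # a full process, but scoped to just the current partition
    "CommitMode": "transactional",
    "Objects": [
        {"table": "FactSales", "partition": "FactSales_CurrentMonth"}  # placeholder names
    ],
}
resp = requests.post(
    f"https://{server}/models/{model}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
print(resp.headers.get("Location"))  # poll this URL for the refresh status

Only the named partition is reprocessed, so a nightly run touches the current month instead of the whole fact table.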

Related

Azure Synapse replicated to Cosmos DB?

We have an Azure data warehouse DB (Azure Synapse) that will need to be consumed by read-only users around the world, and we would like to replicate the needed objects from the data warehouse, potentially to a Cosmos DB. Is this possible, and if so, what are the available options (transactional, merge, etc.)?
Synapse is mainly about getting your data in for analysis. I don't think it has a direct export option of the kind you have described above.
However, what you can do is use Azure Stream Analytics, and then you should be able to integrate/stream whatever you want to any destination you need, like an app or a database and so on.
more details here - https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics
I think you can also pull the data into Power BI, and perhaps set up some kind of automatic export from there.
more details here - https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi

Kafka Connector for Oracle Database Source

I want to build a Kafka connector in order to retrieve records from a database in near real time. My database is Oracle Database 11g Enterprise Edition Release 11.2.0.3.0, and the tables have millions of records. First of all, I would like to place the minimum load on my database by using CDC. Secondly, I would like to retrieve records based on a LastUpdate field with a value after a certain date.
Searching the Confluent site, the only open-source connector I found was the "Kafka Connect JDBC" connector. I think this connector doesn't have a CDC mechanism, and it isn't possible to retrieve millions of records when the connector starts for the first time. The alternative solution I thought of is Debezium, but there is no Debezium Oracle connector on the Confluent site, and I believe it is in beta.
Which solution would you suggest? Is something wrong with my assumptions about Kafka Connect JDBC or the Debezium connector? Is there any other solution?
For query-based CDC, which is less efficient, you can use the JDBC source connector (a sketch follows below).
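A minimal sketch of registering such a connector through the Kafka Connect REST API, assuming timestamp mode keyed on the LastUpdate column; the connector name, connection details, and table name are placeholders:

import requests

connector = {
    "name": "oracle-jdbc-source",  # placeholder connector name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:oracle:thin:@//oracledb:1521/ORCL",  # placeholder
        "connection.user": "kafka_reader",
        "connection.password": "********",
        "mode": "timestamp",                    # query-based CDC driven by a timestamp column
        "timestamp.column.name": "LASTUPDATE",
        "table.whitelist": "MYSCHEMA.MYTABLE",  # placeholder table
        "topic.prefix": "oracle-",
        "poll.interval.ms": "5000",
    },
}
resp = requests.post("http://connect:8083/connectors", json=connector)
resp.raise_for_status()

Note that timestamp mode only sees rows whose LastUpdate is reliably maintained, and deletes are not captured, which is part of why query-based CDC is less efficient than log-based CDC.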
For log-based CDC I am aware of a couple of options; however, some of them require a license:
1) Attunity Replicate, which allows users to use a graphical interface to create real-time data pipelines from producer systems into Apache Kafka without having to do any manual coding or scripting. I have been using Attunity Replicate for Oracle -> Kafka for a couple of years and have been very satisfied.
2) Oracle GoldenGate, which requires a license.
3) Oracle LogMiner, which does not require any license and is used by both Attunity and kafka-connect-oracle, a Kafka source connector that captures row-based DML changes from an Oracle database and streams them to Kafka. Its change data capture logic is based on Oracle LogMiner.
We have numerous customers using IBM's IIDR (InfoSphere Data Replication) product to replicate data from Oracle databases (as well as Z mainframe, iSeries, SQL Server, etc.) into Kafka.
Regardless of which of the sources used, data can be normalized into one of many formats in Kafka. An example of an included, selectable format is...
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/tasks/kcopauditavrosinglerow.html
The solution is highly scalable and has been measured to replicate changes at hundreds of thousands of rows per second.
We also have a proprietary ability to reconstitute data written in parallel to Kafka back into its original source order. So, despite data having been written to numerous partitions and topics, the original total order can be known. This functionality is known as the TCC (transactionally consistent consumer).
See the video and slides here...
https://kafka-summit.org/sessions/exactly-once-replication-database-kafka-cloud/

report generation tool for zabbix

I am working on the Zabbix monitoring tool.
Could anyone advise whether there is a tool to generate reports?
Not out of the box, to my knowledge.
Zabbix is tricky because the MySQL backend history tables grow extremely fast and don't have primary keys. Our current history tables have 440+ million records, and we monitor 6,000 servers with Zabbix. A single table scan takes 40 minutes on the active server.
So your challenge can be split into three smaller challenges:
History
Denormalization is key, because joins don't work on huge history tables: you would have to join the history, items, functions, triggers, and hosts tables.
Besides that, you will want to evaluate global and host macros and replace {ITEM.VALUE} and {HOST.NAME} in trigger and item names/descriptions.
By the way, there is an experimental version of Zabbix that uses Elasticsearch for keeping history, which makes it possible to sort and select item values by interval: Zabbix using Elasticsearch for History Tables.
My approach is to generate structures like the following for every Zabbix record from the history tables and dump them to the document database (a sketch of the unbuffered read follows the sample record). Make sure you don't use buffered cursors.
{'dns_name': '',
'event_clock': 1512501556,
'event_tstamp': '2017-12-05 19:19:16',
'event_value': 1,
'host_id': 10084,
'host_name': 'Zabbix Server',
'ip_address': '10.13.37.82',
'item_id': 37335,
'item_key': 'nca.test.backsync.multiple',
'item_name': 'BackSync - Test - Multiple',
'trig_chg_clock': 1512502800,
'trig_chg_tstamp': '2017-12-05 19:40:00',
'trig_id': 17206,
'trig_name': 'BackSync - TEST - Multiple - Please Ignore',
'trig_prio': 'Average',
'trig_value': 'OK'
}
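A minimal sketch of that unbuffered read, assuming a MySQL backend and the pymysql driver; the connection details and the simplified query are placeholders for your own denormalizing query:

import pymysql
import pymysql.cursors

conn = pymysql.connect(
    host="zabbix-db", user="report", password="********", database="zabbix",
    cursorclass=pymysql.cursors.SSCursor,  # server-side cursor: rows stream in instead of being buffered client-side
)
with conn.cursor() as cur:
    cur.execute("SELECT itemid, clock, value FROM history WHERE clock >= %s", (1512501556,))
    for itemid, clock, value in cur:
        record = {"item_id": itemid, "event_clock": clock, "event_value": value}
        # ...enrich with host/item/trigger fields as above and dump to the document DB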
Current Values
The Zabbix APIs are documented pretty well, and JSON is handy for dumping a structure like the one proposed for the history. Don't expect the Zabbix APIs to return more than about 500 metrics/second; we currently pull 350 metrics/second.
And finally, reporting... there are many options, but you have to integrate them yourself:
Jasper
Kibana (Elasticsearch)
Tableau
Operations Bridge Reporter (Vertica)
..
JasperReports - IMHO a good "framework" for reports:
connect it with a SQL data connector => you have to be familiar with the SQL structure of your Zabbix DB
a more generic solution would be a Zabbix API data connector for JasperReports, but you would have to code that data connector yourself, because it doesn't exist
You can use the Event API to export data for reporting (a sketch follows below).
Quoting from the main reference page:
Events: Retrieve events generated by triggers, network discovery and other Zabbix systems for more flexible situation management or third-party tool integration.
Additionally, if you've set up IT Services & SLA, you can use the Service API to extract service availability percentages.
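A minimal sketch of pulling events over the Zabbix JSON-RPC API; the URL and credentials are placeholders, and login parameter names (e.g. "user" vs. "username") vary between Zabbix versions:

import requests

url = "http://zabbix.example.com/api_jsonrpc.php"  # placeholder

def rpc(method, params, auth=None):
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": 1}
    if auth is not None:
        payload["auth"] = auth
    r = requests.post(url, json=payload)
    r.raise_for_status()
    return r.json()["result"]

token = rpc("user.login", {"user": "report", "password": "********"})
events = rpc("event.get", {
    "output": "extend",
    "time_from": 1512501556,  # epoch bounds of the reporting window
    "time_till": 1512502800,
    "sortfield": ["clock"],
}, auth=token)

The resulting JSON can then be fed straight into whichever reporting tool from the list above you settle on.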

Publish webapp to Azure as student

Alright, so I have a Microsoft Imagine account from school, through which I've gotten both Azure and Microsoft Visual Studio 2017 in order to learn ASP.NET (I worked with Django earlier).
So I've gone through a whole bunch of tutorials, from Code School to Virtual Academy to docs.microsoft.com, and finally got the first version of my webapp done and ready to be published to Azure.
So I looked through the steps on how to publish; here's some info on that:
Subscription: Microsoft Imagine
Resource Group: <name> (northeurope)
App Service Plan:
Resource Group: <name>
Pricing Tier: Free
Location: North Europe
Status: Ready
Subscription Name: Microsoft Imagine
Click on "Explore additional azure services" (as per many tutorial instructions) and add a database, I've fortunately already created the database in Azure so I only have to connect it. Here's some info on the database (though creating it directly here generates the same error):
Resource Group: <name>
Status: Online
Location: North Europe
Subscription Name: Microsoft Imagine
Server Name: <servername>.database.windows.net
Pricing Tier: Free (5 DTUs)
Some info on the server:
Resource Group: <name>
Status: Available
Location: North Europe
So everything looks really good; I'm ready to publish and I hit the Create button.
Deploying: (step 0 out of 5) ...
Deploying: (step 4 out of 5) ...
ERROR
Details:
Template deployment failed. Deployment operation statuses:
Succeeded: /subscriptions/ ... /servers/mintentadbserver ()
Failed: /subscriptions/ ... /databases/Mintenta_db ()
40619: The edition 'Free' does not support the database data max size '1073741824'.
Succeeded: /subscriptions/ ... /firewallrules/AllowAllAzureIPs ()
Succeeded: /subscriptions/ ... /sites/MinTenta ()
Succeeded: /subscriptions/ ... /config/connectionstrings ()
The few duplicate questions I've found on this have close to no answers and just a few suggestions to upgrade (link1, link2).
So I suppose my question is, like many others:
1) How do you change the size of the database?
2) If that's not possible and you cannot have a database with a free account, why not just say that instead of using size restrictions?
I know this question is a little bit old, but I've just run across the same error, and I also couldn't find an answer. However, I managed to work around this issue.
I was following this tutorial (https://learn.microsoft.com/en-us/azure/app-service/app-service-web-tutorial-dotnet-sqldatabase) from Microsoft, and since you mentioned the same steps and the same message error I got, I'm assuming you were doing the same thing or at least something similar.
When publishing directly from Visual Studio 2017 to Azure, VS tries to create the following resources:
App service plan
App service
SQL server
SQL database
From your error message (and mine as well), although the SQL database creation had an error, the other resources were published successfully. So, if you access Azure portal, you'll see those resources there.
Then, if you open the SQL server and click "New database", you'll be able to add a database manually - and more importantly, you'll be able to select the free option with max size of 32MB.
(When I tried it, the "New database" button was disabled because I had already added one database - I believe this is another limitation of the student subscription.)
Note that if you add the database manually, you'll also need to configure your connection strings. But that is quite easy:
Open your new database on Azure portal
Go to Settings > Connection Strings
Copy the connection string from there
Now open your App service and go to Settings > Application Settings
On Connection Strings, add a new one or edit the existing one, pasting the content that you just copied from the DB (don't forget to input your username and password)
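For reference, the connection string you copy from the portal looks roughly like the following; the server, database, and credentials are placeholders that you fill in yourself:

Server=tcp:<servername>.database.windows.net,1433;Initial Catalog=<dbname>;User ID=<username>;Password=<password>;Encrypt=True;Connection Timeout=30;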
You can have a DB using a trial (there are no restrictions on a trial account as far as I'm aware, well, except money). I'm not sure how to work around this issue, as the template is pre-built by VS.
The more I look at this error, the less I get it. There is no "Free" tier of Azure SQL DB, and the cheapest (Basic) supports up to a 2 GB database, so this doesn't really restrict you.
Try setting the App Service plan to Shared? If that doesn't help, try deleting everything and letting VS create all the resources for you; it should work in that case.

Can I read and write to a SQLite database concurrently from multiple connections?

I have a SQLite database that is used by two processes. I am wondering, with the most recent version of SQLite, while one process (connection) has started a transaction to write to the database, will the other process be able to read from the database simultaneously?
I collected information from various sources, mostly from sqlite.org, and put them together:
First, by default, multiple processes can have the same SQLite database open at the same time, and several read accesses can be satisfied in parallel.
When writing, a single write locks the database for a short time; during that window nothing, not even a read, can access the database file at all.
Beginning with version 3.7.0, a new “Write Ahead Logging” (WAL) option is available, in which reading and writing can proceed concurrently.
By default, WAL is not enabled. To turn WAL on, refer to the SQLite documentation.
SQLite3 explicitly allows multiple connections:
(5) Can multiple applications or multiple instances of the same application access a single database file at the same time?
Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.
For sharing connections, use SQLite3 shared cache:
Starting with version 3.3.0, SQLite includes a special "shared-cache" mode (disabled by default).
In version 3.5.0, shared-cache mode was modified so that the same cache can be shared across an entire process rather than just within a single thread.
5.0 Enabling Shared-Cache Mode
Shared-cache mode is enabled on a per-process basis. Using the C interface, the following API can be used to globally enable or disable shared-cache mode:
int sqlite3_enable_shared_cache(int);
Each call to sqlite3_enable_shared_cache() affects subsequent database connections created using sqlite3_open(), sqlite3_open16(), or sqlite3_open_v2(). Database connections that already exist are unaffected. Each call to sqlite3_enable_shared_cache() overrides all previous calls within the same process.
I had a code architecture similar to yours. I used a single SQLite database which process A read from while process B wrote to it concurrently, based on events (in Python 3.10.2, using the most up-to-date sqlite3 version). Process B was continually updating the database, while process A was reading from it to check data. My issue was that it worked in debug mode, but not in "release" mode.
In order to solve my particular problem I used Write Ahead Logging, which is referenced in previous answers. After creating my database in Process B (write mode) I added the line:
cur.execute('PRAGMA journal_mode=wal') where cur is the cursor object created from establishing connection.
This sets the journal to WAL mode, which allows concurrent access for multiple readers (but still only one writer). In Process A, where I was reading the data, I included the following before connecting to the same database:
time.sleep(0.5)
Setting a sleep timer before a connection was made to the same database fixed my issue with it not working in "release" mode.
In my case, I did not have to manually set any checkpoints, locks, or transactions. Your use case might be different from mine, however, so further research may be required. Nevertheless, I hope this post helps and saves everyone some time! A compact sketch of the setup is below.
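A minimal sketch of the WAL setup described above, assuming the writer and reader run as separate processes against the same file; 'app.db' and the events table are placeholders:

import sqlite3

# Writer process: enable WAL once; the mode persists in the database file.
con_w = sqlite3.connect("app.db")
con_w.execute("PRAGMA journal_mode=wal")
con_w.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, value TEXT)")
con_w.execute("INSERT INTO events (value) VALUES (?)", ("hello",))
con_w.commit()

# Reader process: under WAL this SELECT can run while the writer is mid-transaction.
con_r = sqlite3.connect("app.db")
rows = con_r.execute("SELECT id, value FROM events").fetchall()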