Table Size in ADX - azure-data-explorer

I am trying to get the size of every table in Azure Data Explorer (ADX).
Is there a single query, or a metadata table, that holds size metadata for all tables?
I can see the data for one table using the query below:
.show table dev_adls_la_parsed extents;
let tbl_size = $command_results
| summarize num = sum(ExtentSize) by DatabaseName, TableName
| extend SizeinGB = format_bytes(num, 2)
| project DatabaseName, TableName, SizeinGB;
tbl_size
| project DatabaseName, TableName, SizeinGB;
Output:
Using the query below, I am trying to store the results in a table for better visibility:
.create table adx_tables_space(databaseName:string, tableName:string, SizeinGB:string)
.show table dev_adls_la_parsed extents;
let tbl_size = $command_results
| summarize num = sum(ExtentSize) by DatabaseName, TableName
| extend SizeinGB = format_bytes(num, 2)
| project DatabaseName, TableName, SizeinGB;
.set-or-append adx_tables_space <|
tbl_size
| project DatabaseName, TableName, SizeinGB;
It throws the following error:
Syntax Error
A recognition error occurred.
Token: .
Line: 12, Position: 0
clientRequestId: KustoWebV2;xxxxxxxxxxxxxxxxxxxxxxx

You can use the .show database extents command. Note that this command is undocumented and might be changed/deprecated in the future.
.show database extents
| summarize Extents = count(),
            RowCount = sum(RowCount),
            OriginalSize = format_bytes(sum(OriginalSize), 2),
            ExtentSize = format_bytes(sum(ExtentSize), 2),
            CompressedSize = format_bytes(sum(CompressedSize), 2),
            IndexSize = format_bytes(sum(IndexSize), 2)
  by TableName
| order by RowCount
| TableName         | Extents | RowCount   | OriginalSize | ExtentSize | CompressedSize | IndexSize |
| Trips             | 100     | 1547471776 | 475.79 GB    | 100.3 GB   | 78.95 GB       | 21.35 GB  |
| FHV_Trips         | 34      | 514304551  | 37.91 GB     | 5.92 GB    | 5.78 GB        | 146.13 MB |
| nyc_taxi          | 11      | 165114361  | 25.29 GB     | 7.43 GB    | 7.34 GB        | 95.35 MB  |
| GeoRegions        | 1       | 5139969    | 250.35 MB    | 18.79 MB   | 12.94 MB       | 5.85 MB   |
| demo_many_series1 | 1       | 2177472    | 153.7 MB     | 12.21 MB   | 9.01 MB        | 3.21 MB   |
...
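If relying on an undocumented command is a concern, a documented alternative worth checking is .show tables details, which returns per-table size columns in a single call. A minimal sketch (the column names are taken from the standard schema, so verify them against your cluster):
.show tables details
| project TableName, TotalRowCount,
          ExtentSize = format_bytes(TotalExtentSize, 2),
          OriginalSize = format_bytes(TotalOriginalSize, 2)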

Related

How to fix mariadb replication: Could not execute Delete_rows_v1 event on table db.tableName; Index for table './db/tableName.MYI'

We have 2 MariaDB servers in replication (master-slave). Both servers were turned off unexpectedly. When the MariaDB servers came back online, the MyISAM tables were checked on db1 and db2:
| 85 | db | ip:55336 | db | Query | 4398 | Checking table | tableName | 0.000 |
I changed the slave to read the master binary log from a new file and new position (I think there was no replication lag), but when I start the slave on db2 I get a replication error:
Could not execute Delete_rows_v1 event on table db.tableName; Index for table './db/tableName.MYI' is corrupt; try to repair it, Error_code: 126; handler error HA_ERR_WRONG_IN_RECORD; the event's master log binlog-file-01, end_log_pos 8980
Can you help me fix this?
We deleted all rows in this table on db1. Should I also remove all rows for this table on db2 and then skip the replication steps connected with that delete? There were a lot of these rows.
Additionally:
On DB2:
Exec_Master_Log_Pos: 810
When I read event logs for this file on db1:
MariaDB [(none)]> SHOW BINLOG EVENTS IN 'binlog-file-01' from 810 limit 5;
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
| binlog-file-01 | 810 | Gtid | 1 | 852 | BEGIN GTID 0-1-5630806796 |
| binlog-file-01 | 852 | Annotate_rows | 1 | 908 | DELETE FROM tableName |
| binlog-file-01 | 908 | Table_map | 1 | 993 | table_id: 107 (db.tableName) |
| binlog-file-01 | 993 | Delete_rows_v1 | 1 | 8980 | table_id: 107 |
| binlog-file-01 | 8980 | Delete_rows_v1 | 1 | 17006 | table_id: 107 |
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
5 rows in set (0.02 sec)
Can I just skip this on replication?
'./db/tableName.MYI' indicates a corrupt index. If CHECK TABLE db.tableName didn't resolve this, you can try to:
drop and recreate the indexes on this table on the replica, or
truncate table tableName (since the binary log appears to contain a DELETE FROM tableName without a WHERE clause).
If the replication still fails on this binary log entry, you can skip it with:
SET GLOBAL sql_slave_skip_counter=1
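For example, here is a minimal sketch of that recovery sequence on the replica (db2), assuming the corrupt table is db.tableName and that losing its rows is acceptable because the failing event is a DELETE without a WHERE clause:
-- run on the replica (db2); a sketch, not a verified procedure
STOP SLAVE;
REPAIR TABLE db.tableName;              -- or: TRUNCATE TABLE db.tableName;
SET GLOBAL sql_slave_skip_counter = 1;  -- only if the Delete_rows_v1 event still fails
START SLAVE;
SHOW SLAVE STATUS\G                     -- confirm Slave_IO_Running / Slave_SQL_Running are Yes again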
I recommend changing your MyISAM tables to Aria or InnoDB so they are crash-safe in the case of a power failure. Also set sync_binlog=1 for durability.

ERROR MY-011542 Repl Plugin group_replication reported: 'Table repl_test does not have any PRIMARY KEY

I recently installed an InnoDB Cluster and am trying to create a table without any primary key or equivalent, to test the clustered index concept where "InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values if the table has no PRIMARY KEY or suitable UNIQUE index".
I created the table as below:
create table repl_test (Name varchar(10));
Checked for the creation of GEN_CLUST_INDEX:
select * from mysql.innodb_index_stats where database_name='test' and table_name = 'repl_test';
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| test | repl_test | GEN_CLUST_INDEX | 2019-02-22 06:29:26 | n_diff_pfx01 | 0 | 1 | DB_ROW_ID |
| test | repl_test | GEN_CLUST_INDEX | 2019-02-22 06:29:26 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index |
| test | repl_test | GEN_CLUST_INDEX | 2019-02-22 06:29:26 | size | 1 | NULL | Number of pages in the index |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
3 rows in set (0.00 sec)
But when I try to insert a row, I get the error below:
insert into repl_test values ('John');
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
2019-02-22T14:32:53.177700Z 594022 [ERROR] [MY-011542] [Repl] Plugin group_replication reported: 'Table repl_test does not have any PRIMARY KEY. This is not compatible with Group Replication.'
Below is my conf file:
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld_safe]
socket = /tmp/mysql.sock
[mysqld]
socket = /tmp/mysql.sock
port = 3306
basedir = /mysql/product/8.0/TEST
datadir = /mysql/data/TEST/innodb_data
log-error = /mysql/admin/TEST/innodb_logs/mysql.log
log_bin = /mysql/binlog/TEST/innodb_logs/mysql-bin
server-id=1
max_connections = 500
open_files_limit = 65535
expire_logs_days = 15
innodb_flush_log_at_trx_commit=1
sync_binlog=1
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
binlog_checksum = NONE
enforce_gtid_consistency = ON
gtid_mode = ON
relay-log=<<hostname>>-relay-bin
My MySQL version: mysql Ver 8.0.11 for linux-glibc2.12 on x86_64 (MySQL Community Server - GPL)
Auto-generation of the PK works fine for non-clustered setups. However, InnoDB Cluster (Group Replication) and Galera need a user-defined PK (or a NOT NULL UNIQUE key that can be promoted to one).
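For example, one way to make this test table acceptable to Group Replication is to add an explicit primary key. A minimal sketch (the id column name is only illustrative, not from the original post):
ALTER TABLE test.repl_test
    ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
-- the insert that previously failed should now succeed
INSERT INTO test.repl_test (Name) VALUES ('John');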
If you like, file a documentation bug with bugs.mysql.com complaining that this restriction is not clear. Be sure to point to the page that needs fixing.

Load data from R into PostgreSQL database without losing constraints

I'm trying to import a data frame from R into my PostgreSQL database. The table is defined as follows:
CREATE TABLE fact_citedpubs(
    citedpubs_id serial CONSTRAINT fact_citedpubs_pk PRIMARY KEY NOT NULL,
    originID integer REFERENCES dim_country(country_id),
    yearID integer REFERENCES dim_year(year_id),
    citecount double precision
);
In my data frame I have values for originID, yearID and citecount. My data frame looks like this:
| YEAR | GEO_DESC | OBS_VALUE |
| 8    | 1        | 13.29400  |
| 17   | 2        | 4.42005   |
| 17   | 1        | 12.95001  |
| 15   | 1        | 11.61365  |
| 14   | 1        | 13.48174  |
To import this data frame into the PostgreSQL database I use dbWriteTable(con, 'fact_citedpubs', citations, overwrite = TRUE).
Because overwrite = TRUE is used, PostgreSQL drops all the previously defined constraints (primary keys, foreign keys and data types). Is there any other way to import data into a PostgreSQL database from R while keeping the constraints that were set in advance?
Many thanks!
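One approach worth trying is to append into the already-created table instead of overwriting it, so the table definition and its constraints are left untouched. A minimal sketch, assuming the DBI/RPostgres packages and that the data frame columns are renamed to match the table's columns (the connection details and column mapping below are only illustrative):
library(DBI)

# connect to the existing database (connection details are placeholders)
con <- dbConnect(RPostgres::Postgres(), dbname = "mydb")

# rename the data frame columns to match the target table
names(citations) <- c("yearid", "originid", "citecount")

# append = TRUE inserts into the existing table instead of recreating it,
# so primary keys, foreign keys and column types stay in place
dbWriteTable(con, "fact_citedpubs", citations, append = TRUE, row.names = FALSE)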

Avoid running a function multiple times in a query

I have the following query in Application Insights where I run the parsejson function multiple times in the same query.
Is it possible to reuse the data from the parsejson() function after the first invocation? Right now I call it three times in the query. I am trying to see if calling it just once might be more efficient.
EventLogs
| where Timestamp > ago(1h)
and tostring(parsejson(tostring(Data.JsonLog)).LogId) =~ '567890'
| project Timestamp,
fileSize = toint(parsejson(tostring(Data.JsonLog)).fileSize),
pageCount = tostring(parsejson(tostring(Data.JsonLog)).pageCount)
| limit 10
You can use extend for that:
EventLogs
| where Timestamp > ago(1h)
| extend JsonLog = parsejson(tostring(Data.JsonLog))
| where tostring(JsonLog.LogId) =~ '567890'
| project Timestamp,
fileSize = toint(JsonLog.fileSize),
pageCount = tostring(JsonLog.pageCount)
| limit 10

While querying a Riak timeseries database I am getting an SQL parser error

I am getting this error:
{0,riak_ql_parser, <<"Used group as a measure of time in 712903232group. Only s, m, h and d are allowed.
My Query:
select memberId,COUNT(memberId) from Emp18 where start>1478925732000 and start< 1478925939000 and end>1478913322000 and memberId<712903232 group by memberId;
but I do get a response with the following query:
select memberId,COUNT(memberId) from Emp18 where start>1478925732000 and start< 1478925939000 and end>1478913322000 and memberId<712903232;
and the output is:
+---------+-----+---------------+
|memberId |steps|COUNT(memberId)|
+---------+-----+---------------+
|712903230| 350 | 4 |
+---------+-----+---------------+
