Performance issue on CLOB column - Oracle 11g

I'm facing an issue with a table that has a CLOB column.
The table has 15 columns, one of which is a CLOB.
When I SELECT from the table excluding the CLOB column, the query takes only 15 minutes, but if I include that column the SELECT runs for 2 hours.
I have checked the plan and found that both queries, with and without the column, use the same plan.
---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                      | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                           |  330K|   61M| 147K   (1)| 00:29:34 |       |       |
|   1 |  PARTITION RANGE ALL                |                           |  330K|   61M| 147K   (1)| 00:29:34 |     1 |    50 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID | CC_CONSUMER_EV_PRFL       |  330K|   61M| 147K   (1)| 00:29:34 |     1 |    50 |
|*  3 |    INDEX RANGE SCAN                 | CC_CON_EV_P_EV_TYPE_BTIDX |  337K|      | 811    (1)| 00:00:10 |     1 |    50 |
---------------------------------------------------------------------------------------------------------------------------
Below are the stats I collected:
+-----------------------------------------+---------------------+------------------+
| Statistic                               | Without CLOB column | With CLOB column |
+-----------------------------------------+---------------------+------------------+
| recursive calls                         | 0                   | 1                |
| db block gets                           | 0                   | 0                |
| consistent gets                         | 1374615             | 3131269          |
| physical reads                          | 103874              | 1042358          |
| redo size                               | 0                   | 0                |
| bytes sent via SQL*Net to client        | 449499347           | 3209044367       |
| bytes received via SQL*Net from client  | 1148445             | 1288482930       |
| SQL*Net roundtrips to/from client       | 104373              | 2215166          |
| sorts (memory)                          |                     |                  |
| sorts (disk)                            |                     |                  |
| rows processed                          | 1565567             | 1565567          |
+-----------------------------------------+---------------------+------------------+
I'm planning to try the following; is it worth doing?
1) Gather stats on the table and retry (see the sketch below)
2) Compress the table and retry
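For step 1, a minimal sketch (the schema name APP_OWNER is a placeholder; the table name is taken from the plan above):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',                      -- placeholder schema
    tabname          => 'CC_CONSUMER_EV_PRFL',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE);                            -- refresh index stats as well
END;
/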


How to fix mariadb replication: Could not execute Delete_rows_v1 event on table db.tableName; Index for table './db/tableName.MYI'

We have 2 MariaDB servers in replication (master-slave). Both servers were turned off unexpectedly. When the MariaDB servers came back online, the MyISAM tables were checked on db1 and db2:
| 85 | db | ip:55336 | db | Query | 4398 | Checking table | tableName | 0.000 |
I changed the slave to read the master binary log from a new file and new position (I think there was no replication lag), but when I start the slave on db2 I get a replication error:
Could not execute Delete_rows_v1 event on table db.tableName; Index for table './db/tableName.MYI' is corrupt; try to repair it, Error_code: 126; handler error HA_ERR_WRONG_IN_RECORD; the event's master log binlog-file-01, end_log_pos 8980
Can you help me fix it?
We deleted all rows in db1 for this table. Should I also remove all rows for this table on db2, and then skip the replication steps connected with that delete? There were a lot of these rows.
Additionally:
On DB2:
Exec_Master_Log_Pos: 810
When I read the event log for this file on db1:
MariaDB [(none)]> SHOW BINLOG EVENTS IN 'binlog-file-01' from 810 limit 5;
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
| binlog-file-01 | 810 | Gtid | 1 | 852 | BEGIN GTID 0-1-5630806796 |
| binlog-file-01 | 852 | Annotate_rows | 1 | 908 | DELETE FROM tableName |
| binlog-file-01 | 908 | Table_map | 1 | 993 | table_id: 107 (db.tableName) |
| binlog-file-01 | 993 | Delete_rows_v1 | 1 | 8980 | table_id: 107 |
| binlog-file-01 | 8980 | Delete_rows_v1 | 1 | 17006 | table_id: 107 |
+---------------------------+------+----------------+-----------+-------------+------------------------------------------------+
5 rows in set (0.02 sec)
Can I just skip this on replication?
./db/tableName.MYI indicates a corrupt index. If CHECK TABLE db.tableName didn't resolve this, you can try:
1) drop and recreate the indexes on this table on the replica
2) TRUNCATE TABLE tableName (as the binary log appears to be a DELETE FROM tableName without a WHERE clause)
If replication still fails on this binary log entry, you can skip it with:
SET GLOBAL sql_slave_skip_counter=1;
I recommend changing your MyISAM tables to Aria or InnoDB so they are crash-safe in the case of a power failure. Also set sync_binlog=1 for durability.
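Putting the steps together, a possible sequence on the replica (a sketch only; it assumes the table can simply be emptied because the failing events are an unfiltered DELETE):
-- run on db2 (the replica)
STOP SLAVE;
REPAIR TABLE db.tableName;               -- rebuild the corrupt MyISAM index
-- TRUNCATE TABLE db.tableName;          -- alternative: empty the table, matching the master's DELETE
SET GLOBAL sql_slave_skip_counter = 1;   -- only if the Delete_rows_v1 event still fails
START SLAVE;
SHOW SLAVE STATUS\G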

How is availability zone list order determined by the nova api in openstack?

I want to change the default option for availability zone in my openstack setup in horizon. However, I am having trouble finding out what determines the order of the availability zones as returned by the nova api. For example, running openstack availability zone list I get:
+--------------+-------------+
| Zone Name | Zone Status |
+--------------+-------------+
| zone2 | available |
| zone1 | available |
| internal | available |
| zone3 | available |
+--------------+-------------+
which is the same order as in horizon's dropdown box. However, querying the database directly, I get:
mysql> select * from aggregate_metadata;
+---------------------+------------+------------+----+--------------+-------------------+--------------+---------+
| created_at | updated_at | deleted_at | id | aggregate_id | key | value | deleted |
+---------------------+------------+------------+----+--------------+-------------------+--------------+---------+
| 2015-06-12 08:43:07 | NULL | NULL | 1 | 1 | availability_zone | zone1 | 0 |
| 2015-06-12 08:43:08 | NULL | NULL | 2 | 2 | availability_zone | zone2 | 0 |
| 2015-10-26 05:30:15 | NULL | NULL | 3 | 3 | availability_zone | zone3 | 0 |
+---------------------+------------+------------+----+--------------+-------------------+--------------+---------+
3 rows in set (0.00 sec)
Obviously, the openstack api is doing some sorting before returning the result... however, I can't figure out how it is being sorted nor how I could control the sorting.
get_availability_zones is the function the nova API uses to collect the list of availability zones.
This function gets the list of available services (which is sorted by id) and then adds the availability zone name to each of those services.
Since the service list is the first step, its id defines the order, not the zone name.
The sort order can be modified in different ways, depending on the requirement.
To sort at the frontend (horizon), modify this line to:
ng-options="zone.value as zone.label for zone in model.availabilityZones | orderBy:'value'"
To sort at the backend (nova-api), add available_zones.sort() and not_available_zones.sort() before the return statements in the get_availability_zones function.

What database schema to use for storing survey answers

I'm required to design a survey system for our customer.
It's based on ASP.NET, and the database used is Oracle.
I have no experience here, so I'd like to ask for advice about:
What database schema to use for storing user answers; I'm afraid my current design is likely to have performance issues...
About the survey:
There will be two or more surveys going on at the same time.
Surveys may be triggered once a year or more frequently, so I think I need a Survey Period table.
Surveys target different products, so there will be a mapping between products and surveys.
My current design:
Survey Category table
+------------+--------------+
| CategoryId | CategoryName |
+------------+--------------+
| 1          | cat1         |
| 2          | cat2         |
+------------+--------------+
Survey Category version table
+-----------+------------+--------------------+
| VersionId | CategoryId | VersionDescription |
+-----------+------------+--------------------+
| 1         | 1          | 'cat1 version1'    |
| 2         | 1          | 'cat1 version2'    |
| 3         | 2          | 'cat2 version1'    |
+-----------+------------+--------------------+
Survey Period table
+----------+-------------------+
| PeriodId | PeriodDescription |
+----------+-------------------+
| 1        | 'cat1 period2016' |
| 2        | 'cat1 period2017' |
| 3        | 'cat2 period2016' |
+----------+-------------------+
Survey Period-Version map table
+----------+-----------+
| PeriodId | VersionId |
+----------+-----------+
| 1        | 1         |
| 1        | 2         |
| 2        | 1         |
| 3        | 3         |
+----------+-----------+
A Version-Question map table
+-----------+------------+
| VersionId | QuestionId |
+-----------+------------+
| 1         | 1          |
| 1         | 2          |
| 1         | 3          |
| 2         | 1          |
| 2         | 2          |
| 3         | 1          |
+-----------+------------+
A Version-Product map table
+-----------+-----------+
| VersionId | ProductId |
+-----------+-----------+
| 1         | 'prodA'   |
| 1         | 'prodB'   |
| 1         | 'prodC'   |
| 2         | 'prodA'   |
+-----------+-----------+
And to store the survey result data, I have to duplicate a lot of information across rows:
User Answer table
+----------+------------+----------+-----------+-----------+--------+-----------+
| AnswerId | QuestionId | PeriodId | UserId/Ip | ProductId | Answer | VersionId |
+----------+------------+----------+-----------+-----------+--------+-----------+
| 1        | 1          | 1        | 'adam'    | 'prodA'   | 'Yes'  | 2         |
| 2        | 2          | 1        | 'Joe'     | 'prodA'   | 'Yes'  | 2         |
| 3        | 1          | 2        | 'adam'    | 'prodB'   | 'A'    | 3         |
+----------+------------+----------+-----------+-----------+--------+-----------+
We're expecting tens of products and thousands of users for this system.
So assume 30 products, 5000 users, 50 questions per survey, and 4 surveys per year:
with the current design, there will be 5000 * 4 * 50 * 30 = 30 million records added to the User Answer table per year.
I'm really afraid it won't work properly, so are there any suggestions for optimizing this?
Edit 1:
Added a VersionId column to the user answer table as suggested.
This looks like a case of premature optimization. You should probably worry more about correctness and flexibility than performance.
30 million rows per year, especially in these skinny tables, is a small amount of data for any Oracle system. Don't worry too much about indexes and partitioning yet, those can be added later if necessary.
Your solution is similar to the Entity Attribute Value (EAV) model. It's worth knowing that term since much has been written about it. There are 2 common problems with EAV models you want to avoid:
Avoid extremes. Don't use EAV for everything, but don't completely avoid it either. EAV is slow and inconvenient compared to a normal table structure. It should not be used for every interesting column, otherwise you have created a database within a database. For example, if virtually every survey has fields like a username and a date created, store those as regular columns and not in a generic column. It's OK to have a column that is only populated 99% of the time. On the other hand, it's a bad idea to always avoid EAV and try to hack something together with 1,000-column tables or object-relational types.
Always use the correct type. Always, always, always store data as the correct type. Store numbers as numbers, dates as dates, and strings as strings. Your queries will be easier, faster, and safer, if you have at least three columns for the data: ANSWER_NUMBER, ANSWER_STRING, ANSWER_DATE. I explain the type safety problem more in this answer. Those extra columns may look bad in the model diagram, but they are a life-saver when you're querying the data.
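For example, a rough sketch of the answer table with typed answer columns (table and column names are illustrative, not a finished design):
CREATE TABLE survey_answer (
  answer_id     NUMBER          PRIMARY KEY,
  version_id    NUMBER          NOT NULL,
  period_id     NUMBER          NOT NULL,
  question_id   NUMBER          NOT NULL,
  product_id    VARCHAR2(30)    NOT NULL,
  user_id       VARCHAR2(30)    NOT NULL,
  answer_number NUMBER,          -- numeric answers
  answer_string VARCHAR2(4000),  -- string/choice answers
  answer_date   DATE             -- date answers
);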

How to make a query that returns the rows with the latest time column value

Below is my sample data; I would like to get the host/value pair with the latest time.
+------+-------+-------+
| HOST | VALUE | TIME  |
+------+-------+-------+
| A    | 100   | 13:40 |
| A    | 150   | 13:00 |
| A    | 222   | 13:23 |
| B    | 210   | 13:55 |
| B    | 300   | 13:44 |
+------+-------+-------+
I want to get only the rows with the latest time value for each host. The result should be like:
A 100 13:40
B 210 13:55
I think there are several analytic functions that could achieve this in Oracle, but I'm not sure what I can do in SQLite.
Can you let me know how I can write such a query?
Here is an ANSI-compliant way of performing your query which should run on all versions of SQLite. For a potentially shorter solution see the answer by #CL.
SELECT t1.HOST || '-' || t1.VALUE || '-' || t1.TIME AS HOSTVALUETIME
FROM YourTable t1
INNER JOIN
(
    SELECT HOST, MAX(TIME) AS MAXTIME
    FROM YourTable
    GROUP BY HOST
) t2
ON t1.HOST = t2.HOST AND t1.TIME = t2.MAXTIME
ORDER BY t1.HOST
Output:
+---------------+
| HOSTVALUETIME |
+---------------+
| A-100-13:40   |
| B-210-13:55   |
+---------------+
In SQLite 3.7.11 or later, when MAX() is used, the other selected column values are taken from the row that contains the maximum:
SELECT Host,
       Value,
       MAX(Time)
FROM TheNameOfThisTableIsSoSecretThatICantTellYou
GROUP BY Host;
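If your SQLite is 3.25 or later, a window function is another option; a sketch, again with a placeholder table name:
SELECT Host, Value, Time
FROM (
    SELECT Host, Value, Time,
           ROW_NUMBER() OVER (PARTITION BY Host ORDER BY Time DESC) AS rn
    FROM YourTable
)
WHERE rn = 1;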

SQLite3: dynamic between query

I have this sqlite3 table (simplified):
+-------+----------+-------+
| ROUTE | WPNumber | WPID  |
+-------+----------+-------+
| A123  | 1        | WP001 |
| A123  | 2        | WP002 |
| A123  | 3        | WP003 |
| [...] | [...]    | [...] |
| A123  | 20       | WP020 |
+-------+----------+-------+
Let's say I want to travel this route in the reverse direction (020 to 001).
How do I get all the WPIDs in between? I know it is possible to build a query using BETWEEN and DESC, but then I'd have to build two separate queries and have Python check when to use which query. Is it possible to have sqlite3 do the work, independent of the direction (reverse or not)?
You can reverse the sort order by flipping the sign of the number used in the ORDER BY clause.
Set the parameter ? to either 1 or -1:
SELECT WPID
FROM ThisTable
WHERE ROUTE = 'A123'
ORDER BY WPNumber * ?
If you instead used a similar query with DESC, the database would have a better opportunity to optimize the sorting with an index.
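For reference, that index-friendly alternative would be two fixed queries, with the application choosing between them (table name as in the query above):
SELECT WPID FROM ThisTable WHERE ROUTE = 'A123' ORDER BY WPNumber;       -- forward (001 to 020)
SELECT WPID FROM ThisTable WHERE ROUTE = 'A123' ORDER BY WPNumber DESC;  -- reverse (020 to 001)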
