This question concerns Oracle DB, though general answers are welcome too; I am setting aside information about Derby/MySQL and other databases on this subject.
Let's say I have several queries using the following columns in their WHERE clauses:
Column | Cardinality | Selectivity
_______|_____________|____________
A      | low         | low
B      | high        | low
C      | low         | low
D      | high        | high
E      | low         | low
F      | low         | low
-- Queries
SELECT * FROM T WHERE A=? AND B=?
SELECT * FROM T WHERE A=? AND B=? AND C=?
SELECT * FROM T WHERE A=? AND C=?
SELECT * FROM T WHERE A=? AND C=? AND D=?
SELECT * FROM T WHERE A=? AND E=? AND F=?
Is there any benefit in combining these columns (taking the cardinality mix into account) into composite indexes? If so, what is the logic to follow?
I have understood this explanation, but it is for SQL Server and Oracle may behave differently.
Is it worthwhile to do covering indexes instead of individual small composite indexes?
Does the column order of composite indexes matter? For example:
-- Regardless of the column order at table creation.
CREATE INDEX NDX_1 ON T (A, C);
-- Versus:
CREATE INDEX NDX_1 ON T (C, A);
Would this index be useful?
CREATE INDEX NDX_2 ON T(E, F); -- (low + low) Ignoring 'A' column.
A few things, and bear in mind these are generalities:
Generally you can only use the leading parts of an index. So, looking at your examples: if you have an index on (A, B, C) and you have predicates on A and C, then only the leading A part of the index can be used. There are some cases where a non-leading part of an index can be used; you will see this in an execution plan as a SKIP SCAN operation, but these are often sub-optimal. So you may want to have both (A, C) and (C, A).
A covering index can be useful if you are not projecting columns other than those in the index.
Again generally, you do not usually want or need an index if the column has low selectivity. However, it's possible that you have two columns that individually have low selectivity, but have high selectivity when used in combination. (In fact, this is the premise of a bitmap index / star transformation in a dimensional model).
If a multi-column index is useful you may want to put the column with the lowest selectivity first and enable index compression. Index compression can save a huge amount of space in some cases and has very little CPU overhead.
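For example, a sketch of what that could look like in Oracle (the index name and choice of columns are illustrative):
CREATE INDEX ndx_ca ON T (C, A) COMPRESS 1;  -- low-selectivity C first, with prefix compression on the leading column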
Finally, a SQL Monitor report will help you optimize a SQL statement when it comes to actually running it.
The minimum number of indexes to optimally handle all 5 cases:
(A, B, C) -- in exactly this order
(A, C, D) -- in exactly this order
(A, E, F) -- in any order
If you add another SELECT, all bets are off.
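In DDL form, that would be something along these lines (index names are illustrative):
CREATE INDEX ndx_abc ON T (A, B, C);
CREATE INDEX ndx_acd ON T (A, C, D);
CREATE INDEX ndx_aef ON T (A, E, F);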
When to have (A, C) and (C, A)?...
Each handles the case where only the first column is being used.
The former is optimal for WHERE A=1 AND C>5; the latter is not. (Etc) Note: = versus some kind of "range" test matters.
When designing indexes for a table, first write out all the queries.
More discussion: Higher cardinality column first in an index when involving a range?
I'm not a user of SPARK. I'm just trying to understand the capabilities of the language.
Can SPARK be used to prove, for example, that Quicksort actually sorts the array given to it?
(Would love to see an example, assuming this is simple)
Yes, it can, though I'm not particularly good at SPARK-proving (yet). Here's how quick-sort works:
We note that the idea behind quicksort is partitioning.
A 'pivot' is selected and this is used to partition the collection into three groups: equal-to, less-than, and greater-than. (This ordering impacts the procedure below; I'm using this because it's different than the in-order version to illustrate that it is primarily about grouping, not ordering.)
If the collection is 0 or 1 in length, it is already sorted; if it is 2, check and possibly correct the ordering and it is sorted; otherwise continue on.
Move the pivot to the first position.
Scan from the second position to the last position; depending on the value under consideration:
Less – Swap with the first item in the Greater partition.
Greater – Null-op.
Equal – Swap with the first item of Less, then swap with the first item of Greater.
Recursively call on the Less & Greater partitions.
If it is a function, return Less & Equal & Greater; if it is a procedure, rearrange the in out parameter into that ordering.
Here's how you would go about doing things:
Prove/assert the 0 and 1 cases as true,
Prove your handling of 2 items,
Prove that given an input collection and a pivot, there is a set of three values (L,E,G) which are the counts of the elements less-than/equal-to/greater-than the pivot [this is probably a ghost subprogram],
Prove that L+E+G equals the length of your collection,
Prove [in the post-condition] that given the pivot and (L,E,G) tuple, the output conforms to L items less-than the pivot followed by E items which are equal, and then G items that are greater.
And that should do it. [IIUC]
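For concreteness, here is a minimal, hypothetical sketch of the specification side in SPARK. The package and names are illustrative, and a real proof would also need a permutation post-condition (e.g. via the L/E/G counts above):
package Sorting with SPARK_Mode is

   type Int_Array is array (Positive range <>) of Integer;

   --  Ghost expression function: True when A is in non-decreasing order.
   function Is_Sorted (A : Int_Array) return Boolean is
     (for all I in A'First .. A'Last - 1 => A (I) <= A (I + 1))
   with Ghost;

   --  The proof obligation: after the call, the array is sorted.
   --  (Incomplete on its own: nothing yet says the output is a
   --  permutation of the input.)
   procedure Quick_Sort (A : in out Int_Array)
   with Post => Is_Sorted (A);

end Sorting;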
InnoDB organizes its data in B+ trees. The height of the tree affects the number of I/O operations required, which may be one of the main reasons the DB slows down.
So my question is how to predict or calculate the height of the B+ tree (e.g. based on the number of pages, which can be calculated from row size, page size, and row count), and thus decide whether or not to partition the data across different masters.
https://www.percona.com/blog/2009/04/28/the_depth_of_a_b_tree/
Let N be the number of rows in the table.
Let B be the number of keys that fit in one B-tree node.
The depth of the tree is (log N) / (log B).
From the blog:
Let’s put some numbers in there. Say you have a billion rows, and you can currently fit 64 keys in a node. Then the depth of the tree is (log 10^9)/(log 64) ≈ 30/6 = 5. Now you rebuild the tree with keys half the size and you get (log 10^9)/(log 128) ≈ 30/7 ≈ 4.3. Assuming the top 3 levels of the tree are in memory, then you go from 2 disk seeks on average to 1.3 disk seeks on average, for a 35% speedup.
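(The logarithms above are base 2, so log 10^9 ≈ 30.) If you want the database to do that arithmetic for you, a quick sanity check in MySQL could look like this, where N = 10^9 rows and B = 64 keys per node are the blog's assumed values:
SELECT CEIL(LOG(64, 1e9)) AS estimated_depth;  -- about 5 levels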
I would also add that usually you don't have to optimize for I/O cost, because the data you use frequently should be in the InnoDB buffer pool, therefore it won't incur any I/O cost to read it. You should size your buffer pool sufficiently to make this true for most reads.
Simpler computation
The quick and dirty answer is log base 100, rounded up. That is, each node in the BTree has about 100 child nodes. In some circles, this is called the fanout.
1K rows: 2 levels
1M rows: 3 levels
billion: 5 levels
trillion: 6 levels
These numbers work for "average" rows or indexes. Of course, you could have extremes of about 2 or 1000 for the fanout.
Exact depth
You can find the actual depth from some information_schema:
For Oracle's MySQL:
$where = "WHERE ( ( database_name = ? AND table_name = ? )
OR ( database_name = LOWER(?) AND table_name = LOWER(?) ) )";
$sql = "SELECT last_update,
n_rows,
'Data & PK' AS 'Type',
clustered_index_size * 16384 AS Bytes,
ROUND(clustered_index_size * 16384 / n_rows) AS 'Bytes/row',
clustered_index_size AS Pages,
ROUND(n_rows / clustered_index_size) AS 'Rows/page'
FROM mysql.innodb_table_stats
$where
UNION
SELECT last_update,
n_rows,
'Secondary Indexes' AS 'BTrees',
sum_of_other_index_sizes * 16384 AS Bytes,
ROUND(sum_of_other_index_sizes * 16384 / n_rows) AS 'Bytes/row',
sum_of_other_index_sizes AS Pages,
ROUND(n_rows / sum_of_other_index_sizes) AS 'Rows/page'
FROM mysql.innodb_table_stats
$where
AND sum_of_other_index_sizes > 0
";
For Percona:
/* to enable stats:
percona < 5.5: set global userstat_running = 1;
5.5: set global userstat = 1; */
$sql = "SELECT i.INDEX_NAME as Index_Name,
IF(ROWS_READ IS NULL, 'Unused',
IF(ROWS_READ > 2e9, 'Overflow', ROWS_READ)) as Rows_Read
FROM (
SELECT DISTINCT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
FROM information_schema.STATISTICS
) i
LEFT JOIN information_schema.INDEX_STATISTICS s
ON i.TABLE_SCHEMA = s.TABLE_SCHEMA
AND i.TABLE_NAME = s.TABLE_NAME
AND i.INDEX_NAME = s.INDEX_NAME
WHERE i.TABLE_SCHEMA = ?
AND i.TABLE_NAME = ?
ORDER BY IF(i.INDEX_NAME = 'PRIMARY', 0, 1), i.INDEX_NAME";
(Those give more than just the depth.)
PRIMARY refers to the data's BTree. Names like "n_diff_pfx03" (in mysql.innodb_index_stats) refer to the 3rd level of the BTree; the largest such number for a table indicates the total depth.
Row width
As for estimating the width of a row, see Bill's answer. Here's another approach:
Look up the size of each column (INT=4 bytes, use averages for VARs)
Sum those.
Multiply by between 2 and 3 (to allow for overhead of InnoDB)
Divide that into 16K (the InnoDB page size) to get the average number of rows per leaf node.
Non-leaf nodes, plus index leaf nodes, are trickier because you need to understand exactly what represents a "row" in such nodes.
(Hence, my simplistic "100 rows per node".)
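As a rough worked example under those assumptions (the column sizes are made up): two INTs at 4 bytes each plus a VARCHAR averaging 40 bytes is about 48 bytes per row; times ~2.5 for overhead gives ~120 bytes; 16KB / 120 ≈ 136 rows per leaf block, in line with the "100 rows per node" rule of thumb.
SELECT FLOOR(16384 / ((4 + 4 + 40) * 2.5)) AS est_rows_per_leaf_block;  -- about 136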
But who cares?
Here's another simplification that seems to work quite well. Since disk hits are the biggest performance cost in queries, you need to "count the disk hits" as the first-order way of judging the performance of a query.
But look at the caching of blocks in the buffer_pool. A parent node is 100 times as likely to be recently touched as the child node.
So, the simplification is to "assume" that all non-leaf nodes are cached and all leaf nodes need to be fetched from disk. Hence the depth is not nearly as important as how many leaf node blocks are touched. This shoots down your "35% speedup" -- Sure 35% speedup for CPU, but virtually no speedup for I/O. And I/O is the important component.
Note that if you are fetching the latest 20 rows of a table that is stored chronologically, they will be found in the last 1 (or maybe 2) blocks. If the rows are stored by a UUID, the fetch is more likely to take 20 blocks -- many more disk hits, hence much slower.
Secondary Indexes
The PRIMARY KEY is clustered with the data. That implies that a look by the PK needs to drill down one BTree. But a secondary index is implemented by a second BTree -- drill down it to find the PK, then drill down via the PK. When "counting the disk hits", you need to consider both BTrees. And consider the randomness (eg, for UUIDs) or not (date-ordered).
Writes
1. Find the block (possibly cached)
2. Update it
3. If necessary, deal with a block split
4. Flag the block as "dirty" in the buffer_pool
5. Eventually write it back to disk.
Step 1 may involve a read I/O; step 5 may involve a write I/O -- but you are not waiting for it to finish.
Index updates
UNIQUE indexes must be checked before finishing an INSERT. This involves a potentially-cached read I/O.
For a non-unique index, an entry in the "Change buffer" is made. (This lives in the buffer_pool.) Eventually that is merged with the appropriate block on disk. That is, no waiting for I/O when INSERTing a row (at least not waiting to update non-unique indexes).
Corollary: UNIQUE indexes are more costly. But is there really any need for more than 2 such indexes (including the PK)?
I have a typical friend of friend graph database i.e. a social network database. The requirement is to extract all the nodes as a list in such a way that the least connected nodes appear together in the list and the most connected nodes are placed further apart in the list.
Basically it's asking for a graph to be represented as a list, and I'm not sure we can really do that. For example, if A is related to B with strength 10, B is related to C with strength 80, and A to C with strength 20,
then how do I place these in a list?
A, B, C - no, because then A is relatively more distant from C than from B, which is not the case.
A, C, B - yes, because A and B are less related than A,C and C,B.
With 3 nodes it's very simple, but with a lot of nodes, is it possible to put them in a list based on relationship strength?
OK, I think this may be what you want: an inverse of the shortestPath traversal, with weights. If not, tell me what the output should look like.
http://console.neo4j.org/r/n8npue
MATCH p=(n)-[*]-(m) // search all paths
WHERE n <> m
AND ALL (x IN nodes(p) WHERE length([x2 IN nodes(p) WHERE x2=x])=1) // this filters simple paths
RETURN [n IN nodes(p)| n.name] AS names, // get the names out
reduce(acc=0, r IN relationships(p)| acc + r.Strength) AS totalStrength // calculate total strength produced by following this path
ORDER BY length(p) DESC , totalStrength ASC // get the max length (hopefully a full traversal), and the minimum strength
LIMIT 1
This is not going to be efficient for a large graph, but I think it's definitely doable; you would probably need the traversal/graphalgo API shortest-path functionality if you need speed on a large graph.
The 6.3.6 Vectors section in the Scheme R5RS standard states the following about vectors:
Vectors are heterogenous structures whose elements are indexed by integers. A vector typically occupies less space than a list of the same length, and the average time required to access a randomly chosen element is typically less for the vector than for the list.
This description of vectors is a bit diffuse.
I'd like to know what this actually means in terms of the vector-ref and list-ref operations and their complexity. Both procedures return the k-th element of a vector and of a list, respectively. Is the vector operation O(1) and the list operation O(n)? How are vectors different from lists? Where can I find more information about this?
Right now I'm using association lists as a data structure for storing key/value pairs for easy lookup. If the keys are integers it would perhaps be better to use vectors to store the values.
The very specific details of vector-ref and list-ref are implementation-dependent, meaning each Scheme interpreter can implement the specification as it sees fit, so an answer to your question cannot be generalized to all interpreters conforming to R5RS; it depends on the actual interpreter you're using.
But yes, in any decent implementation it is a safe bet that the vector-ref operation is O(1), and that the list-ref operation is probably O(n). Why? Because a vector, under the hood, should be implemented using a data structure native to the implementation language that allows O(1) access to an element given its index (say, a primitive array), making the implementation of vector-ref straightforward. Lists in Lisp, by contrast, are built by linking cons cells, and finding the element at a given index entails traversing all the elements before it in the list, hence the O(n) complexity.
As a side note - yes, using vectors would be a faster alternative than using association lists of key/value pairs, as long as the keys are integers and the number of elements to be indexed is known beforehand (a Scheme vector can not grow its size after its creation). For the general case (keys other than integers, variable size) check if your interpreter supports hash tables, or use an external library that provides them (say, SRFI 69).
A list is constructed from cons cells. From the R5RS list section:
The objects in the car fields of successive pairs of a list are the elements of the list. For example, a two-element list is a pair whose car is the first element and whose cdr is a pair whose car is the second element and whose cdr is the empty list. The length of a list is the number of elements, which is the same as the number of pairs.
For example, the list (a b c) is equivalent to the following series of pairs: (a . (b . (c . ())))
And could be represented in memory by the following "nodes":
[p] --> [p] --> [p] --> null
| | |
|==> a |==> b |==> c
With each node [] containing a pointer p to the value (its car), and another pointer to the next element (its cdr).
This allows the list to grow to an unlimited length, but requires a ref operation to start at the front of the list and traverse k elements in order to find the requested one. As you stated, this is O(n).
By contrast, a vector is basically an array of values which could be internally represented as an array of pointers. For example, the vector #(a b c) might be represented as:
[p p p]
| | |
| | |==> c
| |
| |==> b
|
|==> a
Where the array [] contains a series of three pointers, and each pointer is assigned to a value in the vector. So internally an element of the vector v can be referenced directly by its index, much like v[k] in an array. Since you do not need to traverse the previous elements, vector-ref is an O(1) operation.
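As a small usage sketch (R5RS, illustrative data):
(define xs (list 'a 'b 'c))
(define v  (vector 'a 'b 'c))
(list-ref xs 2)    ; => c  ; walks two pairs before reaching the element
(vector-ref v 2)   ; => c  ; jumps straight to the indexed slot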
The main disadvantage is that vectors are of fixed size, so if you need to add more elements than the vector can hold, you have to allocate a new vector and copy the old elements to this new vector. This can potentially be an expensive operation if your application does this on a regular basis.
There are many resources online - this article on Scheme Data Structures goes into more detail and provides some examples, although it is much more focused on lists.
All that said, if your keys are (or can become) integers and you either have a fixed number of elements or can manage with a reasonable amount of vector reallocations - for example, you load the vector at startup and then perform mostly reads - a vector may be an attractive alternative to an association list.
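For instance, a rough sketch of swapping an integer-keyed association list for a vector (the names and data are illustrative):
(define colors-alist '((0 . red) (1 . green) (2 . blue)))
(cdr (assv 0 colors-alist))      ; => red   ; linear scan of the pairs
(define colors (vector 'red 'green 'blue))
(vector-ref colors 0)            ; => red   ; direct O(1) access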
How many different partitions with exactly two parts can be made of the set {1,2,3,4}?
There are 4 elements in this list that need to be partitioned into 2 parts. I wrote these out and got a total of 7 different possibilities:
{{1},{2,3,4}}
{{2},{1,3,4}}
{{3},{1,2,4}}
{{4},{1,2,3}}
{{1,2},{3,4}}
{{1,3},{2,4}}
{{1,4},{2,3}}
Now I must answer the same question for the set {1,2,3,...,100}.
There are 100 elements in this list that need to be partitioned into 2 parts. I know the largest size a part of the partition can be is 50 (that's 100/2) and the smallest is 1 (so one part has 1 number and the other part has 99). How can I determine how many different possibilities there are for partitions of two parts without writing out extraneous lists of every possible combination?
Can the answer be simplified into a factorial (such as 12!)?
Is there a general formula one can use to find how many different partitions with exactly n parts can be made of a set with k elements?
1) Stack Overflow is about programming; your question belongs in the https://math.stackexchange.com/ realm.
2) There are 2^n subsets of a set of n elements (because each of the n elements may either be or not be contained in a specific subset). This gives us 2^(n-1) different partitions of an n-element set into two subsets. One of these partitions is the trivial one (with one part being the empty subset and the other part being the entire original set), and from your example it seems you don't want to count the trivial partition. So the answer is 2^(n-1) - 1 (which gives 2^3 - 1 = 7 for n = 4; for your 100-element set it is 2^99 - 1).
The general answer for n parts and k elements would be the Stirling number of the second kind S(k,n).
Please be aware that the usual convention is for n to be the total number of elements, thus S(n,k).
Computing the general formula is quite ugly, but doable for k=2 (with the common notation) :
Thus S(n,2) = 1/2 * ( (+1)*1*0^n + (-1)*2*1^n + (+1)*1*2^n ) = (0 - 2 + 2^n)/2 = 2^(n-1) - 1.
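For reference, the general closed form behind that expansion (a standard identity, not spelled out in the answer above) is:
S(n,k) = (1/k!) * sum_{j=0..k} (-1)^j * C(k,j) * (k-j)^n
Setting k = 2 gives exactly the three terms shown above.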