I need to enlarge a varchar column from 64 to 80 characters. The table is quite big (9m rows). One thing that makes me unsure about the alter is that the column is also indexed.
So if I alter the column, given that it is part of an index, will any row or table locking happen?
Thanks.
I'm not sure about the locking part, but the standard recommendation would be to drop the index first, then alter the table, and then create the index again.
drop index [[index name]]
go
alter table t1 [[alter column]]
go
create index [[index name]]
go
I have a table that is actually a ranking list. I want to give the user a chance to rearrange that list the way he wants, i.e. allow him to move the rows in that table. Should I create a separate column that would hold the position, or can it be done using the built-in order of the table?
The documentation says:
If a SELECT statement that returns more than one row does not have an ORDER BY clause, the order in which the rows are returned is undefined.
(This is true for all SQL databases.)
So you cannot rely on the order that the rows happen to be stored in; you have to use some value in some table column.
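A minimal sketch of the "separate column" approach, using Python's built-in sqlite3 module. The table name `ranking`, the column names, and the helper `move_item` are all made up for illustration, and titles are assumed to be unique; the point is only that moving a row means rewriting `position` values, and that reads always go through ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: "position" holds the user-defined rank.
conn.execute(
    "CREATE TABLE ranking ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, position INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO ranking (title, position) VALUES (?, ?)",
    [("alpha", 1), ("beta", 2), ("gamma", 3), ("delta", 4)],
)


def move_item(conn, title, new_pos):
    """Move one row to new_pos and shift the rows in between by one place."""
    (old_pos,) = conn.execute(
        "SELECT position FROM ranking WHERE title = ?", (title,)
    ).fetchone()
    if new_pos < old_pos:
        # Moving up: rows in [new_pos, old_pos) slide down one place.
        conn.execute(
            "UPDATE ranking SET position = position + 1 "
            "WHERE position >= ? AND position < ?",
            (new_pos, old_pos),
        )
    elif new_pos > old_pos:
        # Moving down: rows in (old_pos, new_pos] slide up one place.
        conn.execute(
            "UPDATE ranking SET position = position - 1 "
            "WHERE position > ? AND position <= ?",
            (old_pos, new_pos),
        )
    conn.execute(
        "UPDATE ranking SET position = ? WHERE title = ?", (new_pos, title)
    )
    conn.commit()


def current_order(conn):
    """The only reliable way to read the ranking: ORDER BY the column."""
    return [
        row[0]
        for row in conn.execute("SELECT title FROM ranking ORDER BY position")
    ]


move_item(conn, "delta", 2)
order_after_move = current_order(conn)
print(order_after_move)
```

There is deliberately no UNIQUE constraint on `position` here, because the shifting UPDATE briefly produces duplicate values before the moved row gets its final position.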
How to Enable/Disable Indexes in TERADATA Database?
I want to disable indexes and do update and then enable the indexes in Teradata.
If an Enable/Disable option is not available in Teradata, how can I achieve this? If I use DROP INDEX, how can I recreate the indexes for all the tables?
Teradata gives you a way to create a table without choosing a primary index (for example, if you are not sure which column to use).
You can create table with No primary Index. Here is a sample of how to do so:
Create table <table name>
(<columnname> <datatype>,
<columnname> <datatype>)
no primary index ;
Teradata does not have a disable index feature.
All tables have a Primary Index (PI) which is chosen by the RDBMS unless you specify one.
CREATE INDEX <index name> (<column list>) ON <table name>;
CREATE UNIQUE INDEX (department) ON tbl_employee;
DROP INDEX ind_dept ON tbl_employee;
DROP INDEX (department,emp_number) ON tbl_employee;
In Teradata you can't drop Primary index of a table. The primary index defines where data will reside and which AMP receives the row.
To alter the primary index of a table you first need to delete all the records from the table (because the data is already distributed by the row hash value of the PI). Only then can you change the primary index of the table, using the command below:
Alter table table_name modify primary index index_name (column list);
Steps to achieve your goal
You can create a new table with your desired index (a temp, wrk, or intermediate table), insert the records from the original table, and update the wrk table.
Then delete the data from the original table and insert the wrk table data.
And you are done.
Create & Drop index is the only option you have here
Simple answer - you can't disable and re-enable indexes in Teradata.
However, there are some workarounds.
Drop Index
If you are talking about the PI (primary index), you can't drop it. All you can do is make a copy of the table without an index.
You can drop secondary index though. Then simply create it again when you need it.
Drop-Create Table
This doesn't fit all cases, but often this is the fastest way to do the work, especially if the issue you have is with PI.
BTW: it is not clear why you need to do that. Performance, logic, or something else? That will probably affect the recommendation.
Since you did not specify what kind of index you want to disable/enable, below are the approaches you can follow in either case.
PRIMARY INDEX
CREATE a new table with the same PI
INSERT the updated data to new table
DROP the old table - DROP TABLE <OldTable>;
RENAME the new table same as the old one. - RENAME TABLE <NewTable> TO <OldTable>;
The above recommendation for the Primary Index only applies if you are going to update a primary index column value. If you will update other columns (not the PI column), you can just issue an UPDATE statement directly.
SECONDARY INDEX
DROP the SI - DROP INDEX <IndexName> ON <TableName>;
UPDATE table data
RECREATE the SI - CREATE INDEX <IndexName> (<ColumnList>) ON <TableName>;
I just want to clarify: if you insert a row into a table in sqlite, it appends it to the table, but (as I learned) the table is unordered, so there is really no true way to insert a row into the middle of an "ordered table," right?
Is there even a way to make an ordered table without first creating a table and then using '...ORDER BY name/id/etc' (i.e. when you insert something it puts itself in the right place)?
SQLite tables are actually stored in rowid order, but this is unlikely to help you because it is unlikely that there is a gap where you want it.
Furthermore, the order in which rows are stored does not matter because there is no guarantee that this is the order in which they are returned.
When you want to SELECT rows in a specific order, you must use ORDER BY.
If your query is too slow, an index on the sorting column might help.
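As a small illustration of the points above, here is a sketch using Python's built-in sqlite3 module. The table `people` and the index name `idx_people_name` are invented for the example. A late insert is just an ordinary insert; the row only lands "in the middle" when the query sorts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("mallory",), ("alice",), ("bob",)],
)
# An index on the sorting column, so ORDER BY name can avoid a separate sort.
conn.execute("CREATE INDEX idx_people_name ON people(name)")

# There is no "insert in the middle": the new row simply gets the next rowid.
conn.execute("INSERT INTO people (name) VALUES (?)", ("carol",))

# The order is imposed at query time.
names_sorted = [
    row[0] for row in conn.execute("SELECT name FROM people ORDER BY name")
]
print(names_sorted)

# Inspect how SQLite plans the sorted query; the last field of each
# EXPLAIN QUERY PLAN row is the human-readable description.
plan = [
    row[-1]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM people ORDER BY name"
    )
]
print(plan)
```

The exact wording of the plan text differs between SQLite versions, so treat it as a diagnostic rather than something to parse.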
I have a sqlserver table with the usual
intID(primary key),field1,field2,manyotherfields..., datetime TimeOperation
99% of my different kind of queries start with a TimeOperation BETWEEN startTime AND endTime, and then select * (or count(*)) where fieldA=xxx, and join with other smaller tables.
select * because more or less I need all the fields.
I obviously created an index on TimeOperation ... but performance is not good enough, so I want to add some index key columns or index included columns, but I'm a little bit confused.
I get the difference between the two, but I don't get how much adding a column in each case impacts on speed and on size.
I guess that the biggest improvement would be to create an index including ALL the columns, is it right? (but I can't afford it in terms of space)
And if I often use field1=xxx for example, adding field1 to the index key columns (after TimeOperation) would give better performance right?
Also...just to be sure how an index with included columns works: if I select rows with TimeOperation in a certain range, sql seeks my TimeOperation index for the rows I'm interested in, and it is faster than scanning the whole table because in the index the TimeOperation values are in ascending order, is that right? But then I need all the rest of the data fields of those rows...how does sql retrieve that data? I guess it has a sort of bookmark to those rows in the index, right? But it has to hit the table multiple times then... so including all the columns in the index would save the time needed to hit the table, is that correct?
Thanks!
Mattia
We will need more information on your table, and examples of your queries, to address this fully, but:
DateTime columns should be highly selective by themselves, so an index with TimeOperation as the first column should address the bulk of queries against TimeOperation.
Do not add all columns blindly to an index, or even on included indexes - this will make the index page density worse and be counter productive (you would be duplicating your table in an index).
If all data in your database centres around TimeOperation, you might consider building your clustered index around it.
If you have queries just on field1 = x then you need a separate index just for field1 (assuming that it is suitably selective), i.e. no TimeOperation on the index if it's not in the WHERE clause of your query.
Yes, you are right, when SQL locates a record in an index, it needs to do a key (or RID) lookup back into the clustered index (or heap) to retrieve the rest of the columns. If your non clustered index includes the other columns in your select statement, the lookup can be avoided. But since you are using SELECT *, covering indexes are unlikely to help.
Edit
Explanation - Selectivity and density are explained in detail here. For example, the index will be used only if your queries against TimeOperation return a small fraction of the rows (the rule of thumb is < 5%, but this doesn't always hold), i.e. only if your query is selective enough for SQL to choose the index on TimeOperation.
The basic starting point would be:
CREATE TABLE [MyTable]
(
intID INT IDENTITY(1,1) NOT NULL,
field1 NVARCHAR(20),
-- .. More columns, which may be selected, but not filtered
TimeOperation DateTime,
CONSTRAINT PK_MyTable PRIMARY KEY (IntId)
);
And the basic indexes will be
CREATE NONCLUSTERED INDEX IX_MyTable_1 ON [MyTable](TimeOperation);
CREATE NONCLUSTERED INDEX IX_MyTable_2 ON [MyTable](Field1);
Clustering Consideration / Option
If most of your records are inserted in 'serial' ascending TimeOperation order, i.e. intId and TimeOperation will both increase in tandem, then I would leave the clustering on intID (the default) (i.e. table DDL is PRIMARY KEY CLUSTERED (IntId), which is the default anyway).
However, if there is NO correlation between IntId and TimeOperation, and IF most of your queries are of the form SELECT * FROM [MyTable] WHERE TimeOperation between xx and yy then CREATE CLUSTERED INDEX CL_MyTable ON MyTable(TimeOperation) (and changing PK to PRIMARY KEY NONCLUSTERED (IntId)) should improve this query (Rationale: since contiguous times are kept together, fewer pages need to be read, and the bookmark lookup will be avoided). Even better, if values of TimeOperation are guaranteed to be unique, then CREATE UNIQUE CLUSTERED INDEX CL_MyTable ON MyTable(TimeOperation) will improve density as it will avoid the uniqueifier.
Note - for the rest of this answer, I'm assuming that your IntId and TimeOperations ARE strongly correlated and hence the clustering is by IntId.
Covering Indexes
As others have mentioned, your use of SELECT * is bad practice and inter alia means covering indexes won't be of any use (the exception being COUNT(*)).
If your queries weren't SELECT *, but instead e.g.
SELECT TimeOperation, field1
FROM [MyTable]
WHERE TimeOperation BETWEEN x and y -- and returns < 5% data.
Then altering your index on TimeOperation to include field1
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation) INCLUDE(Field1);
OR adding both to the index (with the most common filter first, or the most selective first if both filters are always present)
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation, Field1);
Either will avoid the rid / key lookup. The second (,) option will address your query where BOTH TimeOperation and Field1 are filtered in a WHERE or HAVING clause.
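The covering effect can be observed in a rough sketch with Python's built-in sqlite3 module. Note the substitutions: this is SQLite rather than SQL Server, SQLite has no INCLUDE clause (so only the two-key-column variant is shown), and the `payload` column and the date literals are invented for the example. The behaviour of SQL Server's optimizer is not being demonstrated here, only the general idea that a narrow select list can be answered from the index alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the table in the answer; "payload" represents the many
# other columns that are selected but never filtered on.
conn.execute(
    "CREATE TABLE MyTable ("
    "intID INTEGER PRIMARY KEY, field1 TEXT, payload TEXT, TimeOperation TEXT)"
)
conn.execute("CREATE INDEX IX_MyTable ON MyTable(TimeOperation, field1)")


def plan_for(sql):
    """Return SQLite's query plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


time_filter = "WHERE TimeOperation BETWEEN '2024-01-01' AND '2024-01-31'"

# Only indexed columns are requested: the index alone can answer the query.
narrow_plan = plan_for("SELECT TimeOperation, field1 FROM MyTable " + time_filter)

# SELECT * also needs "payload", so each match requires a lookup in the table.
star_plan = plan_for("SELECT * FROM MyTable " + time_filter)

print(narrow_plan)
print(star_plan)
```

In SQLite the first plan is reported as using a covering index, while the SELECT * plan is not, which mirrors the key lookup described above.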
Re : What's the difference between index on (TimeOperation, Field1) and separate indexes?
e.g.
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation, Field1);
will not be useful for the query
SELECT ... FROM MyTable WHERE Field1 = 'xyz';
The index will only be useful for the queries which have TimeOperation
SELECT ... FROM MyTable WHERE TimeOperation between x and y;
OR
SELECT ... FROM MyTable WHERE TimeOperation between x and y AND Field1 = 'xyz';
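The leading-column rule can be checked with a small sketch using Python's built-in sqlite3 module. This is SQLite standing in for SQL Server, so it only illustrates the principle; the `payload` column, the index name `IX_Time_Field1`, and the literal values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    "intID INTEGER PRIMARY KEY, field1 TEXT, payload TEXT, TimeOperation TEXT)"
)
# Composite index with TimeOperation as the leading column.
conn.execute("CREATE INDEX IX_Time_Field1 ON MyTable(TimeOperation, field1)")


def plan_for(where_clause):
    """Return SQLite's plan description for SELECT * with the given filter."""
    sql = "EXPLAIN QUERY PLAN SELECT * FROM MyTable WHERE " + where_clause
    return " | ".join(row[-1] for row in conn.execute(sql))


time_range = "TimeOperation BETWEEN '2024-01-01' AND '2024-01-31'"

# Filter on the second index column only: the index cannot be seeked.
field_only_plan = plan_for("field1 = 'xyz'")

# Filter on the leading column, with or without field1: the index is seeked.
time_plan = plan_for(time_range)
both_plan = plan_for(time_range + " AND field1 = 'xyz'")

for label, plan in [
    ("field1 only", field_only_plan),
    ("TimeOperation only", time_plan),
    ("both", both_plan),
]:
    print(label, "->", plan)
```

SQLite labels an index seek as SEARCH and a full pass as SCAN, so the first query shows up as a scan while the other two are searches on the composite index.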
Hope this helps?
An index, at its most basic, adds a tree structure (a B-tree in SQL Server) behind the scenes, which allows the SQL engine to more easily find rows with particular values for indexed columns. Each index creates a different way to "drill down" into the table's data with a search that behaves much like a binary search (O(log N) performance). Each index you add makes selecting by its columns faster, at the cost of slowing insertions/updates (the data must be written and then every index must be updated as well).
An index, therefore, should normally be created for combinations of columns that are commonly used to filter records. I would indeed create an index on TimeOperation, and TimeOperation alone.
NEVER simply create an index including all columns of a table, especially a wide one such as this.
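To make the trade-off concrete, here is a toy model in Python. It is not how a real database stores an index (those are B-trees on disk); a sorted list of (key, row position) pairs plus the standard bisect module is used purely to show the O(log N) lookup and the extra work every insert has to do. All names and values are invented.

```python
import bisect

# The "table": rows in arbitrary insertion order.
rows = [
    {"id": 1, "time_op": 50},
    {"id": 2, "time_op": 10},
    {"id": 3, "time_op": 40},
    {"id": 4, "time_op": 20},
    {"id": 5, "time_op": 30},
]

# The "index": (key, row position) pairs kept sorted by key.
index = sorted((row["time_op"], pos) for pos, row in enumerate(rows))
keys = [key for key, _ in index]  # parallel list of keys for bisect


def range_lookup(lo, hi):
    """Find rows with lo <= time_op <= hi using two binary searches."""
    start = bisect.bisect_left(keys, lo)
    end = bisect.bisect_right(keys, hi)
    # Each index entry points back at the row, like a bookmark lookup.
    return [rows[pos] for _, pos in index[start:end]]


def insert_row(row):
    """Every insert pays twice: once for the table, once per index."""
    rows.append(row)
    entry = (row["time_op"], len(rows) - 1)
    at = bisect.bisect_left(index, entry)
    index.insert(at, entry)
    keys.insert(at, entry[0])


matches = [row["id"] for row in range_lookup(20, 40)]
print(matches)
```

With one index the extra insert cost is small; with an index over every column of a wide table, each write repeats that work for all of them and the index ends up duplicating the table.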
Ok I have a sqlite db, that has roughly 100 rows. It is kind of a strange thing that I'm trying to do, but I need to insert a new row between each of the existing rows.
I have been trying to use the Insert statement as follows, but haven't had any luck:
insert into t1(column1) values("hello") where id%2 == 0
So I'm basically trying to use the %-operator to tell me if the id is even or odd. For every even id number, I'd like to insert a new row.
What am I missing? What can I do differently? How can I insert a new row into every other row and have the index updated as well?
Thanks
Your question assumes that the rows have some kind of built-in order to them, and that you can insert rows between other rows. That's not true.
It is true that rows have an order on disk, and that the id column is usually assigned in order, but that's an implementation detail. When you perform a query, the database is free to return the rows in any order it chooses, unless you specify what you want with an ORDER BY clause.
Now, I'm assuming what you really want is to insert rows between the existing rows in id order. One way to get what you want would look like this:
UPDATE t1 SET id = id * 2;
INSERT INTO t1 (id, column1) SELECT id + 1, 'hello' FROM t1;
The UPDATE would double the ids of all the existing rows (so 1,2,3 becomes 2,4,6); then the INSERT would perform a query on t1 and use the result to insert a new set of rows with id values one more than the existing rows (so 2,4,6 becomes 3,5,7).
I haven't tested the above statements, so I don't know if they would work or if they require some extra trickery (like a temporary table) since we are querying and updating the same table in one statement. Also I may have made a syntax error.
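One possible form of the "extra trickery", sketched with Python's built-in sqlite3 module. It assumes `id` is the INTEGER PRIMARY KEY and that all existing ids are positive. SQLite checks uniqueness row by row, so a direct `id = id * 2` can collide with a row that has not been updated yet (1 becomes 2 while the old 2 still exists). Routing the values through the negative range avoids that; this is a swapped-in workaround, not something from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, column1 TEXT)")
conn.executemany(
    "INSERT INTO t1 (id, column1) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

with conn:  # run all three statements in one transaction
    # Step 1: double the ids, parked in the negative range so that no new
    # value can clash with an id that is still waiting to be updated.
    conn.execute("UPDATE t1 SET id = -2 * id")
    # Step 2: flip them back to positive: 1,2,3 has now become 2,4,6.
    conn.execute("UPDATE t1 SET id = -id")
    # Step 3: add one new row after each existing row (ids 3,5,7).
    conn.execute(
        "INSERT INTO t1 (id, column1) SELECT id + 1, 'hello' FROM t1"
    )

rows_in_id_order = conn.execute(
    "SELECT id, column1 FROM t1 ORDER BY id"
).fetchall()
for row in rows_in_id_order:
    print(row)
```

The new rows only appear "between" the old ones because the final query sorts by id.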
Don't consider the rows as pre-ordered in the database. A database will store them as they come in, or according to an index. It's your task to order them on retrieval (i.e. when you query for data) according to your needs.