PostgreSQL: What is the maximum column length for CITEXT type - oracle11g

I am migrating my PostgreSQL 9.1 schema to Oracle 11g. I could not find online what the maximum column length for a CITEXT column is. What size should I use in Oracle for the corresponding VARCHAR2 column?
Note: I know CITEXT provides case-insensitive comparison, but I am not concerned with that here.

According to the citext documentation, citext is just a case-insensitive version of text. text itself is effectively unlimited in length (the hard limit is 1 GB). Therefore you cannot assume any meaningful upper limit; you have to ask the application developers for the practical limit of each column.
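If you cannot get a firm answer from the developers, one practical step is to measure the longest value actually stored before picking a VARCHAR2 size (or deciding to fall back to CLOB). A minimal sketch over JDBC, where the connection details, table name my_table and column my_citext_col are placeholders:
import java.sql.*;

public class CitextMaxLength {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details -- replace with your own.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost/mydb", "user", "password");
             Statement st = conn.createStatement();
             // length() should work on citext just as it does on text.
             ResultSet rs = st.executeQuery(
                 "SELECT max(length(my_citext_col)) FROM my_table")) {
            if (rs.next()) {
                System.out.println("Longest stored value: " + rs.getInt(1) + " characters");
            }
        }
    }
}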

Related

What is the maximum length of Version in Flyway database deployment scripts

In Flyway, a deployment script name starts with a version.
What is the maximum length one can use? I see that the table column holding the version is 50 characters long.
There are a number of limits:
Version must be 50 characters or less
Description must be 200 characters or less
Migration filenames must be compatible with any OS limit
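For a sense of scale, typical versioned migration filenames stay far below those limits; the names below are made up, but follow Flyway's V<version>__<description>.sql convention:
V1__create_users_table.sql
V2.1__add_email_index.sql
V2019.11.05.1432__backfill_order_totals.sql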
Do you have a specific use case for a version string longer than 50 characters? We're in the middle of work for Flyway 7 and this is a chance for us to change the history table if there's a good reason to do so.
If you read the documentation located here, you'll find that the limit is not one from Flyway. Rather, the limit on the length of the version is based on the OS and its limit on the size of a file name. You must ensure that you're incrementing your version numbers in an appropriate order. However, as you can see in the docs, Flyway supports a wide variety of formats, and the length of the string defining the version number is not an issue you need to worry about.

What is the effect of FeedOptions.EnableLowPrecisionOrderBy Property

The Azure DocumentDB .NET SDK's query API provides an option to reduce ORDER BY precision, but the exact expected effect remains vague. The documentation states only:
Gets or sets the option to enable low precision order by in the Azure DocumentDB database service.
The ORDER BY clause documentation does not say a word about ordering behavior depending on FeedOptions, or about results sometimes being ordered differently than requested in the query.
What does the mentioned option actually do?
What precision can we expect when using "low precision"?
What can we assert about the actual order beyond that "low precision"?
Azure Cosmos DB supports varying the precision of your index to reduce the storage footprint of indexing (default is full precision). For example, with a numeric precision of 5, the service would index the first 5 bytes of your number.
By default, sorting on a property indexed with lower precision is disallowed, but you can opt in to it by setting EnableLowPrecisionOrderBy. Say you choose a numeric precision of 5 for an 8-byte number: the query results will be ordered by the most significant 5 bytes of the number, with no order guaranteed for the remaining 3 bytes. This option lets you perform a fast ORDER BY without requiring full-precision indexing, for example when you are doing ad-hoc exploration of data.
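As a rough illustration of what ordering on only the most significant bytes means (this is just a sketch of the idea, not the service's actual comparison; the class and the sample values are made up):
import java.util.*;

public class LowPrecisionOrderSketch {
    // Keep only the top 5 bytes of a non-negative 8-byte value by zeroing the low 3 bytes.
    static long top5Bytes(long v) {
        return v & ~0xFFFFFFL;
    }

    public static void main(String[] args) {
        List<Long> values = new ArrayList<>(Arrays.asList(16_777_300L, 16_777_220L, 1_000L, 33_554_500L));
        // 1_000 sorts first and 33_554_500 last, but the two values that share the same
        // top 5 bytes compare as equal under this key, so the comparison itself gives
        // no guarantee about their relative order.
        values.sort(Comparator.comparingLong(LowPrecisionOrderSketch::top5Bytes));
        System.out.println(values);
    }
}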
For most use cases, you should use the default precision, and not have to rely on the lower precision order by.
EDIT: this option is now deprecated, and indexing now uses maximum precision by default.

Using only UTF8 encoding in SQLite, what can I trim out of the ICU dataset?

ICU provides a way of cutting down the size of the .dat file. I'm almost certain I don't need most of the encodings that are included by default. If I want to build a CJK .dat file specifically for SQLite, which ones can I cut out?
I just need the tokenizer to work, and possibly collation. It seems that all those character conversions may not really be necessary. At 17 MB, it is too FAT! For all databases, we use
PRAGMA encoding = UTF8;
Data Customizer Link: http://apps.icu-project.org/datacustom/
To put it another way, if I'm using UTF8 in SQLite to collate and index, what parts of the dat file do I really need? I bet the majority is never used. I suspect I don't need the Charset Mapping Tables, and maybe not some of the Misc data.
From the Data Customizer page: This tool will generate a data library that can only be used with the 4.8 series of ICU. The help page provides information on how to use this tool.
Charset Mapping Tables (4585 KB) <-- axe?
Break Iterator (1747 KB) <-- seems like i need this
Collators (3362 KB) <-- seems like i need this for sorting (but maybe not)
Rule Based Number Format (292 KB) <-- axe?
Transliterators (555 KB) <-- axe?
Formatting, Display Names and Other Localized Data (856 KB) <-- axe?
Miscellaneous Data (5682 KB) <-- axe?
Base Data (311 KB) <-- seems basic
Update. It seems that everything can be removed except for Base Data and Break Iterator. Regarding the Collators from http://userguide.icu-project.org/icudata:
The largest part of the data besides conversion tables is in collation for East Asian languages. You can remove the collation data for those languages by removing the CollationElements entries from those source/data/locales/*.txt files. When you do that, the collation for those languages will become the same as the Unicode Collation Algorithm.
This seems "good enough".
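To get a rough feel for what falling back to the root (UCA) order means, here is a small sketch using java.text.Collator, which is built from similar CLDR/UCA data; it only approximates what a trimmed ICU build inside SQLite would do, and the sample strings are arbitrary:
import java.text.Collator;
import java.util.*;

public class RootVsTailoredCollation {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("中国", "爱", "日本", "人民");

        // Root collation: roughly the order you are left with once the CJK tailorings are stripped.
        List<String> byRoot = new ArrayList<>(words);
        byRoot.sort(Collator.getInstance(Locale.ROOT));

        // Chinese-tailored collation (typically pinyin-based), which the trimmed data would no longer provide.
        List<String> byChinese = new ArrayList<>(words);
        byChinese.sort(Collator.getInstance(Locale.CHINESE));

        System.out.println("root:    " + byRoot);
        System.out.println("chinese: " + byChinese);
    }
}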
On Collation
Starting in release 1.8, the ICU Collation Service is updated to be fully compliant to the Unicode Collation Algorithm (UCA) (http://www.unicode.org/unicode/reports/tr10/) and conforms to ISO 14651. There are several benefits to using the collation algorithms defined in these standards. Some of the more significant benefits include:
Unicode contains a large set of characters. This can make it difficult for collation to be a fast operation or require collation to use significant memory or disk resources. The ICU collation implementation is designed to be fast, have a small memory footprint and be highly customizable.
The algorithms have been designed and reviewed by experts in multilingual collation, and therefore are robust and comprehensive.
Applications that share sorted data but do not agree on how the data should be ordered fail to perform correctly. By conforming to the UCA/14651 standard for collation, independently developed applications, such as those used for e-business, sort data identically and perform properly.
The ICU Collation Service also contains several enhancements that are not available in UCA. For example:
Additional case handling: ICU allows case differences to be ignored or flipped. Uppercase letters can be sorted before lowercase letters, or vice-versa.
Easy customization: Services can be easily tailored to address a wide range of collation requirements.
Flexibility: ICU offers both sort key generation and fast incremental string comparison. It also provides low-level access to collation data through the collation element iterator.
Update2. If Break Iterator is removed from the .dat, the following occurs:
sqlite> CREATE VIRTUAL TABLE test USING fts4(tokenize=icu);
sqlite> CREATE VIRTUAL TABLE testaux USING fts4aux(test);
sqlite> .import test.csv test
Error: SQL logic error or missing database
(We're talking about the Data Customizer page.)
I started with the biggest items, and was able to omit these entirely:
Charset mapping tables
Miscellaneous Data
I had to include Collators, but only the languages I was supporting.
I tried to trim Break Iterator, but it broke, so I stopped there. Nothing else is nearly as big.

Long type with SQLite and Zentus Jdbc driver

I am using SQLite in Java code through Zentus.
I need to map the Java long primitive type in my database. For that, I tried to create tables with the following statement: CREATE TABLE MY TABLE (...., LONG time, ...).
Insertion into the database through Java with Zentus works perfectly, but when retrieving the data, again through Java and Zentus, the LONG value is truncated to a 32-bit value.
I tried to query the database directly with SQLite and it works, so I guess the problem is in the JDBC driver.
Has anyone experienced such issues, and how did you solve them?
SQLite has five storage classes:
Null
Text
Integer
Real
Blob
Declared column type names are mapped onto these by affinity rules; a type name SQLite does not recognize (such as LONG) gets NUMERIC affinity, which can still hold 64-bit integers. The Integer storage class is a bit particular: each value is stored in 1, 2, 3, 4, 6, or 8 bytes, whichever is the smallest size that can hold that particular value, so a long below 2^31 occupies at most 4 bytes on disk. That is purely a storage optimization; the full 64-bit value is returned when the row is read back.
Since the sqlite3 shell shows the correct values, the truncation is happening on the Java side. Check that you read the column with ResultSet.getLong() rather than getInt(); if you already do, the driver itself is mishandling 64-bit integers.
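For reference, a minimal sketch of the round trip, with the column declared as INTEGER rather than LONG; the database file, table and values here are made up:
import java.sql.*;

public class LongRoundTrip {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:test.db");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, time INTEGER)");
            try (PreparedStatement ps = conn.prepareStatement("INSERT INTO events (time) VALUES (?)")) {
                ps.setLong(1, 4_000_000_000L);   // above 2^31, needs 64 bits
                ps.executeUpdate();
            }
            try (ResultSet rs = st.executeQuery("SELECT time FROM events")) {
                while (rs.next()) {
                    System.out.println(rs.getLong("time"));   // getLong, not getInt
                }
            }
        }
    }
}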

SQLite dataypes lengths?

I'm completely new to SQLite (actually 5 minutes ago), but I do know somewhat the Oracle and MySql backends.
The question: I'm trying to find out the lengths of each of the datatypes supported by SQLite, such as the difference between a bigint and a smallint. I've searched the SQLite documentation (it only talks about affinity; is that all that matters?), SO threads, Google... and found nothing.
My guess: I've just briefly reviewed the SQL92 specification, which talks about datatypes and their relations but not about their lengths, which I suppose is to be expected. Yet I've looked at the Oracle and MySQL datatype specs, and the specified lengths are mostly identical, for integers at least. Should I assume SQLite uses the same lengths?
Aside question: Have I missed something in the SQLite docs, or about SQL in general? I'm asking because I can't really understand why the SQLite docs don't specify something as basic as datatype lengths. It just doesn't make sense to me! Although I'm sure there is a simple command to discover the lengths... why not write them in the docs?
Thank you!
SQLite is a bit odd when it comes to field types. You can store any type in any field (i.e., put a blob into an integer field). The way it works for integers is: it depends.
While your application may use a long (64 bits) to hold the value, SQLite stores each integer value in 1, 2, 3, 4, 6, or 8 bytes, using the smallest size that can represent that particular value: a value that fits in a signed byte takes 1 byte, a value below 2^15 takes 2 bytes, and so on. (The 7-bits-per-byte varint encoding, where the high bit of each byte indicates whether another byte follows, is used in the file format for rowids and record headers rather than for column values; in that encoding every negative value takes 9 bytes.)
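To spell the size rule out, here is a small helper of my own (not part of any SQLite API) that mirrors the 1/2/3/4/6/8-byte storage sizes:
public class IntegerStorageSize {
    // Bytes SQLite uses to store a given integer value on disk.
    // (The constants 0 and 1 can even be encoded with no payload bytes at all; this sketch ignores that.)
    static int storageBytes(long v) {
        if (v >= -128 && v <= 127) return 1;                                   // 8-bit
        if (v >= -32_768 && v <= 32_767) return 2;                             // 16-bit
        if (v >= -8_388_608 && v <= 8_388_607) return 3;                       // 24-bit
        if (v >= -2_147_483_648L && v <= 2_147_483_647L) return 4;             // 32-bit
        if (v >= -140_737_488_355_328L && v <= 140_737_488_355_327L) return 6; // 48-bit
        return 8;                                                              // 64-bit
    }

    public static void main(String[] args) {
        System.out.println(storageBytes(100L));            // 1
        System.out.println(storageBytes(4_000_000_000L));  // 6
    }
}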
In SQLite, datatypes don't have lengths; values have lengths. A column you define as TINYINT could hold a BLOB, or vice versa.
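To see that in action, here is a quick sketch over JDBC; the in-memory URL is the syntax used by the common sqlite-jdbc drivers (adjust for yours), and the table and values are made up:
import java.sql.*;

public class FlexibleTyping {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:");
             Statement st = conn.createStatement()) {
            // Declared TINYINT, but SQLite does not enforce a 1-byte limit.
            st.execute("CREATE TABLE t (v TINYINT)");
            st.execute("INSERT INTO t VALUES (9223372036854775807)");   // largest 64-bit value
            st.execute("INSERT INTO t VALUES ('not a number at all')"); // kept as TEXT
            try (ResultSet rs = st.executeQuery("SELECT v, typeof(v) FROM t")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " -> " + rs.getString(2));
                }
            }
        }
    }
}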
I'm completely new to the SQLite documentation, but I found it in less than 30 seconds.
Datatypes In SQLite Version 3
INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
8 bytes >> the maximum value is 9223372036854775807.
