I have two views in Oracle SQL, one with 400 million records and the other with 100 million records.
I have a performance problem with the query and I think it's because of the views.
On the tables that the views use, I have created several indexes. My question is: do queries that go through the views benefit from the indexes created on the source tables?
For a non-materialized view, my understanding is that querying it just queries the underlying tables, and if those tables have indexes, they will be used.
Generally speaking, a view's performance depends on the underlying tables which it uses.
A materialized view can also have indexes of its own, but you need to define them yourself.
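To make that concrete, here is a minimal sketch, assuming a hypothetical orders table: an index created on the base table can be used by the optimizer when you query through an ordinary view, while a materialized view stores its own data and therefore needs its own index defined explicitly.

-- Index on the base table; a plain view is just a stored query,
-- so the optimizer can use this index when the view is queried.
CREATE INDEX orders_cust_ix ON orders (customer_id);

CREATE VIEW v_orders AS
  SELECT order_id, customer_id, order_total
  FROM   orders;

-- This predicate can be resolved with orders_cust_ix through the view.
SELECT * FROM v_orders WHERE customer_id = 42;

-- A materialized view holds its own copy of the data,
-- so it needs its own index.
CREATE MATERIALIZED VIEW mv_orders AS
  SELECT order_id, customer_id, order_total
  FROM   orders;

CREATE INDEX mv_orders_cust_ix ON mv_orders (customer_id);

You can check which indexes are actually used with EXPLAIN PLAN on the query against the view.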
Related
If each item in my table has only one of two states (pending or appended), is it efficient to use this state as the partition key? Or is it more effective to index this state value?
It would be more effective to use a sparse index. In your case, you might add an attribute called isPending. You can add this attribute to items that are pending, and remove it once they are appended. If you create a GSI with tid as the hash key and isPending as the sort key, then only items that are pending will be in the GSI.
It will depend on how you are going to search for these records!
For example, if you will always search by record ID, it doesn't matter. But if you will regularly search for the set of pending or appended records, you should consider using partitions.
You could also look at this best practices guide from AWS: https://docs.aws.amazon.com/en_us/amazondynamodb/latest/developerguide/best-practices.html
Update:
This section of the best practices guide recommends the following:
Keep related data together. Research on routing-table optimization
20 years ago found that "locality of reference" was the single most
important factor in speeding up response time: keeping related data
together in one place. This is equally true in NoSQL systems today,
where keeping related data in close proximity has a major impact on
cost and performance. Instead of distributing related data items
across multiple tables, you should keep related items in your NoSQL
system as close together as possible.
As a general rule, you should maintain as few tables as possible in a
DynamoDB application. As emphasized earlier, most well designed
applications require only one table, unless there is a specific reason
for using multiple tables.
Exceptions are cases where high-volume time series data are involved,
or datasets that have very different access patterns—but these are
exceptions. A single table with inverted indexes can usually enable
simple queries to create and retrieve the complex hierarchical data
structures required by your application.
Use sort order. Related items can be grouped together and queried
efficiently if their key design causes them to sort together. This is
an important NoSQL design strategy.
Distribute queries. It is also important that a high volume of
queries not be focused on one part of the database, where they can
exceed I/O capacity. Instead, you should design data keys to
distribute traffic evenly across partitions as much as possible,
avoiding "hot spots."
Use global secondary indexes. By creating specific global secondary
indexes, you can enable different queries than your main table can
support, and that are still fast and relatively inexpensive.
I hope this helps!
In Oracle, can we create materialized views whose defining query is based on a non-partitioned table? I have not found any examples related to this.
Thanks.
Yes, you can have a materialized view without partitioning.
SQL> create materialized view test_mv as select 1 a from dual;
Materialized view created.
SQL> select * from test_mv;
A
----------
1
Many data warehousing features like partitioning, parallelism, and materialized views often work well together, and in the past there were cases where you had to combine them in order to use one of them at all. But now they can all work independently, with a few rare exceptions.
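As a slightly more realistic sketch (the sales table here is made up for illustration), the defining query of a materialized view can reference an ordinary, non-partitioned table:

-- sales is an ordinary, non-partitioned heap table.
CREATE TABLE sales (
  sale_id    NUMBER PRIMARY KEY,
  product_id NUMBER,
  amount     NUMBER
);

-- The materialized view's defining query is based on that table;
-- no partitioning is involved anywhere.
CREATE MATERIALIZED VIEW sales_by_product
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
  SELECT product_id, SUM(amount) AS total_amount
  FROM   sales
  GROUP  BY product_id;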
I have an ASP.NET data entry application that is used by multiple clients. The application consists of multiple data entry modules that are common to all clients.
I now have multiple clients that want their own custom module added which will typically consist of a dozen or so data points. Some values will be text, others numeric, some will be dropdown selections, etc.
I'm in need of suggestions for handling the data model for this. I have two thoughts on how to handle it. The first would be to create a new table for each new module for each client. This is pretty clean, but I don't particularly like it. My other thought is to have one table with columns for each custom data point for each client. This table would end up with a lot of columns and a lot of NULL values. I don't really like either solution and suspect there's a better way to do this, so any feedback you have will be appreciated.
I'm using SQL Server 2008.
As always with these questions, "it depends".
The dreaded key-value table.
This approach relies on a table which lists the fields and their values as individual records.
CustomFields(clientId int, fieldName sysname, fieldValue varbinary)
Benefits:
Infinitely flexible
Easy to implement
Easy to index
Non-existent values take no space
Disadvantage:
Showing a list of all records with the complete field list is a very dirty query
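A minimal sketch of this approach (table and field names are only examples), including the kind of query you end up writing to get one flat row back per client:

CREATE TABLE CustomFields (
  clientId   int            NOT NULL,
  fieldName  sysname        NOT NULL,
  fieldValue varbinary(max) NULL,
  CONSTRAINT PK_CustomFields PRIMARY KEY (clientId, fieldName)
);

-- Writing a value is trivial...
INSERT INTO CustomFields (clientId, fieldName, fieldValue)
VALUES (1, N'Rating', CAST(5 AS varbinary(max)));

-- ...but turning the rows back into "one record with all its fields"
-- needs a join (or pivot) per field, plus casts back to the real types.
SELECT c.clientId,
       CAST(r.fieldValue AS int)           AS Rating,
       CAST(n.fieldValue AS nvarchar(200)) AS Notes
FROM   (SELECT DISTINCT clientId FROM CustomFields) c
LEFT JOIN CustomFields r ON r.clientId = c.clientId AND r.fieldName = N'Rating'
LEFT JOIN CustomFields n ON n.clientId = c.clientId AND n.fieldName = N'Notes';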
The Microsoft way
The Microsoft way of handling this kind of problem is "sparse columns" (introduced in SQL Server 2008).
Benefits:
Blessed by the people who design SQL Server
Records can be queried without having to apply fancy pivots
Fields without data don't take space on disk
Disadvantages:
Many technical restrictions
A new field requires DDL (an ALTER TABLE), not just an insert
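A hedged sketch of what that can look like (the column names are invented); note the last statement, where a new client field means changing the table definition:

CREATE TABLE ClientModuleData (
  recordId        int IDENTITY(1,1) PRIMARY KEY,
  clientId        int NOT NULL,
  -- Sparse columns take no space in rows that leave them NULL.
  ClientA_Rating  int           SPARSE NULL,
  ClientA_Notes   nvarchar(200) SPARSE NULL,
  ClientB_Score   decimal(9,2)  SPARSE NULL,
  -- Optional column set: exposes all sparse columns as one XML value.
  AllCustomFields xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
);

-- Adding a data point for a new client is DDL, not an ordinary insert.
ALTER TABLE ClientModuleData ADD ClientC_Flag bit SPARSE NULL;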
The xml tax
You can add an xml field to the table which will be used to store all the "extra" fields.
Benefits:
Unlimited flexibility
Can be indexed
Storage-efficient (when it fits in a page)
With some XPath gymnastics, the fields can be included in a flat recordset
Schema can be enforced with XML schema collections
Disadvantages:
Not clearly visible what's in the field
XQuery support in SQL Server has gaps, which sometimes makes getting your data out a real nightmare
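A minimal sketch of the XML variant (names are made up), including the value() calls you need to flatten fields back into a recordset:

CREATE TABLE ModuleEntry (
  entryId     int IDENTITY(1,1) PRIMARY KEY,
  clientId    int NOT NULL,
  ExtraFields xml NULL   -- all client-specific data points live here
);

INSERT INTO ModuleEntry (clientId, ExtraFields)
VALUES (1, N'<fields><rating>4</rating><comments>ok</comments></fields>');

-- Pulling fields back out as columns requires a value() call per field.
SELECT entryId,
       ExtraFields.value('(/fields/rating)[1]',   'int')            AS Rating,
       ExtraFields.value('(/fields/comments)[1]', 'nvarchar(200)')  AS Comments
FROM ModuleEntry;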
There may be more solutions, but to me these are the main contenders. Which one to choose:
Key-value seems appropriate when the number of extra fields is limited (say, no more than 10-20 or so).
Sparse columns are more suitable for data with many properties that are filled out infrequently; they sound more appropriate when you can have many extra fields.
An XML column is very flexible, but a pain to query. It is appropriate for solutions that write rarely and query rarely, i.e. don't run aggregates etc. on the data stored in this field.
I'd suggest you go with the first option you described. I wouldn't overthink it. The second option you outlined would be a bad idea in my opinion.
If there are fields common to all the modules you're adding to the system, you should consider keeping those in a single table, then have other tables with the fields specific to a particular module related back to the primary key in the common table. This is basically table inheritance (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server) and will centralize the common module data and make it easier to query across modules.
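A brief sketch of that shape, with invented table names: the common table holds the shared fields, and each module table hangs its specific data points off the common primary key.

CREATE TABLE ModuleEntryCommon (
  entryId   int IDENTITY(1,1) PRIMARY KEY,
  clientId  int          NOT NULL,
  createdAt datetime     NOT NULL DEFAULT GETDATE(),
  createdBy nvarchar(50) NOT NULL
);

-- One table per custom module, holding only that module's data points.
CREATE TABLE ClientA_SafetyModule (
  entryId       int PRIMARY KEY
                REFERENCES ModuleEntryCommon (entryId),
  incidentCount int,
  notes         nvarchar(500)
);

-- Queries across modules only need the common table;
-- module detail comes in via a join when needed.
SELECT c.entryId, c.clientId, a.incidentCount
FROM ModuleEntryCommon c
JOIN ClientA_SafetyModule a ON a.entryId = c.entryId;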
In an n-tiered application where you are using custom entities, how do you find yourself handling data needed from lookup tables? Do you create entities for each of these lookup tables or employ some other strategy?
For example, I have a "Ratings" lookup table that will be used to populate a dropdownlist. Would you create a ratings object with a ratingid and rating property and pass that to your UI, or is there a more efficient way to go about it?
Appreciate your thoughts.
I'd suggest that the solution will depend on how often the lookup data changes, whether or not it needs to be editable, and whether or not you're enforcing referential integrity at the database. I think it makes the schema more understandable if you put each lookup type into a separate table.
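On the database side, a small sketch of what that separation can look like (the names are only illustrative), with referential integrity enforced by a foreign key:

CREATE TABLE Rating (
  ratingId int PRIMARY KEY,
  rating   nvarchar(50) NOT NULL UNIQUE
);

CREATE TABLE Review (
  reviewId int IDENTITY(1,1) PRIMARY KEY,
  -- The foreign key keeps the lookup data authoritative in the database.
  ratingId int NOT NULL REFERENCES Rating (ratingId),
  comments nvarchar(500)
);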
I generally don't create entities for each lookup table, but instead load most of the common lookups into structures that are easily re-used by the application. For an ASP.NET app, for example, I'll create hashtables or ordered dictionaries which can easily be bound to most web controls.
And, horror of horrors, I sometimes create a singleton to manage access to all these lookups, which can be stored as static vars or in the cache, depending on requirements.
We separate out the different lookup types into different objects. It seems to be a little more work up front, but it gives us the ability to make changes to each individual object when we need to, such as adding additional information to an object.
I am trying to use the ASP.NET Dynamic Data features to generate CRUD scaffolding for my data model. My model contains supertype/subtype relationships, so some logical entities are split between two tables: one for the generic data and one for the subtype-specific data.
In the LINQ context I expose these entities as a single class, backed by a view that joins the tables together. I have also created sprocs for insert/update/delete and configured the class behaviour to use them.
When I turn on scaffolding, it only generates a read-only view of this data. The add, edit and remove links don't show up. Why?
SOLVED: The problem was that I had not identified a primary key column on the LINQ classes after dragging the views onto the surface. After adding a PK, the CRUD functions showed up.