TYPO3 8.7 cf_cache_hash table growing rapidly - typo3-8.x

In a TYPO3 8.7 installation, the table cf_cache_hash is growing rapidly to several GB in size. What is the best way to identify what is filling the table so quickly?
The table cf_cache_hash_tags contains a lot of ident_MENUDATA entries.
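One way to narrow this down (a rough sketch, assuming the default caching-framework schema with identifier/tag columns in cf_cache_hash_tags and identifier/expires/content columns in cf_cache_hash) is to group the tag table to see which tags dominate, then size up the content behind the suspicious ones:

    -- Which tags account for the most cache entries?
    SELECT tag, COUNT(*) AS entries
    FROM cf_cache_hash_tags
    GROUP BY tag
    ORDER BY entries DESC
    LIMIT 20;

    -- How much content sits behind the MENUDATA entries?
    SELECT COUNT(DISTINCT h.identifier) AS entries,
           ROUND(SUM(LENGTH(h.content)) / 1024 / 1024, 1) AS content_mb
    FROM cf_cache_hash_tags t
    JOIN cf_cache_hash h ON h.identifier = t.identifier
    WHERE t.tag LIKE 'ident_MENUDATA%';

    -- Are entries piling up faster than they expire?
    SELECT FROM_UNIXTIME(expires) AS expires_at, COUNT(*) AS entries
    FROM cf_cache_hash
    GROUP BY expires_at
    ORDER BY entries DESC
    LIMIT 20;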

Related

Storage engine optimization for data warehousing

We've been deploying a midsize data warehouse with daily updates, a few fact tables, many dimensions, and even more on-demand reports programmed in a custom-built PHP framework. We've tuned the indexes about as far as they will go.
Now we wonder whether selectively moving fact and dimension tables to different storage engines would help. Most of the tables are InnoDB, and some log tables use CSV. Would it be beneficial to move the dimensions to Aria? The fact tables are fairly large, though none exceeds 200 billion records, and the dimensions are smaller than 1,000 records each. We are happy with the performance now, but the (fact) data is growing daily.
Any general thoughts?
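If you want to experiment, MariaDB lets you switch the engine per table, so you could convert one small dimension and compare timings before touching anything else. A minimal sketch (dim_customer below is a placeholder name for one of your dimension tables):

    -- See which engine each table uses and how large it is.
    SELECT table_name, engine,
           ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
    FROM information_schema.tables
    WHERE table_schema = DATABASE()
    ORDER BY size_mb DESC;

    -- Convert one small dimension table to Aria as a test, then benchmark your reports.
    ALTER TABLE dim_customer ENGINE = Aria;

Since the dimensions are tiny (under 1,000 rows), their engine choice is unlikely to be the bottleneck; measuring before and after is the only way to know for sure.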

Expediting database lookups using externally provided statistics during initialization

I have a large table that I cannot fit in memory, so I am considering moving it to a database such as SQLite. However, some of the lookups will be made very often, so those entries could stay in memory while the rest stay on disk. I want to speed up the table lookups by supplying some statistics during initialization, so that lookups hitting the disk are reduced. I have tried this with SQLite, but I have not found a way to supply it a list of external statistics. Any suggestions?
Thank you,
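If you go the SQLite route, one possible angle (a sketch; the lookup_table name, lookup_key column, and statistics values below are all placeholders) is to let ANALYZE create the sqlite_stat1 table and then overwrite its rows with the statistics you already know, while enlarging the page cache so frequently hit pages stay in memory:

    -- Index the lookup column and let SQLite gather its own statistics first.
    CREATE INDEX IF NOT EXISTS idx_lookup_key ON lookup_table(lookup_key);
    ANALYZE;

    -- The gathered stats live in sqlite_stat1 (tbl, idx, stat) and may be edited
    -- to feed in externally known row counts and selectivities.
    UPDATE sqlite_stat1
    SET stat = '50000000 10'      -- "total rows, avg rows per distinct key" (assumed values)
    WHERE tbl = 'lookup_table' AND idx = 'idx_lookup_key';
    -- Reopen the connection (or run ANALYZE sqlite_master;) so the planner reloads the edited stats.

    -- Give frequently hit pages room to stay in memory (negative cache_size = size in KiB).
    PRAGMA cache_size = -524288;     -- roughly 512 MiB of page cache
    PRAGMA mmap_size  = 1073741824;  -- 1 GiB of memory-mapped I/O, where the platform allows it

Note this tunes the query planner and the cache, not which specific rows stay resident; SQLite decides that on its own based on access patterns.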

How to change the PI of an existing huge table in Teradata?

We have a huge table in Teradata holding transactional data, approximately 13 TB in size.
As is typical for an older data warehouse, we have now found that the PI chosen for this table in the past is not the best, and because of this we are facing a lot of performance issues with the SQL pulling data from this table.
The first idea we wanted to implement was to load the data into a temp table, then either alter the PI of the existing table or create a new table with the new PI and load the data from the temp table into the new/altered table.
But the challenge is that the table is live, and that solution is not ideal because of the size of the data movement. We also thought about a delta load (into a new table) combined with a delta delete (from the main table), but that is not a great workaround either, because the table is live and it would involve a lot of effort in data movement as well as in matching the row counts between the source and target tables.
I need your ideas on this scenario: how can I change the PI of this table with minimal effort and without any system downtime?
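One common pattern (a rough sketch only; trans_table, new_pi_col, and the load_ts watermark below are placeholders for your actual names) is to build the new-PI copy while the original stays live, catch up with a small delta, and then swap names during a short cut-over:

    -- 1. Build a copy of the table with the new primary index while the original stays live.
    CREATE TABLE trans_table_new AS
      (SELECT * FROM trans_table)
    WITH DATA
    PRIMARY INDEX (new_pi_col);

    -- 2. During the cut-over window, copy only the rows that arrived since step 1.
    INSERT INTO trans_table_new
    SELECT *
    FROM trans_table t
    WHERE t.load_ts > TIMESTAMP '2018-01-01 00:00:00';  -- placeholder watermark

    -- 3. Swap names so applications keep using the original table name.
    RENAME TABLE trans_table     TO trans_table_old;
    RENAME TABLE trans_table_new TO trans_table;

This still moves 13 TB once, but the heavy copy runs while the system is up; only the final delta and the renames need a brief freeze on writes.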

Is it possible to prototype server/SQL performance on paper for various loads?

I am trying to figure out whether a web development project is currently feasible. So far I have learned that the total size of the proposed database (30 million rows, 5 columns, about 3 GB of storage) is well within my budget in terms of storage requirements. However, because of the large number of queries users are expected to run against the database, I am not sure whether the server will be able to handle the load with adequate performance (within my budget).
I will be using this grid (a live demo of performance benchmarks for 300,000 rows: http://demos.telerik.com/aspnet-ajax/grid/examples/performance/linq/defaultcs.aspx). Entering a search term in the "product name" box and pressing enter takes 1.6 seconds from query to rendered results. It seems to me (a newbie) that if 300,000 rows take 1.6 seconds, 30 million rows must take much longer, so I am trying to figure out:
what the increase in time would be as more rows are added, up to 30 million;
what the increase in time would be for each additional 1,000 people using the search grid at the same time;
what hardware is necessary to reduce the delays to an acceptable level.
Hopefully, if I can figure that out, I can get a more realistic assessment of feasibility. FYI: the database does not need to be updated very often; it is mostly for read-only purposes.
Can this problem be prototyped on paper for these 3 points?
Even wide ballpark estimates would help: without considering optimisation, am I talking hundreds, thousands, or tens of thousands of dollars for 5,000 users to have searches that take under 10 seconds each?
[It will be an ASP.NET RadControls for AJAX Grid, on one of these cloud-hosted servers: 4,096 MB RAM, 160 GB disk space, and either Microsoft SQL Server 2008 R2 or SQL Server 2012.]
Your search filters allow for substring searches, so database indexes are not going to help you; the search will go row by row.
It looks like your data would probably fit in 5 GB of memory or so. I would store the whole thing in memory and search there.
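To illustrate the substring point (a sketch for SQL Server, since that is what you listed; the Products table, ProductName column, and PK_Products index are made-up names): a leading-wildcard LIKE forces a scan, while a full-text index can answer word and prefix searches without touching every row, though it will not match arbitrary mid-word substrings.

    -- A leading wildcard defeats a B-tree index: every row must be examined.
    SELECT ProductID, ProductName
    FROM Products
    WHERE ProductName LIKE '%chai%';

    -- A full-text index (requires the full-text feature to be installed) handles
    -- word and prefix searches without scanning the whole table.
    CREATE FULLTEXT CATALOG ProductCatalog;
    CREATE FULLTEXT INDEX ON Products (ProductName)
      KEY INDEX PK_Products ON ProductCatalog;

    SELECT ProductID, ProductName
    FROM Products
    WHERE CONTAINS(ProductName, '"chai*"');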

Database design question: How to handle a huge amount of data in Oracle?

I have over 1,500,000 data entries, and the number will increase gradually over time. This huge amount of data will come from 150 regions.
Should I create 150 tables to manage this growing amount of data? Will that be efficient? I need fast operations. ASP.NET and Oracle will be used.
If all the data has the same shape, don't split it into different tables. Take a look at Oracle's table partitions. One hundred fifty partitions (or more), split out by region, is probably more in line with what you're going to be looking for.
I would also recommend you look at the Oracle Database Performance Tuning Tips & Techniques book and browse Ask Tom on Oracle's website.
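As a rough sketch of what that could look like (the measurements table and its columns below are placeholder names, not your actual schema), a list-partitioned table keyed on the region lets a query that filters on one region read only that region's partition:

    CREATE TABLE measurements (
      measurement_id  NUMBER  NOT NULL,
      region_id       NUMBER  NOT NULL,
      measured_at     DATE    NOT NULL,
      amount          NUMBER
    )
    PARTITION BY LIST (region_id) (
      PARTITION region_001 VALUES (1),
      PARTITION region_002 VALUES (2),
      -- ... one partition per region, up to 150 ...
      PARTITION region_other VALUES (DEFAULT)
    );

    -- Prunes to a single partition instead of scanning all 1.5 million rows.
    SELECT COUNT(*) FROM measurements WHERE region_id = 42;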
Only 1.5M rows? Not a lot, really.
Use one table; working out how to write a 150-way union across 150 tables would be murder.
1.5 million rows doesn't really seem like that much. How many people are accessing the table(s) at any given point? Do you have any indexes set up? If you expect it to grow much larger, you may want to look into database partitioning.
FWIW, I work with databases with 100M+ rows on a regular basis. It shouldn't be that bad unless you have thousands of people using it at a time.
One table per region is far from normalized; you're probably going to lose a lot of efficiency there. One table per data-entry site is pretty unusual too. Normalization is huge; it will save you a ton of time down the road, so I'd make sure you're not storing any duplicate data.
If you're using Oracle, you shouldn't need multiple tables. It will support a lot more than 1.5 million rows. If you need to speed up data access, you can try a snowflake schema to pull together commonly accessed data.
If you mean 1,500,000 rows in a table, then you do not have much to worry about. Oracle can handle much larger loads than that with ease.
If you need to identify the region that each row came from, you can create a Region table and tie its ID to the big data table.
IMHO, you should post more details so we can help you better.
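A minimal sketch of that region lookup (all names below are placeholders): the big table stores only the region's ID, and a foreign key plus an index keep the reference cheap and the data normalized.

    CREATE TABLE region (
      region_id    NUMBER        PRIMARY KEY,
      region_name  VARCHAR2(100) NOT NULL
    );

    -- The big data table references the region instead of repeating its details.
    CREATE TABLE data_entry (
      entry_id    NUMBER  PRIMARY KEY,
      region_id   NUMBER  NOT NULL REFERENCES region (region_id),
      entered_at  DATE,
      payload     VARCHAR2(4000)
    );

    CREATE INDEX idx_data_entry_region ON data_entry (region_id);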
A database with 2,000 rows can be slow. It all depends on your database design, indexes, keys, and most importantly the hardware configuration your database server is running on. How your application uses the data also matters: is it a read-intensive or a transaction-intensive database? There is no single right answer to what you are asking right now.
You first need to consider which operations are going to access the table. How will inserts be performed? Will existing rows be updated, and if so, how? By how much will the rows grow, and what percentage of them will grow? Will rows get deleted, and by what criteria? How will you be selecting data? By what criteria, and how many rows per query?
Data partitioning can be used for volumes of data much larger than 1.5 million rows. Also look into optimizing the SQL queries, batch processing, and how the data is stored.
