Summarize Historical Uptime Data - asp.net

I'm finishing my first asp.net web app and I've encountered a difficult problem. The web app is designed to test network devices at various locations across the country and record the response time. A Windows service checks these devices regularly, typically every 1-10 minutes. The results of each check are then recorded in a SQL Server table with this design. (ResponseTime is NULL when the device is down.)
CREATE TABLE [dbo].[DeviceStatuses] (
[DeviceStatusID] INT IDENTITY (1, 1) NOT NULL,
[DeviceID] INT NOT NULL,
[StatusTime] DATETIME NULL,
[ResponseTime] INT NULL,
CONSTRAINT [PK_DeviceStatuses] PRIMARY KEY CLUSTERED ([DeviceStatusID] ASC),
CONSTRAINT [FK_DeviceStatuses_Devices] FOREIGN KEY ([DeviceID]) REFERENCES [dbo].[Devices] ([DeviceID])
);
The service has been running for a couple of months with a minimal number of devices, and the table has about 500,000 rows. The client would like to have access to a 3-month rolling downtime summary for each device. Something along the lines of:
Down Times:
12/11/2012 3:20 PM - 3:42 PM
12/20/2012 1:00 AM - 9:00 AM
To the best of my understanding I need to get the StatusTime for the beginning and end of each block of NULL ResponseTimes, for a particular DeviceID of course. I've done several searches on Google and StackOverflow, but haven't found anything that resembles what I'm trying to do. (Maybe I'm not using the right search terms.) My brother, a much more experienced programmer, suggested that I might be able to use a CURSOR in SQL Server, though he acknowledged that CURSOR performance is terrible and it would need to be a scheduled task. Any recommendations?

You can group each run of NULL ResponseTimes by the last successful check that precedes it, then take the MIN and MAX StatusTime per DeviceID and group:
declare @DeviceStatuses table(
[DeviceStatusID] INT IDENTITY (1, 1) NOT NULL,
[DeviceID] INT NOT NULL,
[StatusTime] DATETIME NULL,
[ResponseTime] INT NULL)
Insert into @DeviceStatuses([DeviceID],[StatusTime],[ResponseTime])
Values
(1,'20120101 10:10',2),(1,'20120101 10:12',NULL),(1,'20120101 10:14',2),
(1,'20120102 10:10',2),(1,'20120102 10:12',NULL),(1,'20120102 10:14',2),
(2,'20120101 10:10',2),(2,'20120101 10:12',NULL),(2,'20120101 10:14',2),
(2,'20120101 10:19',2),(2,'20120101 10:20',NULL),(2,'20120101 10:21',NULL),(2,'20120101 10:22',2),
(2,'20120102 10:10',2),(2,'20120102 10:12',NULL),(2,'20120102 10:14',2);
Select [DeviceID],MIN([StatusTime]) as StartDown,MAX([StatusTime]) as EndDown
from
(
Select [DeviceID],[StatusTime]
,(Select MAX([StatusTime]) from @DeviceStatuses s2 where s2.DeviceID=s1.DeviceID and s2.StatusTime<s1.StatusTime and s2.ResponseTime is not null) as gr
from @DeviceStatuses s1
where s1.ResponseTime is null
)a
Group by [DeviceID],gr
order by [DeviceID],gr
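If you are on SQL Server 2012 or later, the same gaps-and-islands idea can be expressed with a window function instead of the correlated subquery, which tends to scale better on a large DeviceStatuses table. This is only a sketch against the sample data above, not something benchmarked at your 500,000-row scale:

Select [DeviceID],MIN([StatusTime]) as StartDown,MAX([StatusTime]) as EndDown
from
(
Select [DeviceID],[StatusTime],[ResponseTime]
,MAX(case when [ResponseTime] is not null then [StatusTime] end)
     OVER (PARTITION BY [DeviceID] ORDER BY [StatusTime]
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as gr
from @DeviceStatuses
)a
where [ResponseTime] is null
Group by [DeviceID],gr
order by [DeviceID],gr

The running MAX carries the StatusTime of the last successful check forward over each row, so every NULL row in the same outage gets the same gr value.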


How to conduct a fully synchronous multi-step database operation?

Current project:
.NET Framework 4.7
MVC 5
C# 7.1
Repository pattern that uses LINQ lambda expressions for CRUD operations
I have a bit of a problem with regard to what might seem like a concurrency issue, but really isn't.
You see, the system I am building allows a user to register for a class. Each class has a certain capacity, and there needs to be the ability to have people sitting in both Enrolled and Waitlisted states.
The Registration table handles this with an Enrolled boolean: true for enrolled, false for waitlisted. The problem is that during the registration process I need a query that counts the existing enrolled users, sees how many open spots remain by comparing the Enrolled=true count to the capacity value of the class (stored in a different table), and, if there are any open spots, registers the user with the Enrolled flag set to true. If there are no spots left, the user is registered with the Enrolled flag set to false.
The problem exists when there is one spot left and two users sign up simultaneously (or close enough for the system to be working on one operation at the same time as the other). I have seen timestamps on the existing Registration table, and there are times when two users are very, very close together in having their data entered.
I need a system, either in SQL or in MVC, that will only do one user registration at a time. There must be no chance whatsoever that a second registration (the query to see how many Enrolled=true rows there are in the Registrations table) starts until the saving of the first user's registration is done. In no case should the number of Enrolled=true registrations ever exceed the capacity of the class that the user is enrolling in.
In other words, the actual query-count-compare-decide-record process (at least two touches of the database: one to query, one to record) needs to be absolutely SYNCHRONOUS, and essentially block all other registration attempts until the process is done. Because this will be done in one block of code, I can safely say that the table will never be "locked" for any longer than it actually takes to run that code; this process is, after all, relevant user interaction. But since this is a read (counting all the current registrations where Enrolled=true, to determine whether the current write needs to be Enrolled=true or =false) followed by a write, I am worried about a second read occurring between the read and the write of the first, with the second write then landing after the first and leaving the Enrolled=true count in an inconsistent state.
Since the number of users who are signing up will only ever be small(ish), I am not overly concerned about performance, but I am stumped on how to actually implement this.
Suggestions?
IMO, the most important thing is to implement the validation in the database, at the very least. Don't allow the data to violate a rule like this.
I think you should just use an insert/update trigger and block a statement if it violates your constraint. It's the same as implementing a check constraint, but gives you the flexibility to check the class capacity, count the number of students in the class, and throw an error (rolling back the transaction) if you ever exceed the student limit in a course.
The trigger executes in the same transaction as the DML statement that triggers it, so if you THROW an error, you roll it all back. Something like this should work:
CREATE TABLE class (class_id INT, capacity INT)
GO
CREATE TABLE registration (class_id INT, student_id INT, enrolled BIT)
GO
CREATE TRIGGER i_registration
ON registration
FOR INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF EXISTS (
        SELECT * FROM (
            SELECT SUM(CASE WHEN enrolled = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY r.class_id) enrolled, c.capacity
            FROM registration r
            INNER JOIN class c ON r.class_id = c.class_id
        ) sq WHERE enrolled > capacity
    )
        THROW 51000, 'Class is full!', 1;
END
GO
And some sample DML statements:
insert into class values (1, 5)
insert into registration values (1, 1, 1)
insert into registration values (1, 2, 1)
insert into registration values (1, 3, 0)
insert into registration values (1, 4, 0)
insert into registration values (1, 5, 1)
insert into registration values (1, 6, 1)
insert into registration values (1, 7, 1)
insert into registration values (1, 8, 1) -- blocked!
update registration set enrolled = 1 where student_id = 3 -- blocked!
When you call SaveChangesAsync() in your client app, it's going to raise an exception (System.Data.Entity.Infrastructure.DbUpdateException), and you'll be able to see your exception message and number by walking through the inner exceptions until you find a System.Data.SqlClient.SqlException. You can use these to decide how to present the error message to the user.
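If you also want to serialize the check-and-insert itself, rather than relying solely on the trigger to reject the overflow, one option is to do both steps in a single transaction and take an update lock on the class row while counting. This is only a sketch of that pattern; the procedure and parameter names are made up for illustration:

CREATE PROCEDURE dbo.RegisterStudent
    @class_id INT,
    @student_id INT
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    BEGIN TRANSACTION;

    -- Locking the class row serializes concurrent registrations for the
    -- same class: a second caller blocks here until the first commits.
    DECLARE @capacity INT =
        (SELECT capacity
           FROM class WITH (UPDLOCK, HOLDLOCK)
          WHERE class_id = @class_id);

    DECLARE @enrolled INT =
        (SELECT COUNT(*)
           FROM registration
          WHERE class_id = @class_id AND enrolled = 1);

    -- Enroll if there is still room, otherwise waitlist.
    INSERT INTO registration (class_id, student_id, enrolled)
    VALUES (@class_id, @student_id,
            CASE WHEN @enrolled < @capacity THEN 1 ELSE 0 END);

    COMMIT TRANSACTION;
END

Because the second registration cannot read the class row until the first one commits, the count it performs already includes the first insert, so the Enrolled=true count can never overshoot the capacity.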

Improving SQLite Query Performance

I have run the following query in SQLite and SQL Server. On SQLite the query has never finished running - I have let it sit for hours and it still continues to run. On SQL Server it takes a little less than a minute to run. The table has several hundred thousand records. Is there a way to improve the performance of the query in SQLite?
update tmp_tbl
set prior_symbol = (select o.symbol
from options o
where o.underlying_ticker = tmp_tbl.underlying_ticker
and o.option_type = tmp_tbl.option_type
and o.expiration = tmp_tbl.expiration
and o.strike = (select max(o2.strike)
from options o2
where o2.underlying_ticker = tmp_tbl.underlying_ticker
and o2.option_type = tmp_tbl.option_type
and o2.expiration = tmp_tbl.expiration
and o2.strike < tmp_tbl.strike));
Update: I was able to get what I needed done using some Python code and handling the data mapping outside of SQL. However, I am puzzled by the performance difference between SQLite and SQL Server - I was expecting SQLite to be much faster.
When I ran the above query initially, neither table had any indexes other than a standard primary key, id, which is unrelated to the data. I created two indexes as follows:
create index options_table_index on options(underlying_ticker, option_type, expiration, strike);
and:
create index tmp_tbl_index on tmp_tbl(underlying_ticker, option_type, expiration, strike);
But that didn't help. The query still continues to clock without any output - I let it run for nearly 40 minutes.
The table definition for tmp_tbl is:
create table tmp_tbl(id integer primary key,
symbol text,
underlying_ticker text,
option_type text,
strike real,
expiration text,
mid real,
prior_symbol real,
prior_premium real,
ratio real,
error_flag bit);
The definition of the options table is similar but with a few more fields.
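For what it's worth, one rewrite that often helps SQLite with this kind of statement is collapsing the nested max() lookup into a single correlated subquery ordered by strike, so each outer row needs only one probe of the options_table_index created above. This is just a sketch of that idea, not something benchmarked on this data:

update tmp_tbl
set prior_symbol = (select o.symbol
                      from options o
                     where o.underlying_ticker = tmp_tbl.underlying_ticker
                       and o.option_type = tmp_tbl.option_type
                       and o.expiration = tmp_tbl.expiration
                       and o.strike < tmp_tbl.strike
                     order by o.strike desc
                     limit 1);

With the composite index ending in strike, SQLite can walk the index backwards and stop at the first qualifying row, instead of evaluating a separate MAX() subquery for every row of tmp_tbl.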

SQLite slow but barely using machine resources

I have a 500MB sqlite database of about 5 million rows with the following schema:
CREATE TABLE my_table (
id1 VARCHAR(12) NOT NULL,
id2 VARCHAR(3) NOT NULL,
date DATE NOT NULL,
val1 NUMERIC,
val2 NUMERIC,
val3 NUMERIC,
val4 NUMERIC,
val5 INTEGER,
PRIMARY KEY (id1, id2, date)
);
I am trying to run:
SELECT count(ROWID) FROM my_table
The query has now been running for several minutes, which seems excessive to me. I am aware that SQLite is not optimized for count(*)-type queries.
I could accept this if at least my machine appeared to be hard at work. However, my CPU load hovers somewhere around 0-1%. "Disk Delta Total Bytes" in Process Explorer is about 500.000.
Any idea if this can be sped up?
You should have an index on any field you query like this, e.g. create index tags_index on tags(tag); then the query should be faster. Secondly, try normalizing your table and test again without the index, and compare the results.
In most cases, count(*) would be faster than count(rowid).
If you have a (non-partial) index, computing the row count can be done faster with that because less data needs to be loaded from disk.
In this case, the primary key constraint already has created such an index.
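To check which access path SQLite actually picks for the count, EXPLAIN QUERY PLAN is a quick sanity check; a minimal sketch:

EXPLAIN QUERY PLAN SELECT count(*) FROM my_table;
-- On a table like the one above, the plan should report a scan over a
-- covering index (the implicit index backing the composite primary key)
-- rather than a scan of the whole table.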
I would look at your disk I/O if I were you; I would guess it is quite high. Considering the size of your database, some of the data must be read from disk, which makes the disk the bottleneck.
Two ideas from my rudimentary knowledge of SQLite.
Idea 1: If memory is not a problem in your case and your application is launched once and runs several queries, I would try to increase the amount of cache used (there's a cache_size pragma available). After a bit of googling I found this link about SQLite tweaking: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
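As a rough illustration of Idea 1, the cache can be enlarged per connection with a pragma; the value below is an arbitrary example, not a tuned recommendation:

-- A negative cache_size is interpreted as a size in KiB, so this asks
-- for roughly 256 MB of page cache on the current connection.
PRAGMA cache_size = -262144;

-- Run the query on the same connection; the larger cache mainly helps
-- repeated reads over the same pages.
SELECT count(*) FROM my_table;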
Idea 2: I would try to have an autoincremented primary key (on a single column) and tweak the query to SELECT COUNT(DISTINCT row_id) FROM my_table; this could force the counting to run only on what's contained in the index.

Access 2010 Subform: The data was added to the database but the data won't be displayed

I have a strange one here that I just can't seem to figure out.
My Access front-end project runs on a SQL Server 2005 Express backend.
I have been using subforms for donkey's years, and it's the only reason why I haven't migrated the application to a VB/VS front end.
However, since upgrading to Access 2010 I cannot get subforms to work. Instead, when I try to add a row, I get the following error: "The data was added to the database but the data won't be displayed in the form because it doesn't satisfy the criteria in the underlying record source."
The master and child forms are linked on poid and PONo.
I have created forms from scratch with all defaults, but still the issue remains.
My SQL tables are
PURCHASE:
poid, int, PK, Identity, seed 1, inc 1
supplierID, int
orderdate, DateTime
deliverydate, datetime
ordersent, bit
ordercomplete, bit
initials, nvarchar
supplierinvoiceno, nvarchar
branchid, int
bookedin, bit
deliverycharge, money
[STOCK - Detail]:
stockid, int, PK, Identity, Seed 1, inc 1
CodeID, int
service, bit
costprice, money
PONo, int
Instock, bit
SerialNo, char
StockTake, bit
Branch, Char
ProductID, int
Any help would be very much appreciated.
Many thanks,
Abe
Solved! Access 2010 does not support multiple tables with identical column names unless they are accessed through a stored procedure / query on the SQL Server side.
I've been trying to move away from stored procs & queries, but A2010 will not, under any combination, work with hard-coded SQL as the record source.
Once I created a query and selected it as the record source, the subforms worked perfectly, as expected.
Also, I had to alias any fields that have the same name in both tables, EVEN if they are not selected in the query. And yes, the alias only worked in the query too!
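For illustration only, the saved query used as the subform's record source ends up looking something like the sketch below; the alias is a hypothetical example of renaming a column that clashes with one on the parent table, not the actual field list:

SELECT s.stockid, s.CodeID, s.costprice, s.PONo, s.ProductID,
       s.Branch AS StockBranch
FROM [STOCK - Detail] AS s;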
I love Microsoft! ;-)

Seeking advice on how to structure a SQL Server 2008 DB table with a large amount of data?

I am planning a web application (programmed using ASP.NET) that manages a database of logged events. The database will be hosted in SQL Server 2008. Each event may come from a set of, let's call them, "units." A user will be able to add and remove these "units" via the ASP.NET interface.
Each of the "units" can potentially log up to a million entries, or maybe even more. The cut-off will be administered via a date, for instance:
DELETE FROM [tbl] WHERE [date] < '01-01-2011'
The question I have is: what is the best way to structure such a database?
By placing all entries for all "units" in a single table like this:
CREATE TABLE tblLogCommon (id INT PRIMARY KEY,
idUnit INT,
dtIn DATETIME2, dtOut DATETIME2, etc INT)
Or, by separating tables for each "unit":
CREATE TABLE tblLogUnit_1 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
CREATE TABLE tblLogUnit_2 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
CREATE TABLE tblLogUnit_3 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
--and so on
CREATE TABLE tblLogUnit_N (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
Approach #1 seems simpler from the standpoint of referencing entries, because with approach #2 I'll have to deal with a variable number N of tables (as I said, users will be allowed to add and remove "units").
But approach #1 may render access to those log entries later very inefficient. I will have to generate reports from those logs via the ASP.NET interface.
So I'd like to hear your take on this before I begin coding?
EDIT: I didn't realize that the number of columns in a table makes a difference. My bad! The actual number of columns in a table is 16.
I would go with approach 1, as the table does not seem very large (width-wise) and you could apply indexes to improve searching/selecting.
Further to this, you could also look at partitioned tables and indexes.
Creating Partitioned Tables and Indexes
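For reference, a minimal sketch of what date-based partitioning could look like for the single-table approach; the monthly boundary values and filegroup choice here are illustrative assumptions, not recommendations (and note that partitioning in SQL Server 2008 requires Enterprise edition):

-- Hypothetical example: partition the single log table by month on dtIn.
CREATE PARTITION FUNCTION pfLogByMonth (DATETIME2)
AS RANGE RIGHT FOR VALUES ('2011-01-01', '2011-02-01', '2011-03-01');

CREATE PARTITION SCHEME psLogByMonth
AS PARTITION pfLogByMonth ALL TO ([PRIMARY]);

-- Same columns as approach #1, but rows are spread across partitions,
-- so old months can be switched out instead of running a large DELETE.
CREATE TABLE tblLogCommon (
    id INT IDENTITY(1,1) NOT NULL,
    idUnit INT NOT NULL,
    dtIn DATETIME2 NOT NULL,
    dtOut DATETIME2 NULL,
    etc INT NULL
) ON psLogByMonth (dtIn);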
Splitting into separate tables is going to yield better insert and search speed.
With one table, the difference is an index on idUnit. With that index, search speed is going to be nearly as fast as with separate tables (and you can search across idUnits in a single query). Where one table is going to take a hit is on inserts, but that is a small hit.
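For concreteness, the index meant here might look like the sketch below; the included columns are an assumption based on the sample schema in the question:

CREATE NONCLUSTERED INDEX IX_tblLogCommon_idUnit_dtIn
    ON tblLogCommon (idUnit, dtIn)
    INCLUDE (dtOut, etc);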
A lot depends on how you intend to use this data. If you split the data into multiple tables, will you be querying over multiple tables, or will all your queries be within the defined date range? How often will data be inserted and updated?
In other words, there's no correct answer!
Also, can you afford a license for SQL Server Enterprise in order to use partitioned tables?
I did some tests on the actual data with SQL Server 2008 Express, using a local connection with no network latency. The computer this was tested on: desktop, Windows 7 Ultimate, 64-bit; CPU: i7, 2.8 GHz, 4 cores; RAM: 8 GB; HDD (OS): 1 TB, 260 GB free.
First all records were located in a "SINGLE" table (approach #1). All records were generated with random data. A complex SELECT statement processing each particular "unitID" was tried two times (one immediately after another), with CPU load: 12% to 16%, RAM load: 53% - 62%. Here's the outcome:
UnitID NumRecords Complex_SELECT_Timing
1 486,810 1m:26s / 1m:13s
3 1,538,800 1m:13s / 0m:51s
4 497,860 0m:30s / 0m:24s
5 497,860 1m:20s / 0m:50s
Then the same records were separated into four tables with identical structure (approach #2). I then ran the same SELECT statement two times as before, on the same PC, with identical CPU and RAM loads. Next are the results:
Table NumRecords Complex_SELECT_Timing
t1 486,810 0m:19s / 0m:12s
t3 1,538,800 0m:42s / 0m:38s
t4 497,860 0m:03s / 0m:01s
t5 497,860 0m:15s / 0m:12s
I thought I'd share this with whoever is interested. This pretty much gives you the answer...
Thanks everyone who contributed!
