Using VB.NET to Detect Changes in a Web Page - asp.net

Again I come to you guys for your expertise and advice on an issue that I am having. I was wondering if any of you would know how to detect if a web page has been modified using VB.NET. I need to be able to set up a task which periodically (like once a week) scans the user inputted web pages and if the web page content has changed, I need to fire off an email to an individual that it has changed (not the exact location on the page itself). I'll be storing the HTTP status and of course the page data itself as well as the date of when it was last modified. Of course this needs to be very fault tolerant since it could be another week before the check runs again. Any help would be great. Thank you.
EDIT
New twist on this question, sorry. I had more time to think about what we wanted. Detecting ANY change on a web page would be kind of silly, since time-dependent elements of the page change every so often. Instead, what I would like to do is detect the documents linked from the page, for instance Excel files, Word docs, or PDFs, and notice when they change. So I'd run a hash on these documents, then on some sort of schedule check whether new documents have been added or the old documents have been modified. Any suggestions on how to detect the documents embedded in the page and run the hash? Thanks again!

As I mentioned in a comment, this sort of job is what checksums (also known as hash functions) were designed for.
Your code will look something like this:
- for each webpage of interest
  - pull webpage
  - calculate checksum of contents
  - is current checksum different to last checksum?
    - if yes, send email
  - store new checksum and other appropriate data
The .NET Framework has a number of hash algorithms available. The two most popular are MD5 and SHA-1.
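
The steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than VB.NET, and it makes a few assumptions: SHA-256 is used in place of MD5 or SHA-1, the `stored` dictionary is a hypothetical stand-in for whatever table holds the last checksum, and downloading the page and sending the email are left out. The same function works on the bytes of a downloaded Excel, Word, or PDF file.

```python
import hashlib


def page_checksum(content: bytes) -> str:
    """Return the SHA-256 hex digest of the raw page (or document) bytes."""
    return hashlib.sha256(content).hexdigest()


def has_changed(url: str, content: bytes, stored: dict) -> bool:
    """Compare the new checksum with the stored one, then store the new one.

    `stored` maps URL -> last known checksum (hypothetical stand-in for a
    database table). A URL seen for the first time is not reported as changed.
    """
    new_sum = page_checksum(content)
    old_sum = stored.get(url)
    stored[url] = new_sum
    return old_sum is not None and old_sum != new_sum
```

The weekly task would call `has_changed` once per URL and send the email whenever it returns `True`.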

In addition to the checksum option, there are also various diff functions that achieve this and provide much more information than changed = true/false. This question has more info:
How to tell when a web page has changed by x% in VB.net?
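
As a rough illustration of the diff idea, the sketch below estimates how much of a page's text has changed. It uses Python's standard `difflib.SequenceMatcher` similarity ratio, which is only one possible measure and not necessarily the one used in the linked question; the function name is made up for this example.

```python
import difflib


def percent_changed(old_text: str, new_text: str) -> float:
    """Estimate the change between two texts as a percentage (0 to 100).

    SequenceMatcher.ratio() returns a similarity between 0.0 and 1.0,
    so the change is taken as its complement.
    """
    ratio = difflib.SequenceMatcher(None, old_text, new_text).ratio()
    return round((1.0 - ratio) * 100, 2)
```

A threshold on this value (say, only notify above a few percent) is one way to ignore small time-dependent elements on the page.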

Related

Decoding device information from user agent

Recently, I've started saving the user agent string to the database when guests submit the form. There is a report that is generated weekly containing various statistics. I want to add the device and maybe the browser information to the report.
I was thinking of creating a new database table that would hold all the known user agent strings and have two extra fields, one for the device info and maybe one for the browser. However, I cannot find a site where you can download the strings. Would anyone know of a place?
If that cannot be done, I was thinking of a .NET alternative. How would I go about doing that in .NET?
Two ways to do it:
If you are using ASP.NET MVC, you could use the default this.Request.Browser within the controller method (it contains quite a lot of info, example here),
You can also use 51Degrees, which has a light and a complete device database to match device capabilities

I have multiple users, can I lock the web page so that only one user at a time can update a record?

Can anyone help or provide me with some suggestions for the below query.
I have a web form (Minutes of Meeting) and 8 users that need to access this web page and update their area. A user may have more than one area to update, and essentially I would like to somehow lock down the web page while a user is using it, so that no other user can update it until Joe Bloggs has finished with it.
I have an Active Directory security group set up to restrict the site to that group of users only, but I need to think of a solution to the above.
Is there a way I can do this via a web control or via SQL?
There must be better ways to do it. However, is it possible for you to introduce a SQL table column such as "UpdateInProgress" (bit)? Any update process checks that column. If it is 0, the process sets it to 1, saves its changes, and then sets it back to 0 so that the form is available for others to update. If the process sees 1, it cannot update the web form because an update is in progress.
I also suggest introducing another column named "UpdateInProgressBy" to record who has opened it for editing.
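
One detail worth getting right with a flag column is that the check and the set should happen in a single statement, otherwise two users can both read 0 and both proceed. The sketch below shows that idea using Python's built-in `sqlite3` as a stand-in for SQL Server; the `minutes` table, its `id` column, and the function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with the two suggested flag columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE minutes (id INTEGER PRIMARY KEY, body TEXT, "
        "UpdateInProgress INTEGER NOT NULL DEFAULT 0, UpdateInProgressBy TEXT)"
    )
    conn.execute("INSERT INTO minutes (id, body) VALUES (1, 'draft')")
    conn.commit()
    return conn


def try_acquire(conn: sqlite3.Connection, record_id: int, user: str) -> bool:
    """Set the flag only if it is currently 0; True means this user got the lock."""
    cur = conn.execute(
        "UPDATE minutes SET UpdateInProgress = 1, UpdateInProgressBy = ? "
        "WHERE id = ? AND UpdateInProgress = 0",
        (user, record_id),
    )
    conn.commit()
    return cur.rowcount == 1


def release(conn: sqlite3.Connection, record_id: int, user: str) -> None:
    """Clear the flag, but only if this user is the one holding it."""
    conn.execute(
        "UPDATE minutes SET UpdateInProgress = 0, UpdateInProgressBy = NULL "
        "WHERE id = ? AND UpdateInProgressBy = ?",
        (record_id, user),
    )
    conn.commit()
```

Because the `WHERE` clause includes `UpdateInProgress = 0`, only one of two competing updates can affect the row.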
First of all, note that a long time passes between the moment the user reads the data, gets it in a page, changes it, and then tries to write it back. So we are not talking about the lock command in SQL, nor any other lock that is held for milliseconds and helps to synchronize threads; here we must synchronize people and what they write.
There is also a problem if the user leaves the page for any reason, as this can leave the data locked forever.
This problem can be solved with two approaches.
The easy one: when a user tries to save data, check whether the same data has been changed in the meantime, and warn him, show a merge dialog, merge programmatically, or something similar. I do not know what you want.
The difficult one is to constantly monitor the page that reads and changes the data, and keep the monitoring results in a common table in the database. While a user is on the page, the other users get a warning and read-only data until that user leaves.
This monitoring must be done with JavaScript and must detect when a user abandons the page.
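
The monitoring approach can be thought of as a lease that the editing page keeps renewing. The sketch below is one possible server-side shape for it, kept in memory for brevity (a real version would live in the shared database table described above). The class name, the 60-second timeout, and the `heartbeat` method are all assumptions for illustration; the page's script would call the heartbeat periodically, and a user who abandons the page simply stops renewing.

```python
import time


class EditLease:
    """Tracks which user currently holds the right to edit one record."""

    def __init__(self, timeout_seconds: float = 60.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.holder = None
        self.last_seen = 0.0

    def _expired(self) -> bool:
        # No holder, or the holder has not sent a heartbeat recently enough.
        return self.holder is None or self.clock() - self.last_seen > self.timeout

    def heartbeat(self, user: str) -> bool:
        """Called on page load and then periodically by the page's script.

        Returns True if `user` holds the lease (may edit), or False if
        someone else does (show a warning and read-only data).
        """
        if self._expired() or self.holder == user:
            self.holder = user
            self.last_seen = self.clock()
            return True
        return False
```

The timeout is what prevents the "locked forever" problem: once heartbeats stop, the next user to arrive takes over.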
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
For more information, check this link:
http://msdn.microsoft.com/en-us/library/ms173763.aspx

Datasource best practices question

This is probably not a difficult project, but I'm not sure of the best way to handle it.
I have an ASP.Net page that needs to query a db for some info (a list of about 12 email addresses) that are used throughout the single-page application (basically a set of 8 buttons, each of which puts an entry into another DB table which includes a message [different for each button] and the email address [from the first db] the message should be sent to).
The list of addresses rarely changes. At what point should my application query the DB for the addresses? Doing it at the button press seems like a waste, since I'll be making the same query and obtaining the same results over and over. I was thinking of opening my datasource and using a SqlDataReader and storing the list of email addresses into a string array, but where is the best place to do that so the data is persisted, yet not queried repeatedly (as you may be able to tell, I am not great at ASP, and I'm still fuzzy on what the lifetime of variables are - application, session, or just while the page is processing).
Any help is appreciated.
Thanks
Adam
Use the Cache. Look for the GetProductData() method implementation in this page
In your data layer, put the results of the query into the cache.
In the method, you first check whether the cache entry exists. If it does not, you call the DB and populate the cache with the results. If it does, you return the cached values.
You need to look into using the Cache.
The Cache is most likely what you need:
http://msdn.microsoft.com/en-us/library/6hbbsfk6.aspx
The short version is that you can stick an object in there and set expiration conditions, such as having it expire after a fixed amount of time, after a certain amount of time goes by without it being accessed, when another value in the cache changes, or when the underlying data in the database changes.
I usually wrap my caching in properties/methods that will attempt to get the value from cache if it is present and then go back to the database if it is not (either has never been read before or has expired).
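
The "check the cache first, fall back to the database" wrapper described above can be sketched like this. It is a generic Python illustration, not the ASP.NET Cache API: `ExpiringCache`, `get_or_load`, and the fixed time-to-live are names and choices made up for the example, and only the simplest expiration condition (a fixed amount of time) is shown.

```python
import time


class ExpiringCache:
    """Very small cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock         # injectable so expiry can be tested
        self._entries = {}          # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling `loader()` only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = loader()            # e.g. the database query for the addresses
        self._entries[key] = (now + self._ttl, value)
        return value
```

Each button handler would then call something like `cache.get_or_load("emails", load_emails_from_db)`, where `load_emails_from_db` is a hypothetical function that runs the query.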

How to hack proof a data submission program

I am writing a score submission system for games where I need to ensure that reports back to the server are not falsified (aka, hacked).
I know that I can store a password or private passkey in the program to authenticate or encrypt the request but if the program is decompiled, a crafty hacker can extract the password/passkey and use it to falsify reports.
Does a perfect solution exist?
Thanks in advance.
No. All you can do is make it difficult for cheaters.
You don't say what environment you're running on, but it sounds like you're trying to solve a code authentication problem*: knowing that the code that is executing is actually what you think it is. This is a problem that has plagued online games forever and does not have a good solution.
Common ways in which such systems are broken:
Capture, modification and replay of submissions to the server
Modifying the binary to allow cheating
Using a debugger to modify the submission in-memory before the program applies signatures/encryption/whatever
Punkbuster is an example of a system which attempts to solve some of these problems: http://en.wikipedia.org/wiki/PunkBuster
Also consider http://en.wikipedia.org/wiki/Cheating_in_online_games
Chances are, this is probably too hard for your game. Hiding a signing key in your binary and signing everything that leaves it will probably put you well ahead of the pack, security-wise.
* Apologies, I don't actually remember what the formal name for this is. I keep thinking "running code authentication", but Google comes up with nothing for the term.
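
To make the signing suggestion concrete, here is a minimal sketch. It uses an HMAC with a shared secret rather than public-key signatures, purely because that is available in the Python standard library; the key value, field names, and function names are hypothetical. As noted above, anyone who extracts the key from the client can still forge submissions, so this only raises the bar. The nonce check addresses the capture-and-replay attack.

```python
import hashlib
import hmac
import json

# Hypothetical key embedded in the client; an attacker who decompiles
# the program can recover it, so this is a deterrent, not a guarantee.
SECRET_KEY = b"hypothetical-embedded-key"


def sign_submission(player: str, score: int, nonce: str, key: bytes = SECRET_KEY):
    """Client side: serialize the submission and compute its HMAC tag."""
    payload = json.dumps(
        {"player": player, "score": score, "nonce": nonce}, sort_keys=True
    ).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def verify_submission(payload: bytes, tag: str, seen_nonces: set,
                      key: bytes = SECRET_KEY) -> bool:
    """Server side: reject tampered payloads and replayed nonces."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False                      # payload or tag was modified
    nonce = json.loads(payload)["nonce"]
    if nonce in seen_nonces:
        return False                      # same submission sent twice
    seen_nonces.add(nonce)
    return True
```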
There is one thing you can do - record all of the user inputs and send those to the server as part of the submission. The server can then replay the inputs through a local copy of the game engine to determine the score. Obviously this isn't appropriate for every type of game, though. Depending on the game, you may need to include replay protection.
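
As a toy illustration of the replay idea, the sketch below has the server recompute the score from the recorded inputs and accept the submission only if it matches the claimed score. The "game" here (moving left and right along a line to collect coins) and all names are invented for the example; a real game would feed the inputs through its actual engine, which must be deterministic for this to work.

```python
# Hypothetical level layout: positions on a line that hold a coin.
COINS = frozenset({-1, 2, 3})


def replay_score(inputs: str) -> int:
    """Run the recorded inputs ('L' or 'R' per step) through the toy engine."""
    position = 0
    remaining = set(COINS)
    score = 0
    for move in inputs:
        if move == "R":
            position += 1
        elif move == "L":
            position -= 1
        else:
            raise ValueError(f"illegal input: {move!r}")
        if position in remaining:
            remaining.remove(position)   # each coin can be collected once
            score += 10
    return score


def accept_submission(inputs: str, claimed_score: int) -> bool:
    """Server side: trust the replayed score, not the claimed one."""
    try:
        return replay_score(inputs) == claimed_score
    except ValueError:
        return False
```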
Another method that may be appropriate for some types of games is to include a video recording of the high-scoring play within the submission. Provide links to the videos from the high score table, along with a link to report suspicious entries. This will let you "crowd-source" cheat detection - if a cheater's score hits the table at number 1, then the players behind scores 2 through 10 have a pretty big incentive to validate the video for you. If a score is reported enough times, you can check the video yourself and decide if it should be removed (and the user banned).

How to handle concurrency control in ASP.NET Dynamic Data?

I've been quite impressed with dynamic data and how easy and quick it is to get a simple site up and running. I'm planning on using it for a simple internal HR admin site for registering people's skills/degrees/etc.
I've been watching the intro videos at www.asp.net/dynamicdata and one thing they never mention is how to handle concurrency control.
It seems that DD does not handle it right out of the box (unless there is some setting I haven't seen) as I manually generated a change conflict exception and the app failed without any user friendly message.
Anybody know if DD handles it out of the box? Or do you have to somehow build it into the site?
Concurrency is not handled out of the box by DD.
One approach would be to implement this on the database side, by adding a "last updated" timestamp column (or other unique stamp, such as a GUID) to each table.
You then create an update trigger for each table. For each row being updated, is the "last updated" stamp passed in the same as the one on the row in the database?
If so, update the row, but give it a new "last updated" stamp.
If not, raise a specific "Data is out of date" exception.
On the client side, for each row you update, you'd need to refresh the "last updated" stamp.
In the client code you watch for the "Data is out of date" exception and display a helpful message to the user, asking them to refresh the data and re-submit their change.
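
The stamp comparison can be sketched as below. Note that this illustration puts the check in the `UPDATE` statement's `WHERE` clause instead of a trigger, and uses Python's built-in `sqlite3` in place of SQL Server; the `people` table, its columns, and `StaleDataError` are hypothetical names. The stamp is a GUID, one of the options mentioned above.

```python
import sqlite3
import uuid


class StaleDataError(Exception):
    """Raised when the row changed since the caller last read it."""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a 'last_updated' stamp column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, skill TEXT, last_updated TEXT)"
    )
    conn.execute(
        "INSERT INTO people (id, skill, last_updated) VALUES (1, 'SQL', ?)",
        (uuid.uuid4().hex,),
    )
    conn.commit()
    return conn


def read_stamp(conn: sqlite3.Connection, row_id: int) -> str:
    """Return the stamp the client must send back with its update."""
    return conn.execute(
        "SELECT last_updated FROM people WHERE id = ?", (row_id,)
    ).fetchone()[0]


def update_skill(conn, row_id: int, new_skill: str, stamp_read: str) -> str:
    """Update only if the stamp still matches; return the new stamp."""
    new_stamp = uuid.uuid4().hex
    cur = conn.execute(
        "UPDATE people SET skill = ?, last_updated = ? "
        "WHERE id = ? AND last_updated = ?",
        (new_skill, new_stamp, row_id, stamp_read),
    )
    conn.commit()
    if cur.rowcount == 0:
        raise StaleDataError("Data is out of date")
    return new_stamp
```

The calling code would catch `StaleDataError` and ask the user to refresh and re-submit, as described above.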
Hope this helps.
It all depends on what you mean by "out of the box". Of course you have to write a lot of code to handle concurrency, but some features help us to implement it.
My favorite model is "optimistic concurrency" based on the rowversion data type of SQL Server. It is like a "last updated" timestamp, but you do not need an update trigger for each table. SQL Server updates the corresponding "timestamp" column automatically every time the data in the table row is updated. I describe it in my old answer, Concurrency handling of Sql transaction. I hope it will be helpful for you.
I was under the impression that Dynamic Data does the update on the underlying data source. Maybe you can specify the concurrency model (pessimistic/optimistic) on the data meta model that gets registered in the App_Init section. But you would probably get an "unable to save changes" error, so the default would be pessimistic, last in loses...
Sorry for the late reply. Yes, DD is very strong when it comes to fast development of a project. Not only that, DD has been enhanced and included in .NET 4.0.
DD mostly works on LINQ to SQL. I suggest you have a look at that part.
In LINQ to SQL, when you go to the properties of a table, you will find a property that specifies whether to check the old value before updating to the new value. If you set that to true, I think your problem will be handled.
Wish you the best of luck.
Let's learn from each other.
The solution given by Binary Worrier works, and it is widely used on platforms that provide a GUI to merge the changes (e.g. source control programs, wiki engines, etc.). That way none of the users lose their changes. On the other hand, it requires a lot of code or the use of external components or DLLs.
If you are not happy with that, another approach is just to lock the record that is being edited. Nobody else will be able to edit that record until the user commits the changes or his session expires. It has pros and cons, but requires little code compared with the first option.

Resources