I have written Python (Requests) and Java code to scrape data from a website. It works by continuously refreshing the site to pick up new data.
But the website recently identified my scraper as an automated service and locked my account. Is there any way to disguise these refreshes so that I can get new data without the account being locked?
It depends on the website. In any case, a scraper only simulates user behavior, so it can still be detected and blocked.
If the website detects timed tasks, one solution might be to randomize your application's refresh interval.
If the website presents a CAPTCHA, there is no easy solution.
If the website just counts the visits from a particular IP address, you might set up a dynamic proxy server to simulate requests from other IPs.
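As a rough illustration of the randomized refresh interval mentioned above, here is a minimal Python sketch. The function names, the 60-second base and the 20-second jitter are all assumptions made for the example, not values taken from the question; `fetch` stands in for whatever call your scraper already makes.

```python
import random
import time


def next_delay(base_seconds: float, jitter_seconds: float, rng: random.Random) -> float:
    """Return a wait time drawn uniformly from [base - jitter, base + jitter], never below zero."""
    low = max(0.0, base_seconds - jitter_seconds)
    high = base_seconds + jitter_seconds
    return rng.uniform(low, high)


def poll(fetch, rounds, base_seconds=60.0, jitter_seconds=20.0, sleep=time.sleep, rng=None):
    """Call fetch() `rounds` times, sleeping a randomized interval after each call."""
    rng = rng or random.Random()
    results = []
    for _ in range(rounds):
        results.append(fetch())
        sleep(next_delay(base_seconds, jitter_seconds, rng))
    return results
```

Passing `sleep` and `rng` as parameters is only there to make the loop easy to test. Randomized timing alone may not be enough if the site also looks at other signals.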
I set up a free domain on 000webhost.com
I am using this as a web server to receive data from a SIM908 + Arduino setup and store it in the database, then display it on a web page.
I am sending the data from the SIM908 using HTTP GET requests. Basically I am sending two pieces of information: the location (latitude and longitude) and a string. Both are sent using GET requests. The problem is very unusual, so bear with me. Everything works fine for a while. After several GET requests are sent, for some reason 000webhost just deactivates my domain and I simply cannot access it: every time I try to browse to the page, it times out. It stays like this for around 7-8 hours, after which the domain works fine again. I tried another host, byethost.com, but GET requests from the SIM908 do not work there at all. On my side everything is fine: the code and the Arduino setup both work. My question is: why is 000webhost disabling my domain? I really need a good answer or at least some direction; I am completely lost.
NOTE: Please don't suggest the POST method unless you explicitly know how to perform a POST operation using SIM908 AT commands; as far as I know it's not possible.
You are using a free web host, which has limitations. They will block you if your site gets too many requests. Read the limitations that apply to free accounts on that host.
Look for a better free service or buy one. There is no issue with the SIM900 or the Arduino.
The following hosting providers might be better than the one you are currently using in terms of limitations:
Host Buddy (you would get two months free)
Free Hostia
Free Hosting .eu
I build ASP.NET websites (hosted under IIS 6 usually, often with SQL Server backends and forms authentication).
Clients sometimes ask if I can check whether there are people currently browsing (and/or whether there are users currently logged in to) their website at a given moment, usually so they can safely do a deployment (they want a hotfix, for example).
I know the web is basically stateless so I can't be sure whether someone has closed the browser window, but I imagine there'd be some count of not-yet-timed-out sessions or something, and surely logged-in-users...
Is there a standard and/or easy way to check this?
Jakob's answer is correct but does rely on installing and configuring the Membership features.
A crude but simple way of tracking users online would be to store a counter in the Application object. This counter could be incremented/decremented upon their sessions starting and ending. There's an example of this on the MSDN website:
Session-State Events (MSDN Library)
Because the default session timeout is 20 minutes, a user who simply closes the browser is still counted until the session expires, so the accuracy of this method isn't guaranteed (but then that applies to any web application due to the stateless and disconnected nature of HTTP).
I know this is a pretty old question, but I figured I'd chime in. Why not use Google Analytics and view their real time dashboard? It will require minor code modifications (i.e. a single script import) and will do everything you're looking for...
You may be looking for the Membership.GetNumberOfUsersOnline method, although I'm not sure how reliable it is.
Sessions, suggested by other users, are a basic way of doing things, but are not too reliable. They work well in some circumstances but not in others.
For example, if users are downloading large files, watching videos or listening to podcasts, they may stay on the same page for hours (unless the requests for the binary data are tracked by ASP.NET too), but are still using your website.
Thus, my suggestion is to use the server logs to detect if the website is currently used by many people. It gives you the ability to:
See what sort of requests are done. It's quite easy to detect humans and crawlers, and with some experience, it's also possible to see if the human is currently doing something critical (such as writing a comment on a website, editing a document, or typing her credit card number and ordering something) or not (such as browsing).
See who is doing those requests. For example, if Google is crawling your website, it is a very bad idea to go offline, unless the search rating doesn't matter for you. On the other hand, if a bot is trying for two hours to crack your website by doing requests to different pages, you can go offline for sure.
Note: if a website has some critical areas (for example, writing this long answer, I would be angry if Stack Overflow went offline a few seconds before I submit it), you can also send regular AJAX requests to the server while the user stays on the page. Of course, you must be careful when implementing such a feature, and take into account that it will increase the bandwidth used and will not work if the user has JavaScript disabled.
You can run the netstat command and see how many active connections exist on your website's ports.
Default port for http is *:80.
Default port for https is *:443.
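To turn the netstat idea into a number, something like the following Python sketch could count established connections on those two ports. It assumes the common `netstat -an` layout in which the local address appears two columns before the state column and uses `:` before the port (as on Windows and Linux); other platforms format addresses differently, so treat the parsing as an assumption. The function names are made up for this example.

```python
import subprocess


def count_established(netstat_output, ports=(80, 443)):
    """Count ESTABLISHED connections whose local port is one of `ports`."""
    wanted = {str(p) for p in ports}
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        if "ESTABLISHED" not in fields:
            continue
        state_index = fields.index("ESTABLISHED")
        if state_index < 2:
            continue  # not a connection row in the assumed layout
        # Assumed layout: ... <local address> <foreign address> <state>
        local_address = fields[state_index - 2]
        port = local_address.rsplit(":", 1)[-1]
        if port in wanted:
            count += 1
    return count


def count_live_connections(ports=(80, 443)):
    """Run `netstat -an` on this machine and count matching connections."""
    result = subprocess.run(["netstat", "-an"], capture_output=True, text=True, check=True)
    return count_established(result.stdout, ports)
```

Keep in mind that one browser may hold several connections open, and idle keep-alive connections are counted too, so this is an upper-bound hint rather than a user count.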
I was wondering if it is even possible to interact with other websites using my own.
Here is the scenario:
Let's say I have a Lockerz account, which is a place where you do daily tasks to earn points. Once a month you can redeem those points to get prizes such as an iPod, MacBook, or other items. I know that sounds ridiculous, but stay with me.
For someone to gain membership to this website they must be invited by a member. So I get your email address then log in to my account, then send you an invite from there.
What I want to do is create a website where a user enters their email into a textbox and presses a submit button. From there the program, behind the scenes, sends my login information and the user's email address to Lockerz and sends the invite, all without ever leaving my site.
I have worked with ASP.NET with VB code-behind for a while now, so I understand the basics of that. I am just wondering if what I want to do is even possible. If so, can someone point me to a tutorial or guide of some kind that will give me basic knowledge of this?
Thanks
You'll have to work down at the HTTP level, sending POST and GET requests.
Fortunately, .NET has the WebRequest and WebClient classes to help you.
WebClient would probably be your best starting point... But I would hang on a second.
Websites like this tend to employ some pretty intense fraud protection: banning, blocking, or at least ignoring actions when multiple accounts use one IP or otherwise act in a predictable pattern.
WebClient isn't going to run the JavaScript either, so you might not be able to access required parts of the page.
Either way, you don't need to do this on your web server. I'd start off by writing the initial connection code locally as a simple script. It'll make testing a lot faster.
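For a feel of what "working at the HTTP level" in a simple local script looks like, here is a minimal sketch in Python using only the standard library (the .NET WebClient equivalent would follow the same shape). The URL and the `email` field name are hypothetical placeholders, not the real site's form; you would need to inspect the actual form to find them.

```python
from urllib import parse, request

# Hypothetical endpoint and field name; the real site's form will differ.
INVITE_URL = "https://example.com/invite"


def build_invite_request(email: str) -> request.Request:
    """Build (but do not send) a form-encoded POST request."""
    body = parse.urlencode({"email": email}).encode("ascii")
    return request.Request(
        INVITE_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send(req: request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with request.urlopen(req, timeout=10) as response:
        return response.status
```

Separating building from sending lets you print and inspect the request locally before anything hits the network. A real login flow would also need cookie handling, which this sketch leaves out.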
Is it possible to embed an external application inside the browser (IE, Chrome, Safari, Firefox) so that it looks like a native web application while actually having access to the USB ports of the client machine? I have heard that I need to make an ActiveX control. I would like to use the .NET Framework, but if that is not possible, Java or C++ would be fine.
I have to make an application that allows users to connect an external device to a USB port. This device will take a backup of the information contained in a SIM card and send it to the user's online agenda account, so the user can restore it later using the same application. This should be a web application, or at least look like one.
If the first option is not possible, is there any way to launch an external application from all the browsers, and then pass information to the browser window to allow it to refresh after the backup has been made?
Thanks for your help in advance.
First off, this seems to be a big security issue, which is why you might be finding it tricky.
What I would do is look at it from a different angle; what am I trying to achieve? How is the user going to use the data? Where is the user going to use the data?
From your question I have answered those questions with the following; I hope I've not misinterpreted anything.
I want to copy the data from an external sim card to a central location
I want the user to see this data from the central location, preferably from a web application.
The user is going to see and use the data from the web app
Assuming all of these things are true; one design option is the following:
1 - Have a client-based application which can read data from the USB device
2 - Have a secure web service to which the client-based application can upload the data
3 - Have a web application which can view this data and see refreshes
Let me go into a bit more detail for each step.
1 - If you write a small client application, it is installed, or at least runs, on the client computer. Because of this it can access local client resources such as USB and interface with them. This means users can read the SIM data through this app, and potentially save it locally as well as upload it. To access the web service they would enter their username/password so you could authenticate them for the upload.
2 - This web service would authenticate the client application and also receive the data it submits. Accessing web services from .NET nowadays is really straightforward. Using this web service, the client application could also check that the data has been updated, and it could handle retries if the network dropped, etc.
3 - The web front end of the system would interface with the same data source. This site would take the username/password to authenticate users, and also let them see the uploaded data. As for the refreshes: if the user is logged in and looking at the data, you could have a JavaScript timer polling an action/service to see when new records have been added. This could then display a message through jQuery or similar to notify the user, much like the notifications Stack Overflow gives when you visit for the first time or get a new badge.
Hope this helps :-)
I am looking into different ways to handle updating an ASP.NET application across many different clients, and looking for suggestions from your previous experience.
We need the client apps to check if they have any available updates.
A way to auto-update (if possible, something similar to Chrome's, but for a web app).
Some way to check that we are the ones sending the updates. (Checksum of some sort I would guess)
Any other tips/advice
Thanks
Edit: after thinking more about this overnight, I would have to agree that auto updates may not be the best. However, maybe something more along the lines of how WordPress does it: WordPress displays a notice that an update is available, and clicking it updates the system automatically.
I would absolutely not have your application auto-update on clients' servers (assuming you mean clients are entities external to your organization). We would immediately stop using a product that would "phone home" and update itself. Clients need to be able to choose when and how an update on their server occurs.
If you are going to do this, the easiest way would be to set up a URI that the systems could ping, say once a day, to see whether updates are available. If so, the application would pull the update down from the host system and update itself. If you do it over SSL, the certificate would verify that the URI being hit belongs to your company.
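A minimal sketch of the two checks involved, written in Python for brevity: comparing the installed version with the one advertised at the update URI, and verifying a downloaded package against a published SHA-256 checksum. The dotted-number version format and the function names are assumptions for illustration; the actual download over HTTPS is left out.

```python
import hashlib
import hmac


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '1.10.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def update_available(installed: str, latest: str) -> bool:
    """Return True when the advertised version is newer than the installed one."""
    return parse_version(latest) > parse_version(installed)


def verify_package(package_bytes: bytes, expected_sha256: str) -> bool:
    """Check the downloaded bytes against the published SHA-256 hex digest."""
    actual = hashlib.sha256(package_bytes).hexdigest()
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

Note that a checksum served from the same place as the package mainly catches corrupted downloads. The assurance about who sent the update comes from the SSL certificate, as described above.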
Having a hard time figuring out if you're actually talking about a web application or a desktop application. If you're trying to do something similar to Chrome...I'm guessing a desktop application. If that's the case...check out ClickOnce deployment.
It offers the first three bullet points you mention:
Every time a person runs the app, it will check for updates.
If updates are found, the user can choose to install them or not (better user experience than forcing the update on the user).
The application always checks the URL that the app was installed from...which in your case would be your servers.