Writing a cache-everything/quick-response HTTP proxy [closed]

Are there any open source HTTP caching proxies I can use to give myself a good starting point?
I want to write a personal HTTP caching proxy to achieve the following purposes
Serve content instantly even if the remote site is slow
Serve content even if the network is down
Allow me to read old content if I'd like to
Why do I want to do this?
The speed of Internet connection in my area is far from spectacular.
I want to cache content even if the HTTP headers tell me not to
I really don't like it when I can't quickly access content that I've read in the past.
I feel powerless when a website removes useful content and I can find no way to get it back
The project comprises
A proxy running on the local network (or perhaps on localhost), and
A browser plugin or a desktop program to show content-updated notifications
What's special about the proxy?
The browser initiates an HTTP request
The proxy serves the content first, if it's already in the cache
Then the proxy contacts the remote website and checks whether the content has been updated
If the content has been updated, the proxy sends a notification to the desktop/browser (e.g. to show a little popup or change the color of a plug-in icon) and downloads the content in the background (see the sketch after this list)
Every time the proxy downloads new content, it saves it to the cache
Let me choose whether to load the updated content (if not, stop downloading the new content; if yes, stream the new content to me)
Let me assign rules to always/never load fresh content from certain websites
Automatically set the rules if the proxy finds that (1) I always want to load fresh content from a certain website, or (2) the website's content frequently updates
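A minimal sketch of that serve-cached-then-revalidate flow, assuming a Python standard-library proxy; every name here (the in-memory CACHE, notify_desktop, the port) is illustrative rather than part of any existing project:
# Sketch only: serve the cached copy at once, then revalidate in the background.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

CACHE = {}  # url -> response body; in-memory stand-in for an on-disk store

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        url = self.path  # when used as a proxy, the request line carries the absolute URL
        cached = CACHE.get(url)
        if cached is not None:
            # Step 1: serve the cached content immediately, ignoring origin cache headers.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(cached)
            # Step 2: check the remote site in the background and notify on change.
            threading.Thread(target=self.revalidate, args=(url, cached), daemon=True).start()
        else:
            # Cache miss: fetch, store, serve.
            body = urllib.request.urlopen(url, timeout=30).read()
            CACHE[url] = body
            self.send_response(200)
            self.end_headers()
            self.wfile.write(body)

    def revalidate(self, url, old_body):
        try:
            new_body = urllib.request.urlopen(url, timeout=30).read()
        except OSError:
            return  # network down or remote site slow: keep serving the cached copy
        if new_body != old_body:
            CACHE[url] = new_body   # save every downloaded version
            notify_desktop(url)     # hypothetical hook for the popup / plug-in icon

def notify_desktop(url):
    print("Content updated:", url)  # placeholder for the notification channel

if __name__ == "__main__":
    ThreadingHTTPServer(("127.0.0.1", 8080), CachingProxy).serve_forever()
Pointing the browser's proxy setting at 127.0.0.1:8080 would exercise the flow; the real project would add persistent storage, version history, and the load-fresh-or-not rules on top.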
Note:
Caching everything does not pose a security problem, as I'm the only one with physical access to the proxy, and the proxy is only serving me (from the local network)
I think this is technologically feasible (let me know if you see any architectural problems)
I haven't decided whether I should keep old versions of the webpages. But given that my everyday bandwidth usage is just 1-2 GB, a cheap 1 TB hard drive can hold roughly one to two years of data.
Does my plan make sense? Any suggestions/objections/recommendations?

Take a look at polipo:
http://www.pps.univ-paris-diderot.fr/~jch/software/polipo/
Source is here:
https://github.com/jech/polipo
It is a caching web proxy implemented in C. It should definitely help you.

Related

Force HTTP1.1 instead of HTTP2 through Proxy (Charles) [closed]

Since we updated our clients to HTTP/2, I've had problems with mapping files to local resources. We normally use the Charles app to do this, but since the update we've been getting errors.
It seems to cut the files short and load only a tiny part of them. Charles then gives a failure message back saying:
Client closed connection before receiving entire response
I've been looking through the big interwebs for answers, but haven't been able to find any yet.
Hopefully there are some brilliant minds in here.
We have addressed this issue in Charles 4.1.2b2. Please try it out from https://www.charlesproxy.com/download/beta/
Please let me know if this does or doesn't correct the issue for you! We plan to roll out this build to release pretty soon, especially once we've had more users confirm the solution.
One workaround I've found is using the --disable-http2 flag when launching Chrome. On macOS the terminal command would be:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --disable-http2
On Windows you could alter your shortcut to launch with that --disable-http2 option.
As you said the problem appeared after the client was updated, have you double-checked everything related to possible client cache issues? (See here about the no-caching tool in Charles.)
You may use "Upgrade header" to force a change of http protocol version:
The Upgrade header field is a HTTP header field introduced in HTTP/1.1. In the exchange, the client begins by making a cleartext request, which is later upgraded to a newer http protocol version or switched to a different protocol. Connection upgrade must be requested by the client, if the server wants to enforce an upgrade it may send a 426 upgrade required response. The client can then send a new request with the appropriate upgrade headers while keeping the connection open.
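To show what the header mechanics look like on the wire, here is a small sketch using Python's http.client. It only demonstrates the offer: a real h2c upgrade also requires an HTTP2-Settings header, most servers will simply ignore the offer and answer over HTTP/1.1, and example.com is just a placeholder host.
import http.client

# Advertise willingness to upgrade; the server decides whether to honour it.
conn = http.client.HTTPConnection("example.com", 80)
conn.request("GET", "/", headers={
    "Connection": "Upgrade",
    "Upgrade": "h2c",   # offer cleartext HTTP/2 (incomplete: h2c also needs HTTP2-Settings)
})
resp = conn.getresponse()
print(resp.status, resp.reason)    # 101 Switching Protocols if the server accepts
print(resp.getheader("Upgrade"))   # which protocol the server switched to, if any
conn.close()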

How to speed up a .NET MVC site deployed on Azure? [closed]

Of course I want to reach maximum performance.
What can I do to achieve it?
Use bundles for CSS & JS files? OK.
What kind of storage should I use? Right now it's SQL Database.
But the site and the DB are placed in different regions. The DB won't be too big; 1 GB is enough. And how do I reduce query time? Right now it's too long.
Should I turn on the "Always On" feature for my site?
Is there anything else? Is there any article to read?
Thanks in advance.
There is only so much optimization you can do - if you really want "maximum performance" then you'd rewrite your site in C/C++ as a kext or driver-service and store all of your data in memcached, or maybe encode your entire website as a series of millions of individual high-frequency electronic logic-gates all etched into an integrated circuit and hooked-up directly to a network interface...
...now that we're on realistic terms ;) your posting has the main performance-issue culprit right there: your database and webserver are not local to each other, which is a problem: every webpage your users request is going to trigger a database request, and if the database is more than a few milliseconds away then it's going to cause problems (MSSQL Server has a rather chatty network protocol too, which multiplies the latency effect considerably).
Ideally, total page generation time from request-sent to response-arrived should be under 100ms before users will notice your site being "slow". Considering that a webserver might be 30ms or more from the client, that means you have approximately 50-60ms to generate the page, which means your database server has to be within 0-3ms of your webserver. Even 5ms latency is too great, because something as innocuous as 3-4 database queries will incur a delay of at least 4 * (5ms + DB read time), and DB read time can vary from 0ms (if the data is in memory) up to 20ms if it's on a slow platter drive, or even slower depending on server load. That's how you can easily find a "simple" website taking over 100ms just to generate on the server, let alone send to the client.
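To make that arithmetic concrete, a back-of-the-envelope estimate with illustrative numbers (every value below is an assumption, not a measurement):
# Rough latency budget from the reasoning above; all numbers are assumptions.
queries_per_page = 4
db_round_trip_ms = 5     # network latency between webserver and DB
db_read_ms = 10          # somewhere between in-memory (0ms) and slow platter (20ms)
server_budget_ms = 60    # rough page-generation budget before the site feels slow

db_time_ms = queries_per_page * (db_round_trip_ms + db_read_ms)
print(f"Database time alone: {db_time_ms} ms of a ~{server_budget_ms} ms budget")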
In short: move your DB to a server on the same local network as your webserver to reduce the latency.
The immediate and simplest step in your situation is to move the database and the site into the same datacenter.
Later you may think about:
INSTRUMENT YOUR CODE
Add (Azure Redis) Cache
Load-balance your web site (if it gets enough traffic)
And everything around compacting/bundling/minifying your code.
Hope it helps,

Difference between http:// and http://www [closed]

Can someone please explain the difference between http:// and http://www and the effects of using each of these?
I tried to google but I didn't get much insight. I looked for it on Stack Overflow too, but couldn't find anything helpful.
http://
is a protocol; it tells the client and server what type of connection is being made. Think of it as your browser saying to the server "I am about to send you an HTTP (HyperText Transfer Protocol) request". The server then knows how to "see" the request and respond.
Imagine I can speak 8 languages, but can't recognize them without being told. The http:// is the equivalent of saying to me
English: Hi Jon, how are you?
The English: tells me that you are about to speak English, and that I should respond in the same language.
The http:// simply tells the client and server how to talk to each other.
www
This is just a hostname within the domain
eg
www.website.com
In many (dare I say most?) cases, www.website.com is set up to point directly at website.com, but it could easily be set up to point somewhere else. This very rarely happens, because people have come to expect www. in the URL
This is the equivalent of running a business from your home, with several doors in your house: if someone comes in one door, they want your sweet shop. If they come in another, they are coming to your toy shop, and if they use the third, they are visiting your family for a cup of tea.
A typical domain will have multiple "hosts", often providing different services. These may be on one server, or each may have its own server.
eg
ftp.website.com
Points to the server which provides FTP services (file transfer protocol, for sending files)
smtp.website.com
Provides email services, and
www.website.com
Provides "World Wide Web" services (ie a website)
This was the original intention, when typically a company had one domain (company.com) and one example of each service. Nowadays, website use has expanded massively and we typically provide many more services/websites from one domain, so we use other sub-domains for websites. This is akin to telling you where to find a service: "Oh you want the pub? It's down the road and round the corner"
eg
customers.website.com
May be for a customer login portal, but is still a website. www.website.com has simply endured as the typical use for a "homepage" website, because it has become ingrained in most people's minds.
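As a quick illustration that www is just another hostname the domain owner points wherever they like, you can resolve both names and compare the results (Python standard library; example.com is a placeholder, and the two names may or may not resolve to the same address):
import socket

# Resolve the bare domain and the www host; whether they match depends entirely on the DNS setup.
for host in ("example.com", "www.example.com"):
    print(host, "->", socket.gethostbyname(host))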

Creating a P2P / Decentralized file sharing network [closed]

I was wondering where I could learn more about decentralized sharing and P2P networks. Ideally, I'd like to create something to help students share files with one another over their university's network, so they could share without fear of outside entities.
I'm not trying to build the next Napster here, just wondering if this idea is feasible. Are there any open source P2P networks out there that could be tweaked to do what I want?
Basically you need a server (well, you don't NEED a server, but it would make it much simpler) that would store user IPs among other things, like file hash lists, etc.
That server can be in any environment you want (which is very convenient).
Then, each client connects to the server (it should have a DNS name; it can be a free one, I've used no-ip.com once) and sends basic information first (such as its IP and a file hash list), then sends something every now and then (say every 5 minutes or less) to report that it's still reachable.
When a client searches for files/users, it just asks the server.
This is a centralized network, but the file sharing would be done in p2p client-to-client connections.
The reason to do it like this is that you can't know an IP to connect to without some reference.
Just to clear this server thing up:
- Torrents use trackers.
- eMule's ED2K uses lugdunum servers.
- eMule's "true p2p" Kademlia uses known nodes (clients) (most of the time taken from servers like this).
Tribler is what you are looking for!
It's a fully decentralized BitTorrent client from the Delft University of Technology. It's open source and written in Python, so it's also a great starting point for learning.
Use DC++
What is wrong with BitTorrent?
Edit: There is also a pre-built P2P network on Microsoft operating systems that is pretty cool as the basis to build something. http://technet.microsoft.com/en-us/network/bb545868.aspx

Is there a simple app for pinging a list of websites? [closed]

Basically, I just need a simple app that frequently pings external IP Addresses and web addresses to make sure the sites are up. Does anyone know of a good one of these?
I started to make one myself, but wanted to know if someone else has already done the work.
It just needs to track multiple external addresses with the status codes returned, at potentially different intervals.
I did see this post on "How do you monitor the availability of multiple websites", but it seems a little bit like overkill for what I need. I need a KISS app! Thanks!
Ok, second attempt. What about Website Monitor (seen in this list: Monitor and Check Web Site or Server Uptime and Availability for Free)? Your dog should be able to use it.
I'm not sure if this fits your needs but
http://aremysitesup.com/
May be a simple way to go.
The free version supports up to five sites.
This can be done with Cacti which is a great app. See:
Http Response Time monitoring and Alerting on the Cacti forums
How do you Monitor a https website and graph uptime/latency? on the Cacti forums
Cacti: Using Cacti to monitor web page loading blog post series
Use Cacti to Monitor HTTP Status Codes of Request Responses? here on SO
Unless you are the network admin of those sites, it is a colossal waste of resources, what I call ping-then-do.
Use the command prompt if you are on a Windows system.
Type in:
ping (website host name)
and then press Enter. It will ping the website and give you the time that the website took to respond, as well as the TTL.
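If a small script would do, the core of "poll a list of addresses at an interval and record the status codes" fits in a few lines of Python (illustrative only; the URLs and interval are placeholders). Unlike ICMP ping, this records the HTTP status code the question asks for.
import time
import urllib.request

SITES = ["https://example.com/", "https://example.org/"]  # placeholder addresses
INTERVAL_SECONDS = 300

while True:
    for url in SITES:
        try:
            status = urllib.request.urlopen(url, timeout=10).status  # HTTP status code
        except Exception as exc:
            status = f"DOWN ({exc})"                                 # unreachable or error
        print(time.strftime("%Y-%m-%d %H:%M:%S"), url, status)
    time.sleep(INTERVAL_SECONDS)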
