API caching with Symfony2

API caching with Symfony2 - symfony

How can I cache my API responses built with Symfony?
I started to dig into FosCacheBundle and the SymfonyHttpCache, but I'm not sure about my usecase.
You can only access the API with a token in header and every users get the same data in their response for the same URL called (and with the same GET parameters).
I would like to have a cache entry for each of my URL (including get parameters)
and also, is it possible to reorder my GET parameters before the request is processed by my cache system (so that the system dont create multiple cache entries for "URL?foo=bar&foz=baz" and "URL?foz=baz&foo=bar" which returns the same data)

Well there are multiple ways.
But the simplest is this:
If the biggest problem is database access than just caching the compiled result in memcache or similar will go a long way. Plus, this way you stick to your already working authentication.
In your current controller action, after authentication and before payload creation check if there's an entry in memchache. If no, build the payload and save it into memcache than return it. When next request comes along there will be no DB access as it will be returned from memcache. Just don't forget to refresh the cache how ever often you need.
Note:
"Early optimization is the root of all evil" and "Servers are cheaper than programmer hours" are to things to keep in mind. Don't complicate your life with really advanced caching methods if you don't need to.

Related

How to cache an api response?

I'm using the api http://exchangeratesapi.io/ to get exchange rates.
Their site asks:
Please cache results whenever possible this will allow us to keep the service without any rate limits or api key requirements.
-source
Then I found this:
By default, the responses all of the requests to the exchangeratesapi.io API are cached. This allows for significant performance improvements and reduced bandwidth from your server.
-somebody's project on github, not sure if accurate
I've never cached something before and these two statements confuse me. When the API's site says to "please cache the results", it sounds like caching is something I can do in a fetch request, or somehow on the frontend. For example, some way to store the results in local storage or something. But I couldn't find anything about how to do this. I only found resources on how to force a response NOT to cache.
The second quote makes it sound like caching is something the API does itself on their servers, since they set the response to cache automatically.
How can I cache the results like the api site asks?

To clear your confusion on the conflicting statements you're referencing:
Caching just means to store the data. Examples of where the data can be stored are in memory, in some persistence layer (like Redis), or in the browser's local storage (like you mentioned). The intent behind caching can be to serve the data faster (compared to getting it from the primary data source) for future requests/fetches, and/or to save on costs for getting the same data repeatedly, among others.
For your case, the http://exchangeratesapi.io/ API is advising consumers to cache the results on their side (as you mentioned in your question, this can be in the browser's local storage, if you're calling the API front front-end code, or stored in memory or other caching mechanisms/structures on the server-side application code calling the API) to that they can avoid the need to introduce rate limiting.
The project from Github you're referencing, Laravel Exchange Rates, appears to be a PHP wrapper around the original API - so it's like a middleman between the API and a developer's PHP code. The intent is to make it easier to use the API from within PHP code, and avoid having to make raw HTTP requests to the API and avoid processing the responses; the Laravel Exchange Rates handles that for the developer.
In regards to the
By default, the responses all of the requests to the exchangeratesapi.io API are cached
statement you're asking about, it seems the library follows the advice of the API, and caches the results from the source API.
So, to sum up:
http://exchangeratesapi.io/ is the source API, and it advises consumers to cache results. If your code is going to be calling this API, you can cache the results in your own code.
The Laravel Exchange Rates PHP library is a wrapper around that source API, and does cache the results from the source API for the user. If you're using this library, you don't need to further cache.

AWS Lambda, Caching {proxy+}

Simple ASP.Net AWS Lambda is uploaded and functioning with several gets like:
{proxy+}
api/foo/bar?filter=value
api/foo/barlist?limit=value
with paths tested in Postman as:
//#####.execute-api.us-west-2.amazonaws.com/Prod/{proxy+}
Now want to enable API caching but when doing so only the first api call gets cached and all other calls now incorrectly return the first cached value.
ie //#####.execute-api.us-west-2.amazonaws.com/Prod/api/foo/bar?filter=value == //#####.execute-api.us-west-2.amazonaws.com/Prod/api/foo/barlist?limit=value; In terms of the cache these are return the same but shouldn't be.
How do you setup the caching in APIGateway to correctly see these as different requests per both path and query?

I believe you can't use {proxy+} because that is a resource/integration itself and that is where the caching is getting applied. Or you can (because you can cache any integration), but you get the result you're getting.
Note: I'll use the word "resource" a lot because I think of each item in API Gateway as the item in question, but I believe technically AWS documentation will say "integration" because it's not just the resource but the actual integration on said resource...And said resource has an integration and parameters or what I'll go on to call query string parameters. Apologies to the terminology police.
Put another way, if you had two resources: GET foo/bar and GET foo/barlist then you'd be able to set caching on either or both. It is at this resource based level that caching exists (don't think so much as the final URL path, but the actual resource configured in API Gateway). It doesn't know to break {proxy+} out into an unlimited number of paths unfortunately. Actually it's method plus resource. So I believe you could have different cached results for GET /path and POST /path.
However. You can also choose the integration parameters as cache keys. This would mean that ?filter=value and ?limit=value would be two different cache keys with two different cached responses.
Should foo/bar and foo/barlist have the same query string parameters (and you're still using {proxy+}) then you'll run into that duplicate issue again.
So you may wish to do foo?action=bar&filter=value and foo?action=barlist&filter=value in that case.
You'll need to configure this of course, for each query string parameter. So that may also start to diminish the ease of {proxy+} catch all. Terraform.io is your friend.
This is something I wish was a bit more automatic/smarter as well. I use {proxy+} a lot and it really creates challenges for using their caching.

Handle ActionResults as cachable, "static content" in ASP.NET MVC (4)

I have a couple of ActionMethods that returns content from the database that is not changing very often (eg.: a polygon list of available ZIP-Areas, returned as json; changes twice per year).
I know, there is the [OutputCache(...)] Attribute, but this has some disadvantages (a long time client-side caching is not good; if the server/iis/process gets restartet the server-side cache also stopps)
What i want is, that MVC stores the result in the file system, calculates the hash, and if the hash hasn't changed - it returns a HTTP Status Code 304 --> like it is done with images by default.
Does anybody know a solution for that?

I think it's a bad idea to try to cache data on the file system because:
It is not going to be much faster to read your data from file system than getting it from database, even if you have it already in the json format.
You are going to add a lot of logic to calculate and compare the hash. Also to read data from a file. It means new bugs, more complexity.
If I were you I would keep it as simple as possible. Store you data in the Application container. Yes, you will have to reload it every time the application starts but it should not be a problem at all as application is not supposed to be restarted often. Also consider using some distributed cache like App Fabric if you have a web farm in order not to come up with different data in the Application containers on different servers.
And one more important note. Caching means really fast access and you can't achieve it with file system or database storage this is a memory storage you should consider.

Cache data returned by stored procedures?

I have an Asp.Net MVC 3 site. The following is the call stack
Web page/jQuery: $(document).Ready(.... Ajax calls... render the page...)
=> MVC Control methods
=> Entity framework 4.1
=> mapped store procedures (SQL Server 2008)
Question:
Where is the best place to implement cache?
How to let the page know that the underline SQL server tables have been updated?

Not sure about the "best" way to do it but one way to do it would be to have an MVC controller action which calls to the db to check and see if the data has been updated. (You can do it by time-stamp.)
The resulting function will then retreive the data from cache or from the server.
http://davidwalsh.name/cache-ajax
The only interesting thing to note however; is that you should make sure that the call to first find out if you can use cached content is faster than not caching content at all.

Try to add caching as close to the source as possible. This way more of your app could gain benefits from the improved speed.
If you control the code that is modifying the underlying tables you could invalidate the cache from there. You could also place a short timeout on your cache. If its a heavily used query caching it only a second could increase speed many fold. Make sure to test the performance gain so that you can tweak timeouts.

For question #2, you may want to look into Query Notifications. Setting everything up is a bit complicated, but that will enable you to do things such as caching until the data in your database has been updated.

One way is to cache rendered views some specified time.
Let's say that you have page that is not updated often. So instead of hitting database on every visit you can store rendered view in cache. This is achieved using OutputCaching - http://www.asp.net/mvc/tutorials/improving-performance-with-output-caching-cs.
Another way could be to store data.
Here again You can cache it for some specified time. In ASP.NET (MVC) it can be achieved using Cache object - http://msdn.microsoft.com/en-us/library/aa478965.aspx.
Cache object let's you specify how long data is to be cached when You put it in cache. For example:
Cache.Insert("key",
myTimeSensitiveData,
null,
DateTime.Now.AddMinutes(1),
TimeSpan.Zero);
Or you can cache until it is 'invalidated'.
Say you have GetCustomers and UpdateCustomer methods. In GetCustomers you check if data is in Cache. If not you hit the database, put it in cache and return. It is in cache until someone calls UpdateCustomer. In that method you write modified customer to database and invalidate data stored in Cache. You can just remove it. That way when GetCustomers is called again it will hit the database and populate Cache again. But remember that Cache has global scope and is accessible for many threads at the same time. You will need some synchronization code around access to Cache.

Stop Direct Page Calls to Ajax Pages

Is there a "clever" way of stopping direct page calls in ASP.NET? (Page functionality, not the page itself)
By clever, I mean not having to add in hashes between pages to stop AJAX pages being called directly. In a nutshell, this is stopping users from accessing the Ajax pages without it coming from one of your websites pages in a legitimate way. I understand that nothing is impossible to break, I am simply interested in seeing what other interesting methods there are.
If not, is there any way that one could do it without using sessions/cookies?

Have a look at this question: Differentiating Between an AJAX Call / Browser Request
The best answer from the above question is to check for a requested-by or custom header.
Ultimately, your web server is receiving requests (including headers) of what the client sends you - all data that can be spoofed. If a user is determined, then any request can look like an AJAX request.
I can't think of an elegant method to prevent this (there are inelegant and probably non-perfect methods whereby you provide a hash of some sort of request counter between ajax and non-ajax requests).
Can I ask why your application is so sensitive to "ajax" pages being called directly? Could you design around this?

You can check the Request headers to see if the call is initiated by AJAX Usually, you should find that x-requested-with has the value XMLHttpRequest. Or in the case of ASP.NET AJAX, check to see if ScriptMAnager.IsInAsyncPostBack == true. However, I'm not sure about preventing the request in the first place.

Have you looked into header authentication? If you only want your app to be able to make ajax calls to certain pages, you can require authentication for those pages...not sure if that helps you or not?
Basic Access Authentication
or the more secure
Digest Access Authentication
Another option would be to append some sort of identifier to your URL query string in your application before requesting the page, and have some sort of authentication method on the server side.

I don't think there is a way to do it without using a session. Even if you use an Http header, it is trivial for someone to create a request with the exact same headers.
Using session with ASP.NET Ajax requests is easy. You may run into some problems, like session expiration, but you should be able to find a solution.
With sessions you will be able to guarantee that only logged-in users can access the Ajax services. When servicing an Ajax request simply test that there is a valid session associated with it. Of course a logged-in user will be able to access the service directly. There is nothing you can do to avoid this.
If you are concerned that a logged-in user may try to contact the service directly in order to steal data, you can add a time limit to the service. For example do not allow the users to access the service more often than one minute at a time (or whatever rate else is needed for the application to work properly).
See what Google and Amazon are doing for their web services. They allow you to contact them directly (even providing APIs to do this), but they impose limits on how many requests you can make.

I do this in PHP by declaring a variable in a file that's included everywhere, and then check if that variable is set in the ajax call file.
This way, you can't directly call the file ever because that variable will never have been defined.

This is the "non-trivial" way, hence it's not too elegant.
The only real idea I can think of is to keep track of every link. (as in everything does a postback and then a response.redirect). In this way you could keep a static List<> or something of IP addresses(and possible browser ID and such) that say which pages are allowed to be accessed at the moment from that visitor.. along with a time out for them and such to keep them from going straight to a page 3 days from now.
I recommend rethinking your design to be sure that this is really needed though. And also note IPs and such can be spoofed.
Also if you follow this route be sure to read up about when static variables get disposed and such. You wouldn't want one of those annoying "your session has expired" messages when they have been using the site for 10 minutes.