I am trying to implement the following design: read data from a file (XML) at server startup and keep it available as in-memory variables for the backend API to use in certain calculations. This data never changes, so it only needs to be read once.
I am getting a lot of "module not found" errors. From what I have read, fs functions should only be called on the server side, using getStaticProps.
But this will trigger the read request every time a client loads the page.
Can someone guide me with a simple example of how to do this, so that the data is read once and is usable in the back-end, server-side modules for calculations?
Thanks
I have a Riak installation with many nodes. It stores entries with relatively big blobs as values. In some cases, clients will need only a subset of this data, so there is no need to transfer the whole blob over the network.
Is there any elegant solution to pre-process this data on the server side, before actually sending it to the client?
The only idea I have is to install a small "agent" on every node that will talk to the client in its own protocol and act as a proxy, reducing the data based on the query.
But such a solution will work only if I can determine (based on the key) which node a particular entry is stored on. Is there a way to do that?
You may be able to do that with MapReduce, if you specify a single bucket/key as input. The map function would run local to where the data is stored, and send its result to the reduce function on whichever node received the request from the client. Since you are providing a specific key, there shouldn't be any coverage folding which is what causes the heavy load the docs warn about.
This would mean that your requests would effectively be using r=1, so if there is ever an outage you would get some false not found results.
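As a rough sketch of the single-key MapReduce idea above, the following Python snippet builds the JSON job that could be POSTed to Riak's HTTP /mapred endpoint. The bucket name, key, field names, and the helper build_subset_job are hypothetical. The map phase is JavaScript source that runs next to the data, and it assumes both that the stored value is a JSON object and that JavaScript MapReduce is enabled on the cluster.

```python
import json

def build_subset_job(bucket, key, fields):
    """Build a MapReduce job that returns only selected fields of one entry.

    Hypothetical helper: bucket, key, and fields are illustrative, and the
    stored value is assumed to be a JSON object.
    """
    # JavaScript map phase: parse the stored JSON and keep only the wanted fields.
    map_source = (
        "function(value, keyData, arg) {"
        "  var doc = Riak.mapValuesJson(value)[0];"
        "  var out = {};"
        "  for (var i = 0; i < arg.length; i++) { out[arg[i]] = doc[arg[i]]; }"
        "  return [out];"
        "}"
    )
    return {
        # A single bucket/key pair as input, rather than a whole bucket.
        "inputs": [[bucket, key]],
        "query": [
            {"map": {"language": "javascript", "source": map_source,
                     "arg": list(fields), "keep": True}}
        ],
    }

job = build_subset_job("blobs", "entry-42", ["title", "size"])
payload = json.dumps(job)  # request body, sent with Content-Type: application/json
print(payload)
```

Only the reduced object leaves the node that ran the map phase, which is the point of doing the filtering server-side.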
This probably could not possibly be a more basic HTTP question, but I am very new to web development and I do not even know the right question to ask (evidenced by the fact that googling has not helped).
What I have: an AWS server with an Elastic Beanstalk environment set up. I have successfully compiled, uploaded, and run a simple "Hello World" program to the environment using Eclipse.
What I want to do: pass the server a number via HTTP request and have the server give me back an HTTP response containing the square of that number. On the back end, I want a simple Java class to do the squaring. (Of course, the goal is to be able to pass more complicated data to the server and have more sophisticated Java code on the back end for processing.)
What I think I need to do: create a Java Servlet to listen for and process the request. I think (hope) the documentation is good enough that I can figure out the HttpServlet API, but I can't answer a more basic question: how do you pass an HTTP request containing some elementary data, like a number?
Thanks in advance!
You need to either GET, or POST (or PUT) your data. GET provides the data in the URL of the request, and will be displayed in the browser's address bar. POST data is provided as a separate request body.
http://www.w3schools.com/tags/ref_httpmethods.asp
A simple GET would look like this:
http://example.com/server?number=4
You can make a POST using a browser extension such as Postman:
https://chrome.google.com/webstore/detail/postman-rest-client/fdmmgilgnpjigdojojpjoooidkmcomcm?hl=en
Or you can do it from the command line using curl:
curl -X POST http://example.com/server -d'data'
Once the data is more complicated than a few variables, you probably want to use POST rather than GET. Also, you can start to think about what your requests are doing. GETs should only retrieve data from the server. If you modify or create data, then POST (or PUT) requests are the methods to use.
As your server becomes more complex, you probably want to start reading about REST.
http://en.wikipedia.org/wiki/Representational_state_transfer
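To make the GET/POST difference above concrete, here is a small Python sketch that builds both kinds of request for the same number without sending them. Python is used only for illustration; example.com is the placeholder URL from this answer, and square_from_query is a hypothetical stand-in for the server-side logic.

```python
from urllib.parse import urlencode, parse_qs, urlsplit
from urllib.request import Request

BASE_URL = "http://example.com/server"  # placeholder URL from the answer above
params = {"number": 4}

# GET: the data travels in the query string of the URL.
get_request = Request(BASE_URL + "?" + urlencode(params), method="GET")

# POST: the same data travels in the request body instead.
post_request = Request(
    BASE_URL,
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

def square_from_query(query_string):
    """Hypothetical server-side counterpart: read 'number' and square it."""
    number = int(parse_qs(query_string)["number"][0])
    return number * number

print(get_request.full_url)   # URL carrying the number
print(post_request.data)      # body carrying the number
print(square_from_query(urlsplit(get_request.full_url).query))
```

In a Java servlet the equivalent read would be request.getParameter("number"), which covers both the query string and a form-encoded POST body.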
I've built an OData endpoint using a generic .ashx handler to run some SQL against a SQL Server database and format the payload using ODataLib 5.6. It appears to work as expected and I can view the results in a browser and can import the data into Excel 2013 successfully using the From OData Data Feed option on the Data ribbon.
However, I've noticed that Excel is actually issuing two GET requests when inspecting the HTTP traffic in Fiddler. This is causing some performance concerns since the SQL statement is being issued twice and the XML feed is being sent across the wire twice. The request headers look identical in both requests. The data is not duplicated in Excel. Is there a way to prevent the endpoint from being called multiple times by Excel? I can provide a code snippet or the Fiddler trace if needed.
My suggestion would be to use Power Query for this instead of ADO.NET.
The reason for the "duplicated" calls is that ADO.NET cannot identify the shape of the data up front. It first requests the schema to learn the details of the data, and then fetches and interprets the actual data with the second call. The first call goes through the ADO.NET provider's GetSchema call, but that particular provider determines the schema by looking at the data.
I saw this previous post but I have not been able to adapt the answer to get my code to work.
I am trying to filter on the term bruins and need to reference cacert.pem for authentication on my Windows machine. Lastly, I have written a function to parse each response (my.function) and need to include this as well.
postForm("https://stream.twitter.com/1/statuses/sample.json",
userpwd="user:pass",
cainfo = "cacert.pem",
a = "bruins",
write=my.function)
I am looking to stay completely within R and unfortunately need to use Windows.
Simply, how can I include the search term(s) that I want such that the response is filtered?
Thanks in advance.
Alright, so I've looked at what you're doing, and some of what you're working on may be helped by examining the Twitter API methods, although it can be difficult to figure out how to translate some of the examples into R (via the RCurl Package).
What you're currently trying is very close to what you need to do; you simply need to change three things.
First of all, you're querying the url for the random sample of statuses. This url returns a random sample of roughly 1% of all tweets.
If you're interested in collecting only tweets about specific keywords, you want to use the filter API url: "https://stream.twitter.com/1/statuses/filter.json"
After changing that, you simply need to change your parameter from "a" to "postfields", and the parameter you'd be passing would look like: "track=bruins"
Finally, you should use the getURL function, to open a continuous stream, so all tweets with your keywords can be collected, rather than using the postForm command (which I believe is intended for HTML forms).
So your final function call should look like the following:
getURL("https://stream.twitter.com/1/statuses/filter.json",
userpwd="Username:Password",
cainfo = "cacert.pem",
write=my.function,
postfields="track=bruins")
For manipulating twitter, use the twitteR package.
library(twitteR)
searchTwitter("bruins")
You can include other parameters (like cainfo) in the call to searchTwitter, and they should get passed to getForm underneath.
I don't think the Streaming API is currently included in twitteR - the search api is different (it's backward looking, whereas streaming is "current looking").
From my understanding, streaming is quite different from how lots of APIs typically work; rather than pulling data from a web service and having a defined object returned, you're setting up a "pipe" for Twitter to push data to you, and you then listen for that data.
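A minimal Python sketch of the listening side of such a "pipe" (Python rather than R, purely for illustration): it reads a long-lived response line by line and hands each decoded message to a callback. The io.BytesIO object stands in for the open HTTP connection, the sample messages are made up, and the one-JSON-object-per-line framing is an assumption.

```python
import io
import json

def listen(stream, on_message):
    """Read newline-delimited JSON from an open stream and dispatch each message."""
    count = 0
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            # Blank lines are treated as keep-alives and skipped.
            continue
        on_message(json.loads(line))
        count += 1
    return count

# Stand-in for a long-lived HTTP response body (made-up sample data).
fake_stream = io.BytesIO(b'{"text": "go bruins"}\r\n\r\n{"text": "bruins win"}\r\n')

received = []
handled = listen(fake_stream, received.append)
print(handled, [m["text"] for m in received])
```

The callback plays the same role as the write=my.function argument in the R calls earlier in this thread: the caller never gets one finished object back, only a sequence of messages as they arrive.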
You also need to worry about OAuth I think (which twitteR does deal with).
Is there any reason that you want to stay in R? I've used Python successfully with the Streaming API and a package called tweepy to write data to a MySQL database, and then used R to query and analyse the data.
Last time I checked, twitteR did not talk to the streaming API. Moreover, as far as I know, very few publicly-available Twitter Streaming API connection libraries in any language honor Twitter's recommendations on reconnecting when Streaming disconnects / throws an error.
My recommendation is to access Streaming via a library that's actively maintained, write the re-connection protocol yourself if you have to, and persist the data into a database that handles JSON natively. I'm about to embark on a project of this nature and will be writing the collector in Perl, doing my own re-connect logic and persisting into either PostgreSQL or MongoDB. Most likely it will be MongoDB; PostgreSQL doesn't get native JSON till 9.2.
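A minimal Python sketch of the do-it-yourself reconnect logic mentioned above, using capped exponential backoff. The function names, the 5-second starting delay, and the 320-second cap are illustrative assumptions, not values taken from this thread; check Twitter's current reconnection guidance before relying on them.

```python
import time

def backoff_delays(initial=5.0, cap=320.0, factor=2.0):
    """Yield capped exponential backoff delays (all defaults are assumptions)."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)

def run_with_reconnect(connect, max_attempts=5, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping between failed attempts."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(next(delays))

# Demo with a fake connection that fails twice, then succeeds.
attempts = []

def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("stream dropped")
    return "connected"

waits = []  # record the delays instead of actually sleeping
result = run_with_reconnect(flaky_connect, sleep=waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the retry policy testable without real waiting; a real collector would leave the default time.sleep in place and reset the delay sequence after each successful connection.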
Late to the game, I know, but you'll want to use the "streamR" package for access to Twitter's streaming API.