How to use twitteR to request >150k follower IDs?

I would like to use the twitteR package for R to get the list of follower IDs of a user with more than 150k followers. Naïvely calling the getFollowerIDs() method of the user class fails because it exceeds the API's rate limit (Client Error (429)). Is there a way of circumventing this?

You can get the number of followers directly from the "followersCount" field of the user class. I'm not sure that solves your problem, but just try.

I am not sure how to do it if you're bound to using R. I absolutely love the twitteR package, but recently I have been shifting some of my code to Python through the use of modules like tweepy. By no means do I intend to suggest that one language is better than the other, but when going beyond simple tasks, I have found Python to be a tad more robust.
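That said, if you do have to stay in R, one generic workaround is to catch the 429 error and wait out Twitter's 15-minute rate-limit window before retrying. A rough sketch (the retry loop and the window length are my assumptions, not twitteR features):

library(twitteR)
# Assumes OAuth is already configured via setup_twitter_oauth().

getFollowerIDsWithRetry <- function(screen_name, n, max_retries = 10) {
  user <- getUser(screen_name)
  for (attempt in seq_len(max_retries)) {
    result <- tryCatch(user$getFollowerIDs(n = n),
                       error = function(e) e)
    if (!inherits(result, "error")) return(result)
    # A 429 means the rate limit was hit; Twitter's limits reset
    # every 15 minutes, so sleep out the window and try again.
    message("Rate limited on attempt ", attempt, "; sleeping 15 minutes")
    Sys.sleep(15 * 60)
  }
  stop("Still rate limited after ", max_retries, " attempts")
}

ids <- getFollowerIDsWithRetry("some_user", n = 150000)

Note that this only helps if a retried getFollowerIDs() call can finish within one window; if twitteR restarts its paging from scratch on every call, you would have to walk the followers/ids cursors yourself.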

Related

Fastest way to send multiple http requests in R

You can use multithreading in Python and send lots of HTTP requests, like in this SO question. My question is: is there any easy way to do this in R? I've seen a guide for RCurl here, but I'd prefer a simpler solution if possible. Currently I'm looping through a series of IDs; it'd be great to send all (or more) of them at once.
That guide to multiple requests in RCurl looks pretty simple; in fact, I'd say it looks simpler to me than the solution to the Python question you've linked. Better yet, the work is already done for you. Most of that guide goes into detail about the advantages of concurrent requests; the method itself is actually quite simple and is provided for you pre-cooked right at the top of the page.
You can literally cut and paste the code shown at the top of the post into an R script (include library(RCurl) above it), run that code to source the function, then call the function with a single line.
I won't paste the function code here, since you should get that from its author, but once you've sourced that function, their example usage is:
uris = c("http://www.omegahat.org/index.html", "http://www.omegahat.org/RecentActivities.html")
z <- getURIs(uris)
I just did the above on my own computer, and it works perfectly. I'd be surprised if you could find a simpler solution than that.
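For completeness, RCurl also ships a built-in helper that does the same job, getURIAsynchronous(); a minimal sketch, reusing the guide's example URIs:

library(RCurl)

uris <- c("http://www.omegahat.org/index.html",
          "http://www.omegahat.org/RecentActivities.html")

# Fires the requests concurrently and returns the response bodies
# as a character vector, in the same order as the input URIs.
pages <- getURIAsynchronous(uris)
nchar(pages)  # quick check that each page actually came back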

Use Julia to perform computations on a webpage

I was wondering if it is possible to use Julia to perform computations on a webpage in an automated way.
For example, suppose we have a 3x3 HTML form in which we input some numbers. These form a square matrix A, and finding its eigenvalues in Julia is pretty straightforward. I would like to use Julia to do the computation and then return the results.
In my understanding (which is limited in this direction) I guess the process should be something like:
collect the data entered in the form
send the data to a machine which has Julia installed
run the Julia code with the given data and store the result
send the result back to the webpage and show it.
Do you think something like this is possible? (I've seen some stuff using HttpServer, which allows computation with the browser, but I'm not sure it's the right thing to use.) If so, what do I need to look into? Do you have any examples of such web-calculation implementations?
If you are using or can use Node.js, you can use node-julia. It has some limitations, but should work fine for this.
Coincidentally, I was already mostly done with putting together an example that does this. A rough mockup is available here, which uses express to serve the pages and plotly to display results (among other node modules).
Another option would be to write the server itself in Julia using Mux.jl and skip server-side javascript entirely.
Yes, it can be done with HttpServer.jl
It's pretty simple - you make a small script that starts your HttpServer, which then listens on the designated port. Part of configuring the web server is defining handlers (functions) that are invoked when certain events take place in your app's life cycle (new request, error, etc.).
Here's a very simple official example:
https://github.com/JuliaWeb/HttpServer.jl/blob/master/examples/fibonacci.jl
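In outline, a handler looks like this (a sketch based on HttpServer.jl's README; the comments mark where your own form parsing and eigenvalue code would go):

using HttpServer

http = HttpHandler() do req::Request, res::Response
    # req.resource holds the requested path and req.data the raw
    # request body; for a POSTed form you would parse the nine
    # matrix entries out of req.data here, build the 3x3 matrix,
    # and call eig() on it before rendering the response.
    Response("Hello from Julia")
end

server = Server(http)
run(server, 8000)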
However, things can get complex fast:
you already need to perform 2 actions:
a. render your HTML page where you take the user input (by default)
b. render the response page as a consequence of receiving a POST request
you'll need to extract the data payload coming through the form. Data sent via GET is easy to reach, data sent via POST not so much.
if you expose this to users you need to set up some failsafe measures to respawn your server script - otherwise it might just crash and exit.
if you open your script to the world you must make sure that it's not vulnerable to attacks - you don't want to empower a hacker to execute arbitrary Julia code on your server or access your DB.
So for basic usage on a small case, yes, HttpServer.jl should be enough.
If however you expect a bigger project, you can give Genie a try (https://github.com/essenciary/Genie.jl). It's still a work in progress, but it handles most of the low-level work, allowing developers to focus on the specific app logic rather than on the transport layer (Genie's author here, btw).
If you get stuck there's GitHub issues and a Gitter channel.
Try Escher.jl.
This enables you to build up the web page in Julia.

Injecting code to track events on Delphi

I have a big, old application, written in Delphi 2007 and developed for over a decade now, and in order to rewrite it I intend to understand which parts/features are used most by the majority of its users.
The idea that came up is to track objects clicks and window creations to populate a log or analytics tool like Google Analytics or Deskmetrics with quantitative and qualitative data in order to help on the decision making.
To achieve that, I'm trying to figure out the easiest path given the current version's limitations. One of the possibilities I am exploring is implementing some generic code that could somehow be "injected/reflected" at the class level, so that all instantiated objects would, among other things, call a function with parameters that identify them; this function would then log that info using the best tool.
The only real solution so far is copying and pasting this function call into several thousand onClick/onCreate methods. I'd like to avoid that, and I'm open to all other possibilities that may come out of this thread.
Thanks!
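For illustration, one way to centralize the click tracking without touching every handler is an application-wide message hook: a TApplicationEvents component can intercept mouse-down messages before they are dispatched to individual controls. A rough sketch (LogUsage is a hypothetical placeholder for whatever analytics call you choose):

// Drop a TApplicationEvents component (unit AppEvnts) on the main
// form and attach this handler to its OnMessage event.
procedure TMainForm.ApplicationEvents1Message(var Msg: TMsg;
  var Handled: Boolean);
var
  Ctrl: TWinControl;
begin
  if Msg.message = WM_LBUTTONDOWN then
  begin
    Ctrl := FindControl(Msg.hwnd);
    if Assigned(Ctrl) then
      LogUsage(Ctrl.ClassName, Ctrl.Name);  // hypothetical logging sink
  end;
  Handled := False;  // let the message continue on its normal path
end;

Note that this only sees windowed (TWinControl) targets; clicks on graphic controls like TSpeedButton land on their parent window, so you may need extra hit-testing for those.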

Performance of large collection in Meteor 1.0.X

There has been a LOT of development in the Meteor world, and as such it's getting hard to find answers that work for current versions, given the plethora of answers you find for old, outdated versions.
I have an app that has a LOT of data in a particular collection. By a lot I mean somewhere between 10k and 100k documents, and potentially far more. Essentially it's log data, and I need to display the results in a table with no pagination (like a tail). In researching ways to optimize large collections, I keep running into things like this that seem to be for older versions of Meteor.
So, as I see it my options are:
Use the fast-render plugin to display the page prior to the subscription (at least this is my understanding of how it works).
Use some sort of progressive publish function, where it loads a limited, more relevant slice of the data first, then progressively loads the rest by expanding the window/limit (not sure whether this would put heavier load on the server, though). There seems to have been a "progressive publish" plugin, but it doesn't appear to be under active development any longer.
Optimize the lookups via indexing (how do you specify that when creating the collection?)
Profiling and optimizing the template further (not sure how).
Some other method I haven't thought of yet...
Some combination of all-the-above.
What is the proper approach by which to publish and render lots of data in this way?
I'm going to assume that "optimize" means reduced query time.
Always start with the biggest bang for your buck.
Unless you're publishing the entire collection, or querying on _id, you want to create an index using _ensureIndex. You can get more info on this on the MongoDB website or by searching other questions. http://docs.mongodb.org/manual/reference/method/db.collection.ensureIndex/
Second, limit the fields to just the info you need, e.g. {fields: {a: 1, b: 1}}. http://docs.meteor.com/#/full/fieldspecifiers
Third, don't sort.
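Putting the first two points together, a minimal server-side sketch (the collection, publication, and field names here are placeholders, not from your app):

// server-side code
Logs = new Mongo.Collection('logs');

Meteor.startup(function () {
  // _ensureIndex passes through to MongoDB's ensureIndex, so the
  // publication query below can use the index on createdAt.
  Logs._ensureIndex({ createdAt: 1 });
});

Meteor.publish('recentLogs', function (limit) {
  check(limit, Number);
  return Logs.find({}, {
    fields: { message: 1, createdAt: 1 },  // ship only what the table renders
    limit: limit
  });
});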
If this still isn't good enough, make another question with schema & query details & the desired UI so we can better understand the reactivity and why you can't use some form of pagination.

Recommended method to download tweets based on search terms and store

I would like to download tweets based on certain search terms. I'm aware of HTTP GET and such techniques, but I'm not sure the best way to create a simple executable that downloads the tweets and saves them for subsequent analysis.
Any ideas? I'm a basic programmer - if you say "use curl" I know roughly what you mean but not how to set up an application to run curl commands!
Hence my dilemma.
Thanks in advance!
You absolutely can do it in C# or any other language.
From a very rudimentary standpoint, the Twitter API wiki will tell you how, but I know that's not what you're really asking.
I would suggest getting familiar with a good API wrapper such as TweetSharp, which has methods not only for getting your typical timelines but also for search. The advantage of this (aside from not having to handle your own serialization, etc.) is that it unifies the timeline and search calls, as they are actually slightly different APIs.
The downside to this approach, though, is that you're not going to be able to translate it directly to a Mac, unless you write it using Silverlight.
The upside to this approach is that TweetSharp gives you a number of options for how it returns the data, which in turn gives you a number of options for how to save it.
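To make that concrete, here is a rough sketch against a later TweetSharp version (2.x); the keys and tokens are placeholders you would get from Twitter's developer portal, and the exact call shape may differ in older releases:

// Searches for tweets matching a term and appends them to a file
// for later analysis.
using System;
using System.IO;
using TweetSharp;

class TweetDownloader
{
    static void Main()
    {
        var service = new TwitterService("consumerKey", "consumerSecret");
        service.AuthenticateWith("accessToken", "accessTokenSecret");

        var results = service.Search(new SearchOptions { Q = "your search term", Count = 100 });
        foreach (var tweet in results.Statuses)
        {
            File.AppendAllLines("tweets.tsv",
                new[] { tweet.Id + "\t" + tweet.CreatedDate + "\t" + tweet.Text });
        }
    }
}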
