Indefinite Provisioning of EMR Cluster with Segue in R

I am trying to use the R package called Segue by JD Long, which a book I read called "Parallel R" lauds as the ultimate in simplicity for using R with AWS.
However, for the second day in a row I have run into a problem where I initiate the creation of a cluster and it just says STARTING indefinitely.
I tried this on OS X and on Linux, with clusters of sizes 2, 6, 10, 20, and 25, and let them all run for at least 6 hours. I have no problem starting a cluster in the AWS EMR Management Console, though I have no clue how to connect Segue/R to a cluster that was started in the Management Console instead of via createCluster().
So my question is: is there either some way to troubleshoot the provisioning of the cluster, or a way to bypass the problem by creating the cluster manually and somehow getting Segue to work with it?
Here's an example of what I'm seeing:
library(segue)
Loading required package: rJava
Loading required package: caTools
Segue did not find your AWS credentials. Please run the setCredentials() function.
setCredentials("xxx", "xxx")
emr.handle <- createCluster(numInstances=10)
STARTING - 2013-07-12 10:36:44
STARTING - 2013-07-12 10:37:15
STARTING - 2013-07-12 10:37:46
STARTING - 2013-07-12 10:38:17
.... this goes on for hours and hours and hours...
UPDATE: After 36 hours and many failed attempts, this began working (seemingly at random) when I tried it with 1 node. I then tried it with 10 nodes and it worked great. To my knowledge nothing changed locally or on AWS...

I am answering my own question on behalf of the AWS support rep who gave me the following belated explanation:
The problem with the EMR creation is with the Availability Zone specified (us-east-1c): this availability zone is now constrained and doesn't allow the creation of new instances, so the job was trying to create the instances in an infinite loop.
You can see information about constrained AZs here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-regions-availability-zones
"As Availability Zones grow over time, our ability to expand them can become constrained. If this happens, we might restrict you from launching an instance in a constrained Availability Zone unless you already have an instance in that Availability Zone. Eventually, we might also remove the constrained Availability Zone from the list of Availability Zones for new customers. Therefore, your account might have a different number of available Availability Zones in a region than another account."
So you need to specify another AZ, or, what I recommend, not specify any AZ at all, so that EMR is able to select any available one.
I found this thread on Google Groups, where the topic of availability zones came up before: https://groups.google.com/forum/#!topic/segue-r/GBd15jsFXkY
The zone that was set as the new default in that thread is the zone that was causing problems for me. I am attempting to edit the source of Segue; a sketch of the workaround is below.
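If the thread is right that Segue hardcodes a default availability zone, the least invasive fix may be to pass a different zone when creating the cluster, rather than patching the package. This is only a sketch: I am assuming createCluster() exposes the zone through a location argument, as the default-zone discussion in that thread suggests, and us-east-1d is just an example of a non-constrained zone:

library(segue)
setCredentials("xxx", "xxx")

# Assumed: 'location' names the availability zone and defaults to the
# constrained us-east-1c; override it with any zone that still accepts
# new instances.
emr.handle <- createCluster(numInstances = 10, location = "us-east-1d")

# ... run jobs via emrlapply() here ...

stopCluster(emr.handle)  # shut the cluster down when finished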

Jason, I'm the author of Segue, so maybe I can help.
Please look under the details section in the lower part of the AWS console and see if you can determine whether the bootstrap sequences completed. This is an odd problem, because typically an error at this stage is pervasive across all users. However, I can't reproduce this one.

Related

Azure Machine Learning throws error "Invalid graph: You have invalid compute target(s) in node(s)" while running the pipeline

I am facing a strange issue with the Azure Machine Learning (Preview) interface.
I designed a training pipeline, which started fine on a certain compute node (a 2-node cluster with minimal configuration). However, it took a lot of time to execute, so I tried to create a new training cluster (an 8-node cluster with a higher configuration). During this process I created and deleted several training clusters.
But strangely, ever since then, submitting the pipeline fails with the error "Invalid graph: You have invalid compute target(s) in node(s)".
Could you please advise on this situation?
Thanks,
Mitul
I bet this was pretty frustrating. A common debugging strategy of mine is to delete compute targets and create new ones. Perhaps this was another "transient" error?
The issue has been fixed and the fix will be rolled out soon. Meanwhile, as a temporary workaround, you can refresh the page to make it work.

bokeh sample data download fail with 'HTTPError: HTTP Error 403: Forbidden'

When trying to download the bokeh sample data following the instructions at https://docs.bokeh.org/en/latest/docs/installation.html#sample-data, it fails with HTTP Error 403: Forbidden.
In a conda prompt:
bokeh sampledata (failed)
In a Jupyter notebook:
import bokeh.sampledata
bokeh.sampledata.download() (failed)
TL;DR: you will either need to upgrade to Bokeh version 1.3 or later, or else manually edit the bokeh.util.sampledata module to use the new CDN location http://sampledata.bokeh.org. You can see the exact change to make in PR #9075.
The bokeh.sampledata module originally pulled data directly from a public AWS S3 bucket location hardcoded in the module. This was a poor choice that left open the possibility for abuse, and in late 2019 an incident finally happened: someone (intentionally or unintentionally) downloaded the entire dataset tens of thousands of times over a three-day period, incurring a significant monetary cost. (Fortunately, AWS saw fit to award us a credit to cover this anomaly.) Starting in version 1.3, sample data is only accessed from a proper CDN with a much better cost structure, and all public direct access to the original S3 bucket was removed. This change had the unfortunate effect of immediately breaking bokeh.sampledata for all previous Bokeh versions; however, as an open-source project we simply cannot afford the real (and potentially unlimited) financial risk exposure.

How to display the logged information on an aerospike server as a graph?

I would like to log some stats on some Aerospike nodes and analyse them.
I found that Aerospike comes with a tool called asgraphite, which seems to be using a forked version of the
The asgraphite integration guide mentions some commands which are supposed to, e.g., start logging. I can already run the following command on my node and see the expected output, so it looks like I am all set to start logging:
python /opt/aerospike/bin/asgraphite --help
However, I don't see any information on how to monitor the data logged this way. I am expecting the kind of web interface that graphite usually provides.
By the way, we are running the community edition, which, it seems, does not provide historical latency stats in the AMC dashboard.

Googleway timeout

I'm having trouble with the googleway package in R.
I am attempting to get driving distances for 159,000 records.
I am using a paid Google Cloud account and have set all quotas to unlimited. I've attempted to use both server keys and browser keys.
After multiple attempts the service returns a timeout message:
Error in open.connection(con, "rb") : Timeout was reached
Each attempt successfully returned x results before timing out:
1) x ≈ 5,000
2) x ≈ 7,000
3) x ≈ 3,000
4) x ≈ 12,000
All were tried on different days. As you can see, none of these is anywhere near the 100,000/day quota.
We've checked firewall rules and made sure that the cause of the timeout is not at our end. For some reason the Google API service is cutting off the requests.
We have had no response from Google, and since we are currently on the bronze support package we don't get any real support from them as a matter of course. The creator of the googleway package is certain that there are no impediments coming from the package.
We're hoping there is someone out there who may know why this is happening and how we could avoid it, to enable us to run the distance matrix over our full list of addresses.
Using R version 3.3.0 ("Supposedly Educational").
Using the googleway package.
CHARSET=cp1252
DISPLAY=:0
FP_NO_HOST_CHECK=NO
GFORTRAN_STDERR_UNIT=-1
GFORTRAN_STDOUT_UNIT=-1
NUMBER_OF_PROCESSORS=4
OS=Windows_NT
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
PROCESSOR_LEVEL=6
PROCESSOR_REVISION=3c03
R_ARCH=/x64
R_COMPILED_BY=gcc 4.9.3
RS_LOCAL_PEER=\\.\pipe\37894-rsession
RSTUDIO=1
RSTUDIO_SESSION_PORT=37894
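In case it helps: since the cutoff happens after a few thousand requests, one workaround is to send the records in small chunks, persist each chunk's results as they arrive, and retry on timeout with a backoff. This is only a sketch, not the asker's actual code: it assumes the calls go through googleway::google_distance() one origin/destination pair at a time, and origins and destinations are hypothetical parallel vectors of addresses:

library(googleway)

key <- "your-api-key"  # placeholder

# Retry a single request a few times, backing off after each timeout.
get_distance_safely <- function(origin, destination, max_retries = 3) {
  for (attempt in seq_len(max_retries)) {
    result <- tryCatch(
      google_distance(origins = origin, destinations = destination,
                      mode = "driving", key = key),
      error = function(e) NULL  # "Timeout was reached" lands here
    )
    if (!is.null(result)) return(result)
    Sys.sleep(2 ^ attempt)  # back off before retrying
  }
  NULL  # give up on this pair after max_retries attempts
}

# Process the full list in chunks, saving each chunk to disk so a
# timeout never loses work that has already completed.
chunk_size <- 1000
for (start in seq(1, length(origins), by = chunk_size)) {
  idx <- start:min(start + chunk_size - 1, length(origins))
  chunk <- Map(get_distance_safely, origins[idx], destinations[idx])
  saveRDS(chunk, sprintf("distances_%06d.rds", start))
}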
I have developed a different implementation connecting Google Maps and R:
install.packages("gmapsdistance")
You can try this one. However, take into account that in addition to the daily limits, there are limits on the number of queries even if you have a premium account (625 per request, 1,000 per second on the server side, etc.). I think this might be the issue:
https://developers.google.com/maps/documentation/distance-matrix/usage-limits
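A minimal usage example, close to the one in the package documentation (set.api.key() is how the package registers a key, if I recall its API correctly):

library(gmapsdistance)

# set.api.key("your-api-key")  # optional; needed for keyed/premium usage

# '+' stands in for spaces in the API's address format. The result is a
# list with Time (seconds), Distance (meters) and Status fields.
results <- gmapsdistance(origin = "Washington+DC",
                         destination = "New+York+City+NY",
                         mode = "driving")
results$Time
results$Distance
results$Status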

Openstack Horizon Fetch Instances

I would like to fetch all instances and calculate the uptime of the vCPU, RAM, etc.
Upon checking the existing Horizon code, which sits at openstack_dashboard/dashboard/usage:
usage = api.nova.usage_get(self.request, self.tenant_id, start, end)
I have searched the internet for documentation on it, but I have been unlucky enough not to find any.
I would like to know what should be in the arguments start and end.
Thanks; I hope someone can lead me into this.
You can debug your Horizon with ./manage.py shell
