GremlinServerError: 499 - gremlin

I am running a Neptune server on AWS and making Gremlin queries against the database via an IPython cell magic in a Jupyter notebook. I have a number of traversals running, and I am getting an error that comes from aiogremlin in its resultset.py file: GremlinServerError: 499: {"requestId":"5bb1e6ea-49ec-4a1d-9364-2b1bf717df9c","code":"InvalidParameterException","detailedMessage":"The [eval] message contains 66 bindings which is more than is allowed by the server 64 configuration"}
How can I keep issuing queries against the server without this error popping up?

I believe there was a known issue with the client/magic you are using, and I don't think it has been updated in four years or so. I vaguely remember you could work around it by running something like %reset in the cell, but I think you would be better off using a different client that is regularly updated and supported.
You could instead use the Apache TinkerPop Gremlin Python client (pip install gremlinpython) or try the new Amazon Neptune Workbench, which offers a %%gremlin cell magic.
If you use the Gremlin Python client in a Jupyter notebook you can still issue queries in much the same way; you would just need to establish a connection to the server in a cell before issuing Python-based queries. There is a blog post that may be of interest [1], and stand-alone Python examples you could use to create a cell containing the imports and setup steps can be found here [2] and here [3]. In the samples, you would replace localhost with the DNS name of your Neptune endpoint. A minimal connection sketch is shown below.
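A minimal sketch, assuming gremlinpython is installed in the notebook kernel and your Neptune cluster accepts unauthenticated WebSocket connections from it; the endpoint string is a placeholder:

from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

# Placeholder endpoint; replace with your Neptune cluster's DNS name.
endpoint = 'wss://your-neptune-endpoint:8182/gremlin'
connection = DriverRemoteConnection(endpoint, 'g')
g = traversal().withRemote(connection)

# Traversals are now ordinary Python calls instead of cell-magic text.
print(g.V().limit(5).valueMap().toList())

connection.close()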
If you decide to try the new Neptune Workbench you can create one from the AWS Neptune Console web page.
[1] https://aws.amazon.com/blogs/database/let-me-graph-that-for-you-part-1-air-routes/
[2] https://github.com/krlawrence/graph/blob/master/sample-code/basic-client.py
[3] https://github.com/krlawrence/graph/blob/master/sample-code/glv-client.py

Related

Maxscale "Capability mismatch"

I did a fresh install of MaxScale and was trying to set up a read-write split service on a master-slave MariaDB cluster.
When I was trying to connect with DataGrip or DBeaver, I got the following error message:
[HY000][1927] Capability mismatch (bdd-master)
But when I use the mysql command line client, it works well.
Do you have any idea of what could be wrong?
MaxScale sends a Capability mismatch error when it detects that the client application requests a protocol capability that one of the backend databases cannot support. In general this should not happen, as MaxScale tries to mimic the backend database and calculates the capabilities so that these sorts of mismatches do not occur.
There are some known bugs that can cause this, both in MaxScale as well as old versions of MariaDB and MySQL. Upgrading to the latest possible version of MaxScale should help solve any problems you might see.
Additionally, you should disable the query cache in the database if you are using MySQL, as there is a bug in MySQL (and in old MariaDB versions as well) that causes this sort of problem to appear.
It seems that this is related to the router used (readwritesplit).
DataGrip sends this command when it initiates the connection:
set autocommit=1, session_track_schema=1, sql_mode = concat(@@sql_mode,',STRICT_TRANS_TABLES')
It seems that some of these parameters are not supported by readwritesplit.

How to see what manufacturer owns a MAC address range/prefix

I am looking for a way to programmatically get the name of the vendor that owns a MAC address within a block/range that they purchased. Preferably by querying some API or database, language agnostic. Or if there is some other way that applications do it that I am unaware of.
For example, running nmap -sn 192.168.1.0/24 with root privileges yields
...
Nmap scan report for 192.168.1.111
Host is up (0.35s latency).
MAC Address: B8:27:EB:96:E0:0E (Raspberry Pi Foundation)
...
... and that tells me that the Raspberry Pi Foundation "owns" that MAC Address, within the prefix range that they own: B8:27:EB.
However, I am not sure how nmap knows this, nor how I could find this out myself. Parsing nmap output is not an ideal solution for me. Here's what I found from digging online:
This Stack Overflow question references a site that appears to do this; however, it does not seem to have been updated since 2013, nor does it expose any API endpoints. Most notably, it does not have the newer block of MAC addresses that the Raspberry Pi Foundation reserved for their newer models (under Raspberry Pi Team, or something along those lines).
I found that the IEEE handles these registrations through their site; however, it appears to be for their customers, and I could not find an exposed endpoint for their search function.
On that same IEEE page linked above, it looks like I can get a CSV file of their entire database. However, that seems large and would have to be actively kept up to date. Does nmap come with a local database generated from those files?
If a public-facing API like the one I'm envisioning doesn't exist, I'll make one myself for fun. I'd first like to know whether I'm thinking about this wrong and whether there is an official, "canonical" way that I have not found. Any help would be appreciated, and thank you.
The maintainers of nmap keep a list of prefixes as part of the tool. You can see it here:
https://github.com/nmap/nmap/blob/master/nmap-mac-prefixes
They keep this up to date by periodically importing the public registry on this site:
https://regauth.standards.ieee.org/standards-ra-web/pub/view.html#registries
Note that those files are rate-limited, so you should not query the CSV files ad hoc as part of a software package; rather, you should do what nmap does and keep an internal list that you synchronize periodically.
I'm not aware of a publicly available tool to query them as an API; however, creating one that works the same way nmap does would be fairly trivial. nmap does not update that file more than once or twice a year, which suggests the list changes slowly enough that keeping your own copy would not be too onerous (you could even download nmap's list every so often). A minimal sketch of such a lookup follows.
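A minimal sketch in Python, assuming you have downloaded a local copy of nmap-mac-prefixes and that each data line is a six-hex-digit prefix followed by the vendor name:

# Minimal sketch: vendor lookup from a local copy of nmap-mac-prefixes.
def load_prefixes(path="nmap-mac-prefixes"):
    prefixes = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip comments and blank lines
            prefix, _, vendor = line.partition(" ")
            prefixes[prefix.upper()] = vendor
    return prefixes

def vendor_for(mac, prefixes):
    # Normalize "B8:27:EB:96:E0:0E" (or the dashed form) to its 24-bit OUI.
    oui = mac.replace(":", "").replace("-", "").upper()[:6]
    return prefixes.get(oui, "Unknown")

prefixes = load_prefixes()
print(vendor_for("B8:27:EB:96:E0:0E", prefixes))  # Raspberry Pi Foundation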

Why does Rexster show "titangraph[cassandra:null]" even though its connected?

When I connect to Titan via the Gremlin console it says... titangraph[cassandra:127.0.0.1]
Rexster however says... titangraph[cassandra:null] even though I can browse the same set of vertices.
Why is this? Rexster makes it look as though it hasn't managed to connect.
This message indicates that Cassandra did not start correctly.
Try starting Titan with the following:
titan.sh -c cassandra-es start
Have a look in the conf directory for additional configuration files.
Unless you really have to stay on an older version, I strongly suggest installing Titan 0.5.0, which comes with many useful features.
If you're starting out with graph databases and Titan, I suggest trying a single-machine cluster or BerkeleyDB as the storage backend. You may not need Cassandra yet.
You can also have a look at Titan/Aurelius official mailing list, I know the issue you're experiencing has been discussed there before: https://groups.google.com/forum/#!forum/aureliusgraphs. You can search for resources there (see, for example, https://groups.google.com/d/msg/aureliusgraphs/bviB6E5TZ-A/TJxQv0U7WQEJ).
In the meantime, you can try Titan 0.5.0 from Node.js by connecting over HTTP with https://github.com/gulthor/grex (an HTTP client). The recommended way of connecting to Rexster is via HTTP (TinkerPop 2.x) or WebSocket (in the upcoming TinkerPop 3, which Titan will support in a future version).
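For reference, talking to Rexster over HTTP needs no dedicated client at all. A minimal Python sketch, assuming Rexster is listening on its default HTTP port 8182 and the graph is exposed under the name graph:

# Minimal sketch: query Titan through Rexster's REST API over HTTP.
import requests

base = "http://localhost:8182/graphs/graph"  # assumed host, port, and graph name
resp = requests.get(base + "/vertices", timeout=10)
resp.raise_for_status()

for vertex in resp.json()["results"]:  # Rexster wraps payloads in a "results" list
    print(vertex)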

Machine's uptime in OpenStack

I would like to know (and retrieve via REST API) the uptime of individual VMs running in OpenStack.
I was quite surprised that the OpenStack web UI has a column called "Uptime", but it actually shows the time since the VM was created. If I stop the VM, the UI shows Status=Shutoff and Power State=Shutdown, but the Uptime keeps incrementing...
Is there a "real" uptime (I mean for a machine that is UP)?
Can I retrieve it somehow via the OpenStack's REST API?
I saw the comment at How can I get VM instance running time in openstack via python API?, but the page with the extension mentioned there no longer exists, and it looks to me like that extension will not be available in all OpenStack environments. I would like a standard way to retrieve the uptime.
Thanks.
(Version Havana)
I haven't seen any documentation saying this is the reason, but nova-scheduler doesn't differentiate between a running and a powered-off instance; that way your cloud can't be over-allocated, and an instance is never left in a state where it could not be powered back on. I would like to see a metric of actual system runtime as well, but at the moment the only way to gather that would be through Ceilometer or via Rackspace's StackTach.
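That said, you can approximate it from the standard compute API by combining the instance status with its last launch time. A hedged Python sketch against the Nova v2 REST API; the endpoint, tenant ID, server ID, and token are placeholders, and the OS-SRV-USG:launched_at attribute is only present when that usage extension is enabled:

# Rough approximation only: Nova does not track real guest uptime, so this
# reports time since the last launch for instances that are currently ACTIVE.
import requests
from datetime import datetime

NOVA = "http://controller:8774/v2/TENANT_ID"  # placeholder endpoint
HEADERS = {"X-Auth-Token": "KEYSTONE_TOKEN"}  # placeholder token

resp = requests.get(NOVA + "/servers/SERVER_ID", headers=HEADERS)
resp.raise_for_status()
server = resp.json()["server"]

if server["status"] == "ACTIVE":
    launched = datetime.strptime(server["OS-SRV-USG:launched_at"],
                                 "%Y-%m-%dT%H:%M:%S.%f")
    print("Up since last launch:", datetime.utcnow() - launched)
else:
    print("Not running; status is", server["status"])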

Using snow (and snowfall) with AWS for parallel processing in R

In relation to my earlier, similar SO question, I tried using snow/snowfall on AWS for parallel computing.
What I did was:
In the sfInit() function, I provided the public DNS name to the socketHosts parameter like so:
sfInit(parallel=TRUE,socketHosts =list("ec2-00-00-00-000.compute-1.amazonaws.com"))
The error returned was Permission denied (publickey)
I then followed the instructions (I presume correctly!) on http://www.imbi.uni-freiburg.de/parallel/ in the 'Passwordless Secure Shell (SSH) login' section
I simply cat-ed the contents of the .pem file that I created on AWS into ~/.ssh/authorized_keys on the AWS instance I want to connect to from my master AWS instance, and on the master instance as well
Is there anything I am missing?
I would be very grateful if users can share their experiences in the use of snow on AWS.
Thank you very much for your suggestions.
UPDATE:
I just wanted to update the solution I found to my specific problem:
I used StarCluster to set up my AWS cluster: StarCluster
Installed the snowfall package on all the nodes of the cluster
From the master node, I issued the following commands:
library(snowfall)  # loads snow as well
hostslist <- list("ec2-xxx-xx-xxx-xxx.compute-1.amazonaws.com", "ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com")
sfInit(parallel = TRUE, cpus = 2, type = "SOCK", socketHosts = hostslist)
l <- sfLapply(1:2, function(x) system("ifconfig", intern = TRUE))  # run ifconfig on each worker
lapply(l, function(x) x[2])  # the second line of each result shows the node's inet address
sfStop()
The IP information confirmed that the AWS nodes were being utilized.
It doesn't look that bad, but your .pem file handling is wrong. It is sometimes not that simple, and many people have to fight with these issues. You can find a lot of tips in this post:
https://forums.aws.amazon.com/message.jspa?messageID=241341
Or search Google for other posts.
From my experience, most people have problems in these steps:
Can you log onto the machines via SSH (ssh ec2-00-00-00-000.compute-1.amazonaws.com)? Try to use the public DNS name, not the public IP, to connect.
Check your security groups in AWS to make sure port 22 is open for all machines!
If you plan to start more than 10 worker machines, you should work on an MPI installation on your machines (much better performance!)
Markus from cloudnumbers.com :-)
I believe @Anatoliy is correct: you're using an X.509 certificate. For the precise steps to add the SSH keys, look at the "Types of credentials" section of the EC2 Starters Guide.
To upload your own SSH keys, take a look at this page from Alestic.
It is a little confusing at first, but you'll want to keep clear which are your access keys, your certificates, and your key pairs, which may appear in text files marked DSA or RSA.
