Why is the NebulaGraph tag index not rebuilt? - nebula-graph

I created a tag index and tried to rebuild it in the NebulaGraph database, but it failed. The log shows "This space is building index".
I ran SHOW JOBS, and there were no running jobs. The error comes from this check in RebuildIndexTask.cpp:
std::vector<AdminSubTask> tasks;
for (auto it = env_->rebuildIndexGuard_->cbegin();
     it != env_->rebuildIndexGuard_->cend(); ++it) {
    if (std::get<0>(it->first) == space_ && it->second != IndexState::FINISHED) {
        LOG(ERROR) << "This space is building index";
        return cpp2::ErrorCode::E_REBUILD_INDEX_FAILED;
    }
}
The following is the log from when the index failed to rebuild:
E1226 10:44:49.166803 424281 UpdateNode.h:183] vertex conflict 169:96:171:0dbb49c96ba38d8d49a9cccca0de9e28
E1226 10:44:52.144032 424285 UpdateNode.h:183] vertex conflict 169:13:171:5b09d9d36aa2d8e5d9c5e15d3c1d8245
E1226 10:47:46.856619 265937 Host.cpp:348] [Port: 9780, Space: 128, Part: 233] [Host: 172.23.236.100:9990] Failed to append logs to the host (Err: E_TERM_OUT_OF_DATE)
E1226 10:49:56.923763 424242 RebuildIndexTask.cpp:63] This space is building index
E1226 10:49:56.923795 424242 AdminTaskManager.cpp:254] job 183, genSubTask failed, err=E_REBUILD_INDEX_FAILED
There are also many of the following errors in the log:
Failed to append logs to the host (Err: E_TERM_OUT_OF_DATE)
E1226 11:19:29.156589 424281 UpdateNode.h:183] vertex conflict 169:21:170:13b689b5cb0c1c415849c250708e4f5e
E1226 11:19:29.284868 424280 UpdateNode.h:183] vertex conflict 169:62:170:2146dc6004a8200ed3c1cca1a7ae7008
E1226 11:19:31.090054 265903 AddVerticesProcessor.cpp:164] The vertex locked : tag 170, vid 3dd7a92570f73d4c8541a6478bbecd65
E1226 11:19:32.039448 424280 UpdateNode.h:183] vertex conflict 169:13:170:1b3b3f5d0387352399feefd670b4c587
E1226 11:19:32.039489 424281 UpdateNode.h:183] vertex conflict 169:13:170:1b3b3f5d0387352399feefd670b4c587
E1226 11:19:32.143121 424282 UpdateNode.h:183] vertex conflict 169:2:170:2290a0580ffec7a1bb30d86dcd366571
Can you help me with this issue?

This is a known issue in NebulaGraph v2.6, and it was fixed in the 3.x releases.
I restarted the storage service on the host where the tag index rebuild had failed, and the rebuild then worked.
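For reference, a minimal sketch of the recovery steps (the service script path assumes a default package install; the space and index names are placeholders):

# on the storaged host that reported E_REBUILD_INDEX_FAILED
sudo /usr/local/nebula/scripts/nebula.service restart storaged

# then, from a graph client session:
# USE your_space;
# REBUILD TAG INDEX your_tag_index;
# SHOW JOBS;   -- the rebuild job should now progress to FINISHED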

Related

New EKS node instance not able to join cluster, getting "cni plugin not initialized"

I am pretty new to Terraform and am trying to create a new EKS cluster with a node group and a launch template. The EKS cluster, node group, launch template, and nodes were all created successfully. However, when I changed the desired size of the node group (using Terraform or the AWS Management Console), the scale-up would fail. No error was reported in the node group's Health issues tab. I dug further and found that new instances were launched by the Auto Scaling group, but the new instances were not able to join the cluster.
Looking into the troubled instances, I found the following log entries by running "sudo journalctl -f -u kubelet":
Jan 27 19:32:32 ip-10-102-21-129.us-east-2.compute.internal kubelet[3168]: E0127 19:32:32.612322 3168 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get node info: node "ip-10-102-21-129.us-east-2.compute.internal" not found"
Jan 27 19:32:32 ip-10-102-21-129.us-east-2.compute.internal kubelet[3168]: E0127 19:32:32.654501 3168 kubelet.go:2427] "Error getting node" err="node "ip-10-102-21-129.us-east-2.compute.internal" not found"
Jan 27 19:32:32 ip-10-102-21-129.us-east-2.compute.internal kubelet[3168]: E0127 19:32:32.755473 3168 kubelet.go:2427] "Error getting node" err="node "ip-10-102-21-129.us-east-2.compute.internal" not found"
Jan 27 19:32:32 ip-10-102-21-129.us-east-2.compute.internal kubelet[3168]: E0127 19:32:32.776238 3168 kubelet.go:2352] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Jan 27 19:32:32 ip-10-102-21-129.us-east-2.compute.internal kubelet[3168]: E0127 19:32:32.856199 3168 kubelet.go:2427] "Error getting node" err="node "ip-10-102-21-129.us-east-2.compute.internal" not found"
Looked like the issue has something to do with the cni add-ons, googled it and others suggest to check for the log inside the /var/log/aws-routed-eni directory. I could find that directory and logs in the working nodes (the ones created initialy when the eks cluster was created), but the same directory and log files do not exist in the newly launch instances nodes (the one created after the cluster was created and by changing the desired node size)
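For anyone checking the same thing, a rough comparison of a working node against a broken one looks like this (directory and file names are the usual VPC CNI ones; adjust as needed):

# on a working node the VPC CNI has written its logs
ls /var/log/aws-routed-eni/        # typically ipamd.log and plugin.log
# on a broken node the directory does not exist
ls /var/log/aws-routed-eni/        # "No such file or directory"
# and kubelet keeps reporting the CNI as uninitialized
sudo journalctl -u kubelet | grep -i cni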
The image I used for the node-group is ami-0af5eb518f7616978 (amazon/amazon-eks-node-1.24-v20230105)
Here is what my script looks like:
resource "aws_eks_cluster" "eks-cluster" {
name = var.mod_cluster_name
role_arn = var.mod_eks_nodes_role
version = "1.24"
vpc_config {
security_group_ids = [var.mod_cluster_security_group_id]
subnet_ids = var.mod_private_subnets
endpoint_private_access = "true"
endpoint_public_access = "true"
}
}
resource "aws_eks_node_group" "eks-cluster-ng" {
cluster_name = aws_eks_cluster.eks-cluster.name
node_group_name = "eks-cluster-ng"
node_role_arn = var.mod_eks_nodes_role
subnet_ids = var.mod_private_subnets
#instance_types = ["t3a.medium"]
scaling_config {
desired_size = var.mod_asg_desired_size
max_size = var.mod_asg_max_size
min_size = var.mod_asg_min_size
}
launch_template {
#name = aws_launch_template.eks_launch_template.name
id = aws_launch_template.eks_launch_template.id
version = aws_launch_template.eks_launch_template.latest_version
}
lifecycle {
create_before_destroy = true
}
}
resource "aws_launch_template" "eks_launch_template" {
name = join("", [aws_eks_cluster.eks-cluster.name, "-launch-template"])
vpc_security_group_ids = [var.mod_node_security_group_id]
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = var.mod_ebs_volume_size
volume_type = "gp2"
#encrypted = false
}
}
lifecycle {
create_before_destroy = true
}
image_id = var.mod_ami_id
instance_type = var.mod_eks_node_instance_type
metadata_options {
http_endpoint = "enabled"
http_tokens = "required"
http_put_response_hop_limit = 2
}
user_data = base64encode(<<-EOF
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="
--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
set -ex
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
B64_CLUSTER_CA=${aws_eks_cluster.eks-cluster.certificate_authority[0].data}
API_SERVER_URL=${aws_eks_cluster.eks-cluster.endpoint}
K8S_CLUSTER_DNS_IP=172.20.0.10
/etc/eks/bootstrap.sh ${aws_eks_cluster.eks-cluster.name} --apiserver-endpoint $API_SERVER_URL --b64-cluster-ca $B64_CLUSTER_CA
--==MYBOUNDARY==--\
EOF
)
tag_specifications {
resource_type = "instance"
tags = {
Name = "EKS-MANAGED-NODE"
}
}
}
Another thing I noticed is that I tagged the instance Name as "EKS-MANAGED-NODE". That tag showed up correctly on the nodes created when the EKS cluster was created. However, on any new nodes created afterward, the Name changed to "EKS-MANAGED-NODEGROUP-NODE".
I wonder if that indicates an issue?
I checked the log and confirmed that the user data was picked up and ran when the instances started up.
sh-4.2$ more user-data.log
B64_CLUSTER_CA=LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJek1ERXlOekU
0TlRrMU1Wb1hEVE16TURFeU5E (deleted the rest)
API_SERVER_URL=https://EC283069E9FF1B33CD6C59F3E3D0A1B9.gr7.us-east-2.eks.amazonaws.com
K8S_CLUSTER_DNS_IP=172.20.0.10
/etc/eks/bootstrap.sh dev-test-search-eks-oVpBNP0e --apiserver-endpoint https://EC283069E9FF1B33CD6C59F3E3D0A1B9.gr7.us-east-2.eks.amazonaws.com --b64-cluster-ca LS0tLS
1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakND...(deleted the rest)
Using kubelet version 1.24.7
true
Using containerd as the container runtime
true
‘/etc/eks/containerd/containerd-config.toml’ -> ‘/etc/containerd/config.toml’
‘/etc/eks/containerd/sandbox-image.service’ -> ‘/etc/systemd/system/sandbox-image.service’
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /usr/lib/systemd/system/containerd.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/sandbox-image.service to /etc/systemd/system/sandbox-image.service.
‘/etc/eks/containerd/kubelet-containerd.service’ -> ‘/etc/systemd/system/kubelet.service’
Created symlink from /etc/sy
I confirmed that the specified role has all the required permissions; the role is used in another EKS cluster, and I am trying to create a new cluster based on that existing one using Terraform.
I tried removing the launch template and letting AWS use the default one. Then the new nodes had no issue joining the cluster.
I looked at my launch template script and at the registry documentation (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template); nowhere does it mention that I need to manually add or run the CNI plugin.
So I don't understand why the CNI plugin was not installed automatically and why the instances are not able to join the cluster.
Any help is appreciated.

Bus error on usage of rusqlite with spatialite extension

I'm seeing a bus error on cargo run when attempting to load the spatialite extension with rusqlite:
Finished dev [unoptimized + debuginfo] target(s) in 1.19s
Running `target/debug/rust-spatialite-example`
[1] 33253 bus error cargo run --verbose
My suspicion is that there's a mismatch between the SQLite version and spatialite, and that they need to be built together rather than using the bundled feature of rusqlite, though it seems like that would result in a different error?
Here's how things are set up:
Cargo.toml
[package]
name = "rust-spatialite-example"
version = "0.0.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
rusqlite = { version = "0.28.0", features = ["load_extension", "bundled"] }
init.sql
CREATE TABLE place (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
SELECT AddGeometryColumn('place', 'geom', 4326, 'POINT', 'XY', 0);
SELECT CreateSpatialIndex('place', 'geom');
main.rs
use rusqlite::{Connection, Result, LoadExtensionGuard};

#[derive(Debug)]
struct Place {
    id: i32,
    name: String,
    geom: String,
}

fn load_spatialite(conn: &Connection) -> Result<()> {
    unsafe {
        let _guard = LoadExtensionGuard::new(conn)?;
        conn.load_extension("/opt/homebrew/Cellar/libspatialite/5.0.1_2/lib/mod_spatialite", None)
    }
}

fn main() -> Result<()> {
    let conn = Connection::open("./geo.db")?;
    load_spatialite(&conn)?;
    // ... sql statements that aren't executed
    Ok(())
}
Running:
cat init.sql | spatialite geo.db
cargo run
The mod_spatialite path is correct (there's an expected SqliteFailure error when that path is wrong). I tried explicitly setting sqlite3_modspatialite_init as the entry point and the behavior stayed the same.
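The explicit entry-point attempt mentioned above would look roughly like this (same Homebrew path as before; sqlite3_modspatialite_init is the entry point name documented by spatialite):

fn load_spatialite(conn: &Connection) -> Result<()> {
    unsafe {
        let _guard = LoadExtensionGuard::new(conn)?;
        // pass the entry point explicitly rather than relying on the default lookup
        conn.load_extension(
            "/opt/homebrew/Cellar/libspatialite/5.0.1_2/lib/mod_spatialite",
            Some("sqlite3_modspatialite_init"),
        )
    }
}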

Edges collection undefined until _collections() operation is used

I'm using ArangoDB 3.4.2 and I have a weird problem that I'm not able to explain...
I create a graph (myGraph) as follows in arangosh:
var graph_module = require('@arangodb/general-graph');
var myGraph = graph_module._create('mygraph');
myGraph._addVertexCollection('vertexes');
var edges = graph_module._relation('edges', ['vertexes'], ['vertexes']);
myGraph._extendEdgeDefinitions(edges);
Here vertexes and edges are the collections for vertices and edges, respectively.
Now, I create two vertexes:
db.vertexes.save({"name": "A", "_key": "A"});
db.vertexes.save({"name": "B", "_key": "B"});
So far so good. But now I try to create the edge between them and it fails:
127.0.0.1:8529#myDB> db.edges.save("vertexes/A", "vertexes/B", {"name": "A-to-B"});
JavaScript exception: TypeError: Cannot read property 'save' of undefined
!db.edges.save("vertexes/A", "vertexes/B", {"name": "A-to-B"});
! ^
stacktrace: TypeError: Cannot read property 'save' of undefined
at <shell command>:1:9
It seems that db.edges is undefined:
127.0.0.1:8529#MyDB> console.log(db.edges)
2019-01-26T19:01:52Z [98311] INFO undefined
But now, if I run db._collections() it seems that db.edges gets defined (weird!)
127.0.0.1:8529#MyDB> db._collections()
...
127.0.0.1:8529#MyDB> console.log(db.edges)
2019-01-26T19:02:58Z [98311] INFO [ArangoCollection 16807, "edges" (type edge, status loaded)]
and at that moment, the db.edges.save(...) operation works:
127.0.0.1:8529#MyDB> db.edges.save("vertexes/A", "vertexes/B", {"name": "A-to-B"});
{
"_id" : "edges/16899",
"_key" : "16899",
"_rev" : "_YGsKKq2--_"
}
Why is db.edges undefined at the first save()? Why does a show collections operation (which I understand is read-only) get it defined? Maybe I'm doing something wrong?
When db.edges is accessed, an internal cache of collection objects is consulted. If this cache is clear, db.edges.save() falls through to the server and saves the edge. Since db._collections() resets this cache, the command works when run afterwards. However, if the cache is stale (it was populated before the edges collection was created), db.edges resolves to undefined and the error you observed is thrown.
The correct and safe way is to access the collection via db._collection("collection-name").
Therefore you can use the following command to save an edge in the edges collection:
db._collection("edges").save("vertexes/A", "vertexes/B", {"name": "A-to-B"});

Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR when using openstacksdk to create_server

When I create the OpenStack server, I get the exception below:
Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR
My code is below:
server_args = {
    "name": server_name,
    "image_id": image_id,
    "flavor_id": flavor_id,
    "networks": [{"uuid": network.id}],
    "admin_password": admin_password,
}
try:
    server = user_conn.conn.compute.create_server(**server_args)
    server = user_conn.conn.compute.wait_for_server(server)
except Exception as e:  # this is where I catch the exception
    raise e
When calling create_server, my server_args data is below:
{'flavor_id': 'd4424892-4165-494e-bedc-71dc97a73202', 'networks': [{'uuid': 'da4e3433-2b21-42bb-befa-6e1e26808a99'}], 'admin_password': '123456', 'name': '133456', 'image_id': '60f4005e-5daf-4aef-a018-4c6b2ff06b40'}
My openstacksdk version is 0.9.18.
In the end, I found the flavor was too big for the OpenStack compute node, so I changed it to a smaller flavor, and then the server was created successfully.
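For reference, a quick way to see which flavors would actually fit the compute node is to list them on the same connection (a sketch assuming the user_conn object from the question; attribute names per the openstacksdk Flavor resource):

for flavor in user_conn.conn.compute.flavors():
    # pick a flavor whose vcpus/ram/disk the compute node can actually provide
    print(flavor.name, flavor.vcpus, flavor.ram, flavor.disk)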

Index state never change to ENABLED on Titan with Amazon DynamoDB backend

I'm trying to use a composite index on DynamoDB, and the index never switches from the INSTALLED to the REGISTERED state.
Here is the code I used to create it:
graph.tx().rollback(); // Never create new indexes while a transaction is active
TitanManagement mgmt = graph.openManagement();
PropertyKey propertyKey = getOrCreateIfNotExist(mgmt, "propertyKeyName");
String indexName = makePropertyKeyIndexName(propertyKey);
if (mgmt.getGraphIndex(indexName) == null) {
    mgmt.buildIndex(indexName, Vertex.class).addKey(propertyKey).buildCompositeIndex();
    mgmt.commit();
    graph.tx().commit();
    ManagementSystem.awaitGraphIndexStatus(graph, indexName).status(SchemaStatus.REGISTERED).call();
} else {
    mgmt.rollback();
}
A sample of the log is:
...
...
612775 [main] INFO  com.thinkaurelius.titan.graphdb.database.management.GraphIndexStatusWatcher - Some key(s) on index myIndex do not currently have status REGISTERED: type=INSTALLED
613275 [main] INFO  com.thinkaurelius.titan.graphdb.database.management.GraphIndexStatusWatcher - Some key(s) on index typeIndex do not currently have status REGISTERED: type=INSTALLED
613275 [main] INFO  com.thinkaurelius.titan.graphdb.database.management.GraphIndexStatusWatcher - Timed out (PT1M) while waiting for index typeIndex to converge on status REGISTERED
Waiting for a longer time does the trick. Example:
ManagementSystem.awaitGraphIndexStatus(graph, propertyKeyIndexName)
.status(SchemaStatus.ENABLED)
.timeout(10, ChronoUnit.MINUTES) // set timeout to 10 min
.call();
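If the index still sits at REGISTERED even after a longer wait, a common follow-up (a sketch using the standard Titan management API, not part of the answer above) is to explicitly enable it and then await ENABLED:

TitanManagement mgmt = graph.openManagement();
mgmt.updateIndex(mgmt.getGraphIndex(indexName), SchemaAction.ENABLE_INDEX);
mgmt.commit();
ManagementSystem.awaitGraphIndexStatus(graph, indexName)
    .status(SchemaStatus.ENABLED)
    .timeout(10, ChronoUnit.MINUTES)
    .call();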
