My cluster consists of 3 Linux x64 servers, with 1 controller node and 3 data nodes. The DolphinDB server version is v2.0.1.1.
dfsReplicationFactor=2
dataSync=1
The schema of the database is:
2021.09.09 08:42:57.180: execution was completed [34ms]
partitionSchema->[2021.09.06,2021.09.05,2021.09.04,2021.09.03,2021.09.02,2021.09.01,2021.08.31,2021.08.30,...]
databaseDir->dfs://dwd
engineType->OLAP
partitionSites->
partitionTypeName->VALUE
partitionType->1
When I insert data into the database "dfs://dwd", I get an error:
Failed to add new value partitions to database dfs://dwd. Please manually add new partitions [2021.09.07].
Then I use the following script to manually add partitions:
db=database("dfs://dwd")
addValuePartitions(db,2021.09.07..2021.09.09)
The error is:
<ChunkInRecovery>openChunks failed on '/dwd/domain', chunk cf57375e-b4b3-dc87-9b41-667a5e91a757 is in RECOVERING state
The repair method is as follows:
Step 1: Use getClusterChunksStatus on the controller node to get the chunk ID of '/dwd/domain'. The sample code is shown below:
select * from rpc(getControllerAlias(), getClusterChunksStatus) where file like "%/domain%" and state != 'COMPLETE'
Step 2: Use getAllChunks on the data node to get the partition information for that chunk ID. In the code below, the chunk ID "4503a64f-4f5f-eea4-4247-a0d0fc3941a1" is the one obtained in step 1.
select * from pnodeRun(getAllChunks) where chunkId="4503a64f-4f5f-eea4-4247-a0d0fc3941a1"
Step 3: Use copyReplicas to copy the replica. Assuming the result of step 2 shows that the replica is on datanode3, copy it to datanode1:
rpc(getControllerAlias(), copyReplicas{`datanode3, `datanode1, "4503a64f-4f5f-eea4-4247-a0d0fc3941a1"})
Step 4: Use getClusterChunksStatus to check whether the chunk status is COMPLETE. If it is, the repair was successful.
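For example, re-running the step 1 query is a quick check: once the repair has finished, it should return an empty result.
```
// Same query as step 1: no rows means every replica of the chunk
// has reached COMPLETE and the repair succeeded.
select * from rpc(getControllerAlias(), getClusterChunksStatus) where file like "%/domain%" and state != 'COMPLETE'
```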
I am trying to build a Python application that manages a directed temporal graph (edges happen at a time T). I am using DyNetx and its function for reading the graph from a dataset where every row is a 3-tuple: startNode, endNode, timestamp.
The problem is, I wanted to store every node as a Node object, a class I created with attributes id and status, since the focus of the program is to study influence maximization. I get the id from the file and initialize every node with status 0. But when at some point in the code I change the status of a node, if that node appears later in another edge of the graph, its status is reset to 0. I think this is because, with custom node types, there is no way to check whether a node already exists. So my question is: is there a way to change the function that reads the temporal graph from the file so that if the same id occurs twice, the associated node is the same object and no other objects are created? Or a way to give a status attribute to a node when representing it as an int?
Code below:
```
import dynetx

class Node:
    def __init__(self, id):
        self.id = int(id)
        self.status = 0

g = dynetx.read_snapshots('college.txt', nodetype=Node, timestamptype=int)

for e in g.stream_interactions():
    print(e[0].id)
    print(e[0].status)
```
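One way to express the interning idea from the question is to pass a factory instead of the class itself. This is only a sketch: `_registry` and `node_factory` are names I made up, and it assumes `nodetype` accepts any callable (as in networkx-style edge-list readers), not just a type.
```
import dynetx

class Node:
    _registry = {}  # hypothetical cache: one Node instance per id

    def __init__(self, id):
        self.id = int(id)
        self.status = 0

def node_factory(raw_id):
    # Reuse the existing Node for this id, creating it on first sight,
    # so a node seen in a later edge keeps any status set on it earlier.
    key = int(raw_id)
    if key not in Node._registry:
        Node._registry[key] = Node(key)
    return Node._registry[key]

# nodetype only needs to be callable, so the factory stands in for the class
g = dynetx.read_snapshots('college.txt', nodetype=node_factory, timestamptype=int)
```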
I have a parent package designed in such a way that:
Step 1: a Data Flow Task fetches the list of configurations from the DB and stores it in a recordset.
Step 2: a Foreach Loop calls a child package, with the child's project parameters set to values from the previous task.
Now the scenario: if I get 2 records from step 1, then step 2 executes 2 times sequentially. How do I execute step 2 in parallel for the 2 configurations fetched in step 1?
After googling for some time, I copied the child Execute Package Task 2 times inside the Foreach Loop, but the value assignment from the previous task to the child package parameters is not mapping correctly. Please advise what I am doing incorrectly.
Note: I am looking for a workaround for the Execute Package Task that calls the copies of the child package inside the Foreach Loop of the parent package so that they run asynchronously. The child package is generic and requires parameter binding, where variable values from the parent package are assigned to child package parameters. So each copy of the child package should be able to fetch a different variable value from the list and do the parameter binding. Please let me know if this is possible.
Execute package task parameter bindings
In your Foreach Loop, add an Execute SQL Task with this code, filling in the correct info and parameters:
DECLARE @execution_id BIGINT
EXEC [SSISDB].[catalog].[create_execution]
    @package_name = N'Package1.dtsx'
    , @project_name = N'Project1'
    , @folder_name = N'Folder1'
    , @use32bitruntime = False
    , @reference_id = NULL
    , @execution_id = @execution_id OUTPUT
-- Execute the package
EXEC [SSISDB].[catalog].[start_execution] @execution_id
This will kick off the package and NOT wait.
If you have package parameters to set, add this (between create and start) for each one:
EXEC [SSISDB].[catalog].[set_execution_parameter_value]
    @execution_id
    , @object_type = 30 -- package parameter
    , @parameter_name = N'ParameterName'
    , @parameter_value = 1
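Putting the pieces together, here is a minimal sketch of the complete statement for the Execute SQL Task inside the Foreach Loop. It assumes an ADO.NET connection so the parent package variable can be bound to the named input parameter @ConfigValue on the task's Parameter Mapping tab; all object and parameter names are illustrative.
```
-- Sketch for the Execute SQL Task inside the Foreach Loop.
-- @ConfigValue is assumed to be bound from a parent package variable
-- on the Parameter Mapping tab; all names below are illustrative.
DECLARE @execution_id BIGINT;

EXEC [SSISDB].[catalog].[create_execution]
      @package_name = N'Package1.dtsx'
    , @project_name = N'Project1'
    , @folder_name = N'Folder1'
    , @use32bitruntime = False
    , @reference_id = NULL
    , @execution_id = @execution_id OUTPUT;

EXEC [SSISDB].[catalog].[set_execution_parameter_value]
      @execution_id
    , @object_type = 30 -- package parameter
    , @parameter_name = N'ParameterName'
    , @parameter_value = @ConfigValue;

-- start_execution returns immediately, so each loop iteration
-- kicks off its child execution without waiting for the previous one
EXEC [SSISDB].[catalog].[start_execution] @execution_id;
```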
I have a simple folder tree in Azure Data Lake Gen 2 that is partitioned by date with the following standard folder structure: {yyyy}/{MM}/{dd}. e.g. /Container/folder1/sub_folder/2020/11/01
In each leaf folder, I have some CSV files with a few columns but no timestamp column (the date is already embedded in the folder path).
I am trying to create an ADX external table that will include a virtual column for the date, and then query the data in ADX by date (this is a well-known pattern in Hive and big data in general).
.create-or-alter external table TableName (col1:double, col2:double, col3:double, col4:double)
kind=adl
partition by (Date:datetime)
pathformat = ("/date=" datetime_pattern("year={yyyy}/month={MM}/day={dd}", Date))
dataformat=csv
(
h#'abfss://container#datalake_name.dfs.core.windows.net/folder1/subfolder/;{key}'
)
with (includeHeaders = 'All')
Unfortunately, querying the table fails, and showing artifacts returns an empty list.
external_table("Table Name")
| take 10
.show external table Walmart_2141_OEE artifacts
The query fails with the following exception:
Query execution has resulted in error (0x80070057): Partial query failure: The parameter is incorrect. (message: 'path2
Parameter name: Argument 'path2' failed to satisfy condition 'Can't append a full path': at Concat in C:\source\Src\Common\Kusto.Cloud.Platform\Utils\UriPath.cs: line 25:
I tried many variants of pathformat and datetime_pattern as described in the documentation, but nothing worked.
Any ideas?
According to your description, the following definition should work:
.create-or-alter external table TableName (col1:double, col2:double, col3:double, col4:double)
kind=adl
partition by (Date:datetime)
pathformat = (datetime_pattern("yyyy/MM/dd", Date))
dataformat=csv
(
h#'abfss://container#datalake_name.dfs.core.windows.net/folder1/subfolder;{key}'
)
with (includeHeaders = 'All')
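If the table then resolves correctly, the virtual Date column can be used to filter by partition at query time, for example (the date range is illustrative):
```
external_table("TableName")
| where Date >= datetime(2020-11-01) and Date < datetime(2020-11-02)
| take 10
```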
I am new to Neo4j and Cypher queries. My model is: each shop has 2 chillers, each chiller has 2 PLCs, and each PLC in turn has 2 sensors.
The create statements are as below:
Create(:SHOP{name:"Shop1"})-[:hasChiller]->(:CHILLER{name:"Chiller1"})
Create(:SHOP{name:"Shop1"})-[:hasChiller]->(:CHILLER{name:"Chiller2"})
Create(:SHOP{name:"Shop2"})-[:hasChiller]->(:CHILLER{name:"Chiller3"})
Create(:SHOP{name:"Shop2"})-[:hasChiller]->(:CHILLER{name:"Chiller4"})
Create(:CHILLER{name:"Chiller1"})-[:hasPLC]->(:PLC{name:"Plc1"})
Create(:CHILLER{name:"Chiller1"})-[:hasPLC]->(:PLC{name:"Plc2"})
Create(:CHILLER{name:"Chiller2"})-[:hasPLC]->(:PLC{name:"Plc3"})
Create(:CHILLER{name:"Chiller2"})-[:hasPLC]->(:PLC{name:"Plc4"})
Create(:CHILLER{name:"Chiller3"})-[:hasPLC]->(:PLC{name:"Plc5"})
Create(:CHILLER{name:"Chiller3"})-[:hasPLC]->(:PLC{name:"Plc6"})
Create(:CHILLER{name:"Chiller4"})-[:hasPLC]->(:PLC{name:"Plc7"})
Create(:CHILLER{name:"Chiller4"})-[:hasPLC]->(:PLC{name:"Plc8"})
Create(:PLC{name:"Plc1"})-[:hasSensor]->(:SENSOR{name:"Sensor1"})
Create(:PLC{name:"Plc1"})-[:hasSensor]->(:SENSOR{name:"Sensor2"})
Create(:PLC{name:"Plc2"})-[:hasSensor]->(:SENSOR{name:"Sensor3"})
Create(:PLC{name:"Plc2"})-[:hasSensor]->(:SENSOR{name:"Sensor4"})
Create(:PLC{name:"Plc3"})-[:hasSensor]->(:SENSOR{name:"Sensor5"})
Create(:PLC{name:"Plc3"})-[:hasSensor]->(:SENSOR{name:"Sensor6"})
Create(:PLC{name:"Plc4"})-[:hasSensor]->(:SENSOR{name:"Sensor7"})
Create(:PLC{name:"Plc4"})-[:hasSensor]->(:SENSOR{name:"Sensor8"})
Create(:PLC{name:"Plc5"})-[:hasSensor]->(:SENSOR{name:"Sensor9"})
Create(:PLC{name:"Plc5"})-[:hasSensor]->(:SENSOR{name:"Sensor10"})
Create(:PLC{name:"Plc6"})-[:hasSensor]->(:SENSOR{name:"Sensor11"})
Create(:PLC{name:"Plc6"})-[:hasSensor]->(:SENSOR{name:"Sensor12"})
Create(:PLC{name:"Plc7"})-[:hasSensor]->(:SENSOR{name:"Sensor13"})
Create(:PLC{name:"Plc7"})-[:hasSensor]->(:SENSOR{name:"Sensor14"})
Create(:PLC{name:"Plc8"})-[:hasSensor]->(:SENSOR{name:"Sensor15"})
Create(:PLC{name:"Plc8"})-[:hasSensor]->(:SENSOR{name:"Sensor16"})
However, the MATCH to get the sensors under Shop1
MATCH(s:SHOP{name:"Shop1"})-[:hasChiller]->(cc:CHILLER)-[:hasPLC]->(pp:PLC)-[:hasSensor]->(ss:SENSOR) return ss.name
returns nothing. It reports no changes and no records.
I am trying this out in the Neo4j sandbox environment. I did this based on my understanding of the MATCH clause in SQL Server Graph 2019, where this works.
Can anyone point out where I am going wrong?
You are improperly creating multiple instances of the "same" node. You should create each node once, and then use its bound variable name later on when you need to create relationships involving that node.
Delete all your data and follow this pattern instead (you have to fill in the "..." parts):
CREATE
(sh1:SHOP{name:"Shop1"}), (sh2:SHOP{name:"Shop2"}),
(c1:CHILLER{name:"Chiller1"}), (c2:CHILLER{name:"Chiller2"}),(c3:CHILLER{name:"Chiller3"}), (c4:CHILLER{name:"Chiller4"}),
(p1:PLC{name:"Plc1"}), ..., (p8:PLC{name:"Plc8"}),
(se1:SENSOR{name:"Sensor1"}), ..., (se16:SENSOR{name:"Sensor16"}),
(sh1)-[:hasChiller]->(c1), (sh1)-[:hasChiller]->(c2),
... // create remaining relationships using bound variable names for nodes
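For illustration, here is a minimal end-to-end sketch of the same pattern with a single (made-up) node per level; because each node is created exactly once and reused via its bound variable, the chain is connected and the original MATCH shape finds the sensor:
```
CREATE
  (sh:SHOP {name: "ShopX"}),
  (c:CHILLER {name: "ChillerX"}),
  (p:PLC {name: "PlcX"}),
  (se:SENSOR {name: "SensorX"}),
  (sh)-[:hasChiller]->(c),
  (c)-[:hasPLC]->(p),
  (p)-[:hasSensor]->(se);

// Each node exists once, so the full chain matches:
MATCH (s:SHOP {name: "ShopX"})-[:hasChiller]->(:CHILLER)-[:hasPLC]->(:PLC)-[:hasSensor]->(ss:SENSOR)
RETURN ss.name;
```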
I have a requirement to set up CDC from source (Oracle) to target (BigQuery) using GoldenGate.
My only option is to filter data on the replicat side, based on a specific column name.
As per the below link:
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/using-oracle-goldengate-parameter-files.html#GUID-7F405A81-B2D1-4072-B254-DC2B0EC56FBA
I have set up the replicat like below:
REPLICAT RPOC
TARGETDB LIBFILE libggjava.so SET property=dirprm/bqpoc.props
SOURCEDEFS /app/oracle/ogg_bigdata/dirdef/poc.def
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 500
MAP ARADMINPI.TPOC, TARGET PRD.TPOCFL,KEYCOLS(ID),WHERE (NAME= ?SOUVIKPOC);
===================================
export SOUVIKPOC='Smith'
But I am getting the below error:
2020-02-19 05:47:37 ERROR OGG-01157 Error in WHERE clause for ARADMINPI.TPOC.
=============================
Is there anything I am doing wrong here?
For Parameter Substitution to work, you'll need to enclose ?SOUVIKPOC in quotes, like this:
MAP ARADMINPI.TPOC, TARGET PRD.TPOCFL,KEYCOLS(ID),WHERE (NAME= '?SOUVIKPOC');
There should also be additional information about the failure earlier in the report file.
Another Example Using @GETENV
Another option is to use the @GETENV function instead of Parameter Substitution. Here, the MAP statement uses a FILTER clause instead of the WHERE clause:
MAP ARADMINPI.TPOC, TARGET PRD.TPOCFL,KEYCOLS(ID),
FILTER (@STREQ(NAME, @GETENV('OSVARIABLE', 'SOUVIKPOC')));
Unless you set the SOUVIKPOC environment variable prior to running GGSCI (and executing START MGR), you need to add a SETENV statement to your parameter file:
SETENV (SOUVIKPOC = 'Smith')
Putting it all together:
REPLICAT RPOC
TARGETDB LIBFILE libggjava.so SET property=dirprm/bqpoc.props
SOURCEDEFS /app/oracle/ogg_bigdata/dirdef/poc.def
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 500
SETENV (SOUVIKPOC = 'Smith')
MAP ARADMINPI.TPOC, TARGET PRD.TPOCFL,KEYCOLS(ID),
FILTER (@STREQ(NAME, @GETENV('OSVARIABLE', 'SOUVIKPOC')));