Accessing and comparing properties of a stored edge in a traversal - gremlin

I have the following graph model:
I want to select the user who has performed an action via the controller. The performed-by edge contains a used_user_key property, which I want to use to select the called-by edge connected to the required user, on the following condition: called-by.user_key == performed-by.used_user_key.
I store the performed-by edge as action_edge and try to use the stored value in a has step.
Problem: has('user_key', select('action_edge').values('used_user_key')) yields a random edge.
Question: How should I reference a property of the stored edge in a has step?
GraphDB: JanusGraph 0.5.2
gremlinpython: 3.5.0
Python snippet for reproducing the issue:
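# Assumes `g` is a GraphTraversalSource that is already connected to the graph.
from gremlin_python.process.graph_traversal import select
user_a = g.addV('user').property('name', 'a').next()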
user_b = g.addV('user').property('name', 'b').next()
user_c = g.addV('user').property('name', 'c').next()
controller = g.addV('controller').property('name', 'controller').next()
action = g.addV('action').property('name', 'action').next()
g.V(user_a).as_('to').V(controller).as_('from') \
.addE('called-by') \
.property('user_key', 'user_a') \
.to('to') \
.next()
g.V(user_b).as_('to').V(controller).as_('from') \
.addE('called-by') \
.property('user_key', 'user_b') \
.to('to') \
.next()
g.V(user_c).as_('to').V(controller).as_('from') \
.addE('called-by') \
.property('user_key', 'user_c') \
.to('to') \
.next()
g.V(controller).as_('to').V(action).as_('from') \
.addE('performed-by') \
.property('used_user_key', 'user_a') \
.to('to') \
.next()
# Works as expected!
user_perming_the_action = g.V(action).outE('performed-by').as_('action_edge').inV() \
.outE('called-by').has('user_key', 'user_a').inV() \
.next()
assert user_a.id == user_perming_the_action.id
# Selects a random user - the action_edge.used_user_key value is ignored entirely
user_perming_the_action = g.V(action).outE('performed-by').as_('action_edge').inV() \
.outE('called-by').has('user_key', select('action_edge').values('used_user_key')).inV()
# Why does it yield 3 edges instead of 1?
assert user_perming_the_action.clone().count().next() == 3
# Returns a random user
assert user_a.id == user_perming_the_action.clone().next().id
Thanks for your help in advance!

After some research, I found the following solution to the problem:
from gremlin_python.process.traversal import eq
user_perming_the_action = g.V(action).outE('performed-by').as_('action_edge').inV() \
.outE('called-by').where(eq('action_edge')).by('user_key').by('used_user_key').inV() \
.next()
assert user_a.id == user_perming_the_action.id
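The where step compares the two edges on properties with different names by using two by modulators: the first by applies to the current called-by edge (user_key), the second to the stored action_edge (used_user_key).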
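To see why the has step let every edge through while where with two by modulators did not, here is a minimal plain-Python model of the two filters. It is only a sketch and not gremlinpython code: the dictionaries and the helper names has_with_traversal and where_eq_by are hypothetical, and it assumes (as I understand TinkerPop's has(key, traversal) overload) that the inner traversal is run as a filter starting from the property value, so the edge passes whenever that traversal produces any result.

```python
# Plain-Python model of the two filters (hypothetical data, not gremlinpython).
called_by_edges = [
    {"user_key": "user_a", "in_v": "a"},
    {"user_key": "user_b", "in_v": "b"},
    {"user_key": "user_c", "in_v": "c"},
]
action_edge = {"used_user_key": "user_a"}


def has_with_traversal(edge, stored_edge):
    # Assumed semantics of has('user_key', <traversal>): the traversal starts
    # at the property value and the edge passes if it yields anything at all.
    # select('action_edge').values('used_user_key') always yields one value,
    # so edge['user_key'] is never actually compared with it.
    inner_results = [stored_edge["used_user_key"]]
    return len(inner_results) > 0


def where_eq_by(edge, stored_edge):
    # where(eq('action_edge')).by('user_key').by('used_user_key'):
    # the first by() projects the current edge, the second the stored edge.
    return edge["user_key"] == stored_edge["used_user_key"]


loose = [e["in_v"] for e in called_by_edges if has_with_traversal(e, action_edge)]
strict = [e["in_v"] for e in called_by_edges if where_eq_by(e, action_edge)]
print(loose)
print(strict)
```

Under that assumption the first list contains all three users, which matches the count of 3 observed in the question, while the second contains only user a.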

Related

How can I add a user to a protected branch?

I would like to configure my gitlab project so that every maintainer can merge (after review) but nobody can push on master; only a bot (for release).
I'm using terraform to configure my gitlab, with something like this:
resource "gitlab_branch_protection" "BranchProtect" {
project = local.project_id
branch = "master"
push_access_level = "no one"
merge_access_level = "maintainer"
}
But we have a "premium" version, and the terraform provider does not allow adding a user (see: https://github.com/gitlabhq/terraform-provider-gitlab/issues/165 ).
So what I would like to do is send some HTTP requests to the API to add the specific user.
I'm doing it like this:
1. get the current protection
2. delete the current configuration
3. update the retrieved configuration with what I want
4. push the new configuration
BTW: I have not found how to just update the configuration... https://docs.gitlab.com/ee/api/protected_branches.html
TMP_FILE=$(mktemp)
http GET \
$GITLAB_URL/api/v4/projects/$pid/protected_branches \
PRIVATE-TOKEN:$GITLAB_TOKEN \
name=$BRANCH_NAME \
| \
jq \
--arg uid $USER_ID \
'.[0] | .push_access_levels |= . + [{user_id: ($uid | tonumber)}]' \
> $TMP_FILE
http DELETE \
"$GITLAB_URL/api/v4/projects/$pid/protected_branches/$BRANCH_NAME" \
PRIVATE-TOKEN:$GITLAB_TOKEN
http --verbose POST \
"$GITLAB_URL/api/v4/projects/$pid/protected_branches" \
PRIVATE-TOKEN:$GITLAB_TOKEN \
< $TMP_FILE
But my problem is that the resulting configuration is not what I expect, I've got something like this:
"push_access_levels": [
{
"access_level": 40,
"access_level_description": "Maintainers",
"group_id": null,
"user_id": null
}
],
How can I just update the branch protection to add a single user?
OK, as they say: RTFM!
The API accepts an allowed_to_push[][user_id] parameter when protecting a branch, but you need to delete the existing rule before adding the new configuration.
# Remove the existing protection rule first.
http \
DELETE \
"$GITLAB_URL/api/v4/projects/$pid/protected_branches/$BRANCH_NAME" \
PRIVATE-TOKEN:$GITLAB_TOKEN
# Re-create the rule, granting push access to the specific user.
http \
POST \
"$GITLAB_URL/api/v4/projects/$pid/protected_branches" \
PRIVATE-TOKEN:$GITLAB_TOKEN \
name==${BRANCH_NAME} \
push_access_level==0 \
merge_access_level==40 \
unprotect_access_level==40 \
"allowed_to_push[][user_id]==$USER_ID"
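The same POST request can also be assembled in Python. The sketch below only builds the URL with the query parameters used in the answer and sends nothing; the function name protect_branch_url and the example host, project id, and user id are hypothetical.

```python
from urllib.parse import urlencode


def protect_branch_url(base_url, project_id, branch, user_id):
    """Build the protected_branches URL with the answer's query parameters."""
    params = [
        ("name", branch),
        ("push_access_level", 0),        # 0 = no one
        ("merge_access_level", 40),      # 40 = maintainers
        ("unprotect_access_level", 40),
        ("allowed_to_push[][user_id]", user_id),
    ]
    # urlencode percent-encodes the square brackets in the parameter name.
    query = urlencode(params)
    return f"{base_url}/api/v4/projects/{project_id}/protected_branches?{query}"


# Hypothetical example values.
url = protect_branch_url("https://gitlab.example.com", 123, "master", 42)
print(url)
```

The resulting URL would then be sent with a POST request carrying the PRIVATE-TOKEN header, after the DELETE request has removed the old rule.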

Finetuning BERT on Custom data using Colab

I am running run_lm_finetuning on Colab to fine-tune CamemBERT on a custom vocabulary.
I am using the following parameters:
!python run_lm_finetuning.py \
--output_dir Skander/ \
--model_type camembert \
--model_name_or_path camembert-base \
--do_train \
--train_data_file="Text.txt" \
--line_by_line \
--mlm \
--per_gpu_train_batch_size=32 \
--num_train_epochs=3
However, I am getting the following error:
tcmalloc: large alloc 1264730112 bytes == 0xe87fe000 @ 0x7f9828a8f1e7 0x5ad4cb 0x4bb356 0x5bd993 0x50a8af 0x50c5b9 0x508245 0x509642 0x595311 0x54a6ff 0x551b81 0x5aa6ec 0x50abb3 0x50d390 0x508245 0x50a080 0x50aa7d 0x50d390 0x508245 0x50a080 0x50aa7d 0x50c5b9 0x508245 0x50b403 0x635222 0x6352d7 0x638a8f 0x639631 0x4b0f40 0x7f982868cb97 0x5b2fda
^C
Does anyone have an idea about this error?

Ignore one item in for ...loop Robot framework

Loop Delete user
    ${fruits}    create list    locnx    huongpt1    xuanhh    lynch
    :FOR    ${fruit}    IN    @{fruits}
    \    Log    ${fruit}
    \    go to    http://sssss.info:8080/secure/admin/user/UserBrowser.jspa
    \    input text    id=user-filter-userSearchFilter    ${fruit}
    \    Click button    id=user-filter-submit
    \    Wait Until Page Contains    ${fruit}    3
My idea:
If the user exists in the search result, then click the "Delete" button.
If the user doesn't exist in the search result (e.g. "locnx"), then skip it and continue with the next one ("huongpt1").
How can I do it? Please help me.
Here is one way to do it:
Loop Delete user
    ${fruits}    create list    locnx    huongpt1    xuanhh    lynch
    :FOR    ${fruit}    IN    @{fruits}
    \    Log    ${fruit}
    \    go to    http://sssss.info:8080/secure/admin/user/UserBrowser.jspa
    \    input text    id=user-filter-userSearchFilter    ${fruit}
    \    Click button    id=user-filter-submit
    \    ${status}    ${value} =    Run Keyword And Ignore Error    Wait Until Page Contains    ${fruit}    3
    \    Run Keyword If    '${status}' == 'PASS'    keyword_to_delete
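The control flow of that loop can be sketched in plain Python. This is not Robot Framework code: run_keyword_and_ignore_error below is a hypothetical stand-in that mimics how Robot Framework's Run Keyword And Ignore Error returns a ('PASS' or 'FAIL', value) pair, and wait_until_page_contains and the page text are made up for illustration.

```python
def run_keyword_and_ignore_error(keyword, *args):
    """Return ('PASS', result) on success or ('FAIL', message) on error."""
    try:
        return "PASS", keyword(*args)
    except Exception as err:
        return "FAIL", str(err)


def wait_until_page_contains(page_text, user):
    # Hypothetical check standing in for the Selenium keyword.
    if user not in page_text:
        raise AssertionError(f"Page did not contain '{user}'")


def delete_existing_users(users, page_text):
    deleted = []
    for user in users:
        status, _value = run_keyword_and_ignore_error(
            wait_until_page_contains, page_text, user
        )
        if status == "PASS":
            # Here the real test would click the "Delete" button.
            deleted.append(user)
        # On 'FAIL' the user is skipped and the loop continues.
    return deleted


users = ["locnx", "huongpt1", "xuanhh", "lynch"]
deleted = delete_existing_users(users, "huongpt1 xuanhh")
print(deleted)
```

A failed check does not abort the loop; it only decides whether the delete step runs for that user.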

Is it possible to inspect all tables in a BigQuery dataset with one dlpJob?

I'm using Google Cloud DLP to inspect sensitive data in BigQuery. I wonder whether it is possible to inspect all tables within a dataset with one dlpJob. If so, how should I set the configs?
I tried to omit the BQ tableId field in the config, but that returns an HTTP 400 error: "table_id must be set". Does it mean that with one dlpJob only one table can be inspected, and to scan multiple tables we need multiple dlpJobs? Or is there a way to scan multiple tables within the same dataset with some regex tricks?
At the moment, one job scans just one table. The team is working on that feature. In the meantime you can create the jobs manually with a rough shell script like the one below, which combines gcloud with REST calls to the DLP API. You could probably do something a lot smoother with Cloud Functions.
Prerequisites:
1. Install gcloud. https://cloud.google.com/sdk/install
2. Run this script with the following arguments:
   1. The project_id to scan bigquery tables of.
   2. The dataset id for the output table to store findings to.
   3. The table id for the output table to store findings to.
   4. A number that represents the percentage of rows to scan.
# Example:
# ./inspect_all_bq_tables.sh dlapi-test findings_daataset
# Reports a status of execution message to the log file and serial port
function report() {
local tag="${1}"
local message="${2}"
local timestamp="$(date +%s)000"
echo "${timestamp} - ${message}"
}
readonly -f report
# report_status_update
#
# Reports a status of execution message to the log file and serial port
function report_status_update() {
report "${MSGTAG_STATUS_UPDATE}" "STATUS=${1}"
}
readonly -f report_status_update
# create_dlp_job
#
# Creates a single dlp job for a given bigquery table.
function create_dlp_job {
local dataset_id="$1"
local table_id="$2"
local create_job_response=$(curl -s -H \
"Authorization: Bearer $(gcloud auth print-access-token)" \
-H "X-Goog-User-Project: $PROJECT_ID" \
-H "Content-Type: application/json" \
"$API_PATH/v2/projects/$PROJECT_ID/dlpJobs" \
--data '
{
"inspectJob":{
"storageConfig":{
"bigQueryOptions":{
"tableReference":{
"projectId":"'$PROJECT_ID'",
"datasetId":"'$dataset_id'",
"tableId":"'$table_id'"
},
"rowsLimitPercent": "'$PERCENTAGE'"
}
},
"inspectConfig":{
"infoTypes":[
{
"name":"ALL_BASIC"
}
],
"includeQuote":true,
"minLikelihood":"LIKELY"
},
"actions":[
{
"saveFindings":{
"outputConfig":{
"table":{
"projectId":"'$PROJECT_ID'",
"datasetId":"'$FINDINGS_DATASET_ID'",
"tableId":"'$FINDINGS_TABLE_ID'"
},
"outputSchema": "BASIC_COLUMNS"
}
}
},
{
"publishFindingsToCloudDataCatalog": {}
}
]
}
}')
if [[ $create_job_response != *"dlpJobs"* ]]; then
report_status_update "Error creating dlp job: $create_job_response"
exit 1
fi
local new_dlpjob_name=$(echo "$create_job_response" \
| head -5 | grep -Po '"name": *\K"[^"]*"' | tr -d '"' | head -1)
report_status_update "DLP New Job: $new_dlpjob_name"
}
readonly -f create_dlp_job
# List the datasets for a given project. Once we have these we can list the
# tables within each one.
function create_jobs() {
# The grep pulls the dataset id. The tr removes the quotation marks.
local list_datasets_response=$(curl -s -H \
"Authorization: Bearer $(gcloud auth print-access-token)" -H \
"Content-Type: application/json" \
"$BIGQUERY_PATH/projects/$PROJECT_ID/datasets")
if [[ $list_datasets_response != *"kind"* ]]; then
report_status_update "Error listing bigquery datasets: $list_datasets_response"
exit 1
fi
local dataset_ids=$(echo $list_datasets_response \
| grep -Po '"datasetId": *\K"[^"]*"' | tr -d '"')
# Each row will look like "datasetId", with the quotation marks
for dataset_id in ${dataset_ids}; do
report_status_update "Looking up tables for dataset $dataset_id"
local list_tables_response=$(curl -s -H \
"Authorization: Bearer $(gcloud auth print-access-token)" -H \
"Content-Type: application/json" \
"$BIGQUERY_PATH/projects/$PROJECT_ID/datasets/$dataset_id/tables")
if [[ $list_tables_response != *"kind"* ]]; then
report_status_update "Error listing bigquery tables: $list_tables_response"
exit 1
fi
local table_ids=$(echo "$list_tables_response" \
| grep -Po '"tableId": *\K"[^"]*"' | tr -d '"')
for table_id in ${table_ids}; do
report_status_update "Creating DLP job to inspect table $table_id"
create_dlp_job "$dataset_id" "$table_id"
done
done
}
readonly -f create_jobs
PROJECT_ID=$1
FINDINGS_DATASET_ID=$2
FINDINGS_TABLE_ID=$3
PERCENTAGE=$4
API_PATH="https://dlp.googleapis.com"
BIGQUERY_PATH="https://www.googleapis.com/bigquery/v2"
# Main
create_jobs
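Splicing shell variables into a quoted JSON string is easy to get wrong. As an alternative sketch, the same inspectJob request body can be built as a Python dictionary and serialized with json.dumps. The function name build_inspect_job and the example argument values are hypothetical; the field names are copied from the script above, and nothing is sent to the API here.

```python
import json


def build_inspect_job(project_id, dataset_id, table_id,
                      findings_dataset_id, findings_table_id, percentage):
    """Return the request body for one DLP job that inspects one table."""
    return {
        "inspectJob": {
            "storageConfig": {
                "bigQueryOptions": {
                    "tableReference": {
                        "projectId": project_id,
                        "datasetId": dataset_id,
                        "tableId": table_id,
                    },
                    "rowsLimitPercent": int(percentage),
                }
            },
            "inspectConfig": {
                "infoTypes": [{"name": "ALL_BASIC"}],
                "includeQuote": True,
                "minLikelihood": "LIKELY",
            },
            "actions": [
                {
                    "saveFindings": {
                        "outputConfig": {
                            "table": {
                                "projectId": project_id,
                                "datasetId": findings_dataset_id,
                                "tableId": findings_table_id,
                            },
                            "outputSchema": "BASIC_COLUMNS",
                        }
                    }
                },
                {"publishFindingsToCloudDataCatalog": {}},
            ],
        }
    }


# One body per table, mirroring the loop in the shell script
# (hypothetical project, dataset, and table names).
tables = ["orders", "customers"]
bodies = [
    build_inspect_job("my-project", "sales", t, "findings_ds", "findings_tbl", 10)
    for t in tables
]
payload = json.dumps(bodies[0])
print(payload)
```

Each payload would be posted to the dlpJobs endpoint in the same way the script's curl call does, one request per table.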

Problem in using shell for loop inside gnu make?

Consider the makefile below:
all:
	@for x in y z; \
	do \
	  for a in b c; \
	  do \
	    echo $$x$$a >> log_$$x; \
	  done; \
	done
Executing this makefile creates two files, log_y and log_z. log_y contains "yb" and "yc"; similarly, log_z contains "zb" and "zc".
What I actually want is to create four files (log_y_b, log_y_c, log_z_b, log_z_c). For this I modified the makefile as follows:
all:
	@for x in y z; \
	do \
	  for a in b c; \
	  do \
	    echo $$x$$a >> log_$$x_$$a; \
	  done; \
	done
But it creates only one file, log_. What do I have to do to create four files?
Perhaps put braces around the variable names: it works on my system.
all:
	@for x in y z; \
	do \
	  for a in b c; \
	  do \
	    echo $$x$$a >> log_$${x}_$${a}; \
	  done; \
	done
You can also use foreach:
all:
	@$(foreach x,y z,$(foreach a,b c,echo $(x)$(a) >> log_$(x)_$(a);))
log_$$x_$$a in the Makefile turns into log_$x_$a for the shell, which is equivalent to log_${x_}${a}. The variable $x_ is undefined, however, so the shell substitutes the empty string for it.
Solution: Properly write the $x variable with curly braces around the name (${variablename}), i.e. for consistency's sake write log_${x}_${a} (or in Makefile style: log_$${x}_$${a}).
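The name-parsing rule can be illustrated outside the shell with Python's string.Template, which also reads $x_ as a variable named x_. This is an analogy rather than the shell itself; the defaultdict is used here only to mimic the shell expanding an unset variable to the empty string.

```python
from collections import defaultdict
from string import Template

# Unset names expand to "", like an unset shell variable.
variables = defaultdict(str, x="y", a="b")

# Without braces the underscore is taken as part of the name: $x_ then $a.
unbraced = Template("log_$x_$a").substitute(variables)

# With braces the names are x and a, and the underscore stays literal.
braced = Template("log_${x}_${a}").substitute(variables)

print(unbraced)
print(braced)
```

Only the braced form keeps both loop values in the file name.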
