Merge strategy OURS has resulted in fast forward merge jgit - jgit

I am trying to pull from a remote branch to my local branch. If there are merge conflicts, I have set the merge strategy to OURS, i.e. it should take only my local changes.
But the merge result shows me "Merge strategy OURS has resulted in fast forward merge" and it is taking the remote changes, contradicting the strategy that was set (OURS). How can this be resolved in JGit?
PullCommand pullCommand = git.pull().setRemoteBranchName(file.getProperty("remoteBranchName"));
System.out.println("repository state: " + pullCommand.getRepository().getRepositoryState());
// Pull the latest changes if the repository is in a safe state
if (RepositoryState.SAFE == pullCommand.getRepository().getRepositoryState()) {
    // Merge strategy to be used if there are merge conflicts
    // OURS - keep the local changes, THEIRS - take the remote changes
    if ("THEIRS".equalsIgnoreCase(file.getProperty("mergeStrategy"))) {
        pullCommand.setStrategy(MergeStrategy.THEIRS);
    } else if ("OURS".equalsIgnoreCase(file.getProperty("mergeStrategy"))) {
        pullCommand.setStrategy(MergeStrategy.OURS);
    }
    // Execute the pull command
    result = pullCommand.call();
    MergeResult result2 = result.getMergeResult();
}
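For what it's worth, JGit only consults the configured MergeStrategy when a real merge is performed; if the local branch has no commits of its own, the pull simply fast-forwards onto the remote head and the strategy is never applied. Below is a minimal sketch of my own (not from the original post) of how a real merge could be forced, assuming a JGit version in which PullCommand exposes setFastForward(MergeCommand.FastForwardMode); MergeStrategy.OURS in JGit then keeps only the local tree in the merge result:
// Hedged sketch: disable fast-forward so the chosen strategy actually runs
PullCommand pull = git.pull()
        .setRemoteBranchName(file.getProperty("remoteBranchName"))
        .setStrategy(MergeStrategy.OURS)                      // keep only the local tree
        .setFastForward(MergeCommand.FastForwardMode.NO_FF);  // never fast-forward
PullResult pullResult = pull.call();
MergeResult mergeResult = pullResult.getMergeResult();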

Related

Emergency unlocking of resources in database

We are having a problem on a live production system.
One of the nodes stopped working properly (because of problems with the network file system on which it is hosted), and that happened while the channel staging process was in progress.
Because of that really bad timing, the staging process remained unfinished and all locked resources stayed locked, which prevented editing products or catalogs on the live system.
The first solution we tried was restarting the servers node by node; that didn't help.
The second solution we tried was executing the SQL statements mentioned in this support article:
https://support.intershop.com/kb/index.php/Display/2350K6
The exact SQL statements we executed are below; the first one deletes from the RESOURCELOCK table:
DELETE FROM RESOURCELOCK rl WHERE rl.LOCKID IN (
    SELECT resourcelock.lockid
    FROM isresource,
         domaininformation resourcedomain,
         process,
         basiccredentials,
         domaininformation userdomain,
         resourcelock,
         isresource_av
    WHERE (isresource.domainid = resourcedomain.domainid)
      AND (isresource.resourceownerid = process.uuid)
      AND (resourcelock.lockid = isresource.uuid)
      AND (process.userid = basiccredentials.basicprofileid(+))
      AND (basiccredentials.domainid = userdomain.domainid(+))
      AND (isresource_av.ownerid(+) = isresource.uuid)
      AND (isresource.resourceownerid IS NOT NULL)
      AND (isresource_av.name(+) = 'locknestinglevel')
      AND (process.name = 'StagingProcess')
);
And another one for the ISRESOURCE table:
UPDATE isresource
SET resourceownerid = null,
    lockexpirationdate = null,
    lockcreationdate = null,
    lockingthreadid = null
WHERE RESOURCEOWNERID = 'QigK85q6scAAAAF9Pf9fHEwf'; -- UUID of StagingProcess
Now this has somewhat helped, as it allowed single products to be staged, but there are still two problems remaining:
Products can't be manually locked on the live system for editing. When the lock icon is clicked, the page refreshes but the product still looks unlocked; however, a record is created in the ISRESOURCE table for each product that is clicked, although these records are incomplete (there is no RESOURCEOWNERID, lock creation date, or lock expiration date).
Also, processes are created for product locking, but they are all failing or running without an end date.
Now for the second problem:
The channel staging cannot be started and it fails with the message:
ERROR - Could not lock resources for process 'StagingProcess': Error finding resource lock with lockid: .0kK_SlyFFUAAAFlhGJujvESnull
That resource is the MARKETING_Promotion resource.
Both problems started occurring after running the above-mentioned SQL statements and they seem to be related. Any advice on how to resolve this situation would be helpful.
The first SQL that I posted shouldn't have been run:
DELETE FROM RESOURCELOCK rl WHERE rl.LOCKID IN....
The fix was to restore the deleted resource locks and just set the lock fields in the ISRESOURCE table to null with the second SQL statement:
UPDATE isresource
SET resourceownerid = null,
    lockexpirationdate = null,
    lockcreationdate = null,
    lockingthreadid = null
WHERE RESOURCEOWNERID = 'QigK85q6scAAAAF9Pf9fHEwf'; -- UUID of StagingProcess

Is it possible to create dynamic jobs with Dagster?

Consider this example - you need to load table1 from source database, do some generic transformations (like convert time zones for timestamped columns) and write resulting data into Snowflake. This is an easy one and can be implemented using 3 dagster ops.
Now, imagine you need to do the same thing but with 100s of tables. How would you do it with dagster? Do you literally need to create 100 jobs/graphs? Or can you create one job, that will be executed 100 times? Can you throttle how many of these jobs will run at the same time?
You have two main options for doing this:
Use a single job with Dynamic Outputs:
With this setup, all of your ETLs would happen in a single job. You would have an initial op that would yield a DynamicOutput for each table name that you wanted to do this process for, and feed that into a set of ops (probably organized into a graph) that would be run on each individual DynamicOutput.
Depending on what executor you're using, it's possible to limit the overall step concurrency (for example, the default multiprocess_executor supports this option).
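Here is a minimal sketch of this first option (my own illustration, not from the original answer; the table names and op bodies are placeholders):
from dagster import DynamicOut, DynamicOutput, job, op

@op(out=DynamicOut())
def emit_table_names():
    # in practice this list could come from config or database introspection
    for name in ["table1", "table2", "table3"]:
        yield DynamicOutput(name, mapping_key=name)

@op
def etl_single_table(table_name: str):
    # extract -> transform -> load for a single table goes here
    pass

@job
def fan_out_etl():
    # each DynamicOutput gets its own copy of the downstream op
    emit_table_names().map(etl_single_table)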
Create a configurable job (I think this is more likely what you want)
from dagster import job, op, graph
import pandas as pd

@op(config_schema={"table_name": str})
def extract_table(context) -> pd.DataFrame:
    table_name = context.op_config["table_name"]
    # do some load...
    return pd.DataFrame()

@op
def transform_table(table: pd.DataFrame) -> pd.DataFrame:
    # do some transform...
    return table

@op(config_schema={"table_name": str})
def load_table(context, table: pd.DataFrame):
    table_name = context.op_config["table_name"]
    # load to snowflake...

@job
def configurable_etl():
    load_table(transform_table(extract_table()))

# this is what the configuration would look like to extract from table
# src_foo and load into table dest_foo
configurable_etl.execute_in_process(
    run_config={
        "ops": {
            "extract_table": {"config": {"table_name": "src_foo"}},
            "load_table": {"config": {"table_name": "dest_foo"}},
        }
    }
)
Here, you create a job that can be pointed at a source table and a destination table by giving the relevant ops a config schema. Depending on those config options (which are provided when you create a run through the run config), your job will operate on different source/destination tables.
The example shows explicitly running this job using the Python APIs, but if you're running it from Dagit, you'll also be able to input the yaml version of this config there. If you want to simplify the config schema (as it's pretty nested as shown), you can always create a Config Mapping to make the interface nicer :)
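For example, a hedged sketch of such a config mapping (my own illustration; the simplified keys src_table and dest_table are made up):
from dagster import config_mapping, graph

@config_mapping(config_schema={"src_table": str, "dest_table": str})
def simplified_config(val):
    # expand the flat config into the nested per-op run config shown above
    return {
        "ops": {
            "extract_table": {"config": {"table_name": val["src_table"]}},
            "load_table": {"config": {"table_name": val["dest_table"]}},
        }
    }

@graph
def etl_graph():
    load_table(transform_table(extract_table()))

configurable_etl_simplified = etl_graph.to_job(config=simplified_config)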
From here, you can limit run concurrency by supplying a unique tag to your job, and using a QueuedRunCoordinator to limit the maximum number of concurrent runs for that tag.
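A sketch of the tagging half of that (again my own illustration; the tag key/value pair is arbitrary, and the matching tag_concurrency_limits rule for the QueuedRunCoordinator lives in the instance's dagster.yaml):
# same job as above, now carrying a tag that a QueuedRunCoordinator rule
# can use to cap the number of concurrent runs
@job(tags={"etl_job": "configurable_etl"})
def configurable_etl():
    load_table(transform_table(extract_table()))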

Invalidate/prevent memoize with plone.memoize.ram

I have a Zope utility with a method that performs network processes.
As the result of the call is valid for a while, I'm using plone.memoize.ram to cache the result.
class MyClass(object):

    @cache(cache_key)
    def do_auth(self, adapter, data):
        # performing expensive network process here
        ...
...and the cache function:
def cache_key(method, utility, data):
    return (time() // (60 * 60))
But I want to prevent the memoization from taking place when the do_auth call returns empty results (or raises network errors).
Looking at the plone.memoize code it seems I need to raise ram.DontCache() exception, but before doing this I need a way to investigate the old cached value.
How can I get the cached data from the cache storage?
I put this together from several pieces of code I wrote.
It's not tested, but it may help you.
You may access the cached data using the ICacheChooser utility.
Calling it requires the dotted name of the function you cached, which in your case would be:
from zope.component import getUtility
from plone.memoize.interfaces import ICacheChooser  # assumed import location

key = '{0}.{1}'.format(__name__, method.__name__)
cache = getUtility(ICacheChooser)(key)
storage = cache.ramcache._getStorage()._data
cached_infos = storage.get(key)
cached_infos should then contain all the information you need.
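Building on that snippet, here is an untested sketch of my own (it relies on the same internals plus the invalidate() method of the underlying zope RAM cache, so treat it as an assumption): once you see that do_auth came back empty, you can drop whatever is stored for that dotted name so the next call performs the network process again.
from zope.component import getUtility
from plone.memoize.interfaces import ICacheChooser

def invalidate_do_auth_cache():
    # dotted name of the cached method, matching the key built above
    key = '{0}.{1}'.format(MyClass.__module__, 'do_auth')
    cache = getUtility(ICacheChooser)(key)
    # clears every entry the RAM cache holds for this dotted name
    cache.ramcache.invalidate(key)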

File locks in R

The short version
How would I go about blocking access to a file until a specific function, which both reads from and writes to that very file, has returned?
The use case
I often want to create some sort of central registry, and there might be more than one R process involved in reading from and writing to that registry (in a kind of "poor man's parallelization" setting where different processes run independently of each other except with respect to the registry access).
I would not like to depend on any DBMS such as SQLite, PostgreSQL, MongoDB etc. early on in the development process. And even though I might later use a DBMS, a filesystem-based solution might still be a handy fallback option. Thus I'm curious how I could realize it with base R functionality, ideally.
I'm aware that having a lot of reads and writes to the file system in a parallel setting is not very efficient compared to DBMS solutions.
I'm running on MS Windows 8.1 (64 Bit)
What I'd like to get a deeper understanding of
What exactly happens when two or more R processes try to write to or read from a file at the same time? Does the OS figure out the "access order" automatically, and does the process that "came in second" wait, or does it trigger an error because the file access is blocked by the first process? How could I prevent the second process from returning with an error and instead have it "just wait" until it's its turn?
Shared workspace of processes
Besides the rredis Package: are there any other options for shared memory on MS Windows?
Illustration
Path to registry file:
path_registry <- file.path(tempdir(), "registry.rdata")
Example function that registers events:
registerEvent <- function(
  id = gsub("-| |:", "", Sys.time()),
  values,
  path_registry
) {
  if (!file.exists(path_registry)) {
    registry <- new.env()
    save(registry, file = path_registry)
  } else {
    load(path_registry)
  }
  message("Simulated additional runtime between reading and writing (5 seconds)")
  Sys.sleep(5)
  if (!exists(id, envir = registry, inherits = FALSE)) {
    assign(id, values, registry)
    save(registry, file = path_registry)
    message(sprintf("Registering with ID %s", id))
    out <- TRUE
  } else {
    message(sprintf("ID %s already registered", id))
    out <- FALSE
  }
  out
}
Example content that is registered:
x <- new.env()
x$a <- TRUE
x$b <- letters[1:5]
Note that the content usually is "nested", i.e. an RDBMS would not really be "useful" anyway, or at least would involve some normalization steps before writing to the DB. That's why I prefer environments (unique variable IDs and pass-by-reference are possible) over lists and, if one does make the step to a true DBMS, I would rather turn to NoSQL approaches such as MongoDB.
Registration cycle:
The actual calls might be spread over different processes, so there is a possibility of concurrent access attempts.
I want to have other processes/calls "wait" until a registerEvent read-write cycle is finished before doing their read-write cycle (without triggering errors).
registerEvent(values=list(x_1=x, x_2=x), path_registry=path_registry)
registerEvent(values=list(x_1=x, x_2=x), path_registry=path_registry)
registerEvent(id="abcd", values=list(x_1=x, x_2=x),
path_registry=path_registry)
registerEvent(id="abcd", values=list(x_1=x, x_2=x),
path_registry=path_registry)
Check registry content:
load(path_registry)
ls(registry)
See the filelock R package, available since 2018. It is cross-platform. I am using it on Windows and have not found a single problem.
Make sure to read the documentation.
?filelock::lock
Although the docs suggest leaving the lock file in place, I have had no problems removing it on function exit in a multi-process environment:
on.exit({filelock::unlock(lock); file.remove(path.lock)})
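To tie this back to the question, here is a rough sketch of how the registry cycle could be wrapped (my own untested illustration; the ".lock" suffix and the wrapper name are arbitrary). filelock::lock() blocks by default until the lock becomes available, so a second process simply waits instead of erroring:
# wrap registerEvent() so the whole read-write cycle holds an exclusive lock
registerEventLocked <- function(..., path_registry) {
  path_lock <- paste0(path_registry, ".lock")
  lck <- filelock::lock(path_lock)  # blocks until any other process releases it
  on.exit({
    filelock::unlock(lck)
    file.remove(path_lock)
  }, add = TRUE)
  registerEvent(..., path_registry = path_registry)
}

registerEventLocked(values = list(x_1 = x, x_2 = x), path_registry = path_registry)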

R and RJDBC: Using dbSendUpdate results in ORA-01000: maximum open cursors exceeded

I have encountered an error when trying to insert thousands of rows with R/RJDBC and the dbSendUpdate command on an Oracle database.
The problem can be reproduced by creating a test table with
CREATE TABLE mytest (ID NUMBER(10) not null);
and then executing the following R script
library(RJDBC)
drv <- JDBC("oracle.jdbc.OracleDriver", "ojdbc-11.1.0.7.0.jar")  # place your JDBC driver here
conn <- dbConnect(drv, "jdbc:oracle:thin:@MYSERVICE", "myuser", "mypasswd")  # place your connection details here
for (i in 1:10000) {
  dbSendUpdate(conn, "INSERT INTO mytest VALUES (?)", i)
}
Searching the Internet provided the information that one should close result cursors, which is obvious (e.g. see java.sql.SQLException: - ORA-01000: maximum open cursors exceeded or Unable to resolve error - java.sql.SQLException: ORA-01000: maximum open cursors exceeded).
But the help file for ??dbSendUpdate claims that no result sets are used at all:
.. that dbSendUpdate is used with DBML queries and thus doesn't return any result set.
Therefore this behavior doesn't make much sense to me :-(
Can anybody help?
Thanks, a lot!
PS: I found something interesting in the RJDBC documentation at http://www.rforge.net/RJDBC/
Note that the life time of a connection, result set, driver etc. is determined by the lifetime of the corresponding R object. Once the R handle goes out of scope (or if removed explicitly by rm) and is garbage-collected in R, the corresponding connection or result set is closed and released. This is important for databases that have limited resources (like Oracle) - you may need to add gc() by hand to force garbage collection if there could be many open objects. The only exception are drivers which stay registered in the JDBC even after the corresponding R object is released as there is currently no way to unload a JDBC driver (in RJDBC).
But again, even inserting gc() within the loop produces the same behavior.
Finally, we realized this is a bug in the RJDBC package. If you are willing to patch RJDBC, you can use the patched sources available at https://github.com/Starfox899/RJDBC (until they are merged into the package).
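As an interim workaround (my own sketch, not from the original thread; the chunk size and the ID column name are assumptions), you can also reduce the number of statements that get opened by sending one multi-row INSERT ALL per chunk instead of one INSERT per row:
# send one Oracle INSERT ALL statement per chunk of 500 values
ids <- 1:10000
chunks <- split(ids, ceiling(seq_along(ids) / 500))
for (chunk in chunks) {
  sql <- paste0(
    "INSERT ALL ",
    paste0("INTO mytest (ID) VALUES (", chunk, ")", collapse = " "),
    " SELECT * FROM dual"
  )
  dbSendUpdate(conn, sql)
  gc()  # encourage release of finished JDBC objects, as the RJDBC docs suggest
}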
