Refer to wasb files from the Oozie command line

I'm trying to debug an error I'm getting while using Oozie on HDInsight and have found a tip suggesting issuing:
oozie validate workflow.xml
from the command-line might help. Unfortunately I don't know how to reference workflow.xml if that file is stored on Azure storage (wasb). I've tried the following:
oozie validate wasb://container#storageaccount.blob.core.windows.net/folder/workflow.xml
oozie validate /folder/workflow.xml
oozie validate folder/workflow.xml
but they all fail with "Error: File does not exists"
How do I refer to a wasb file from the Hadoop command-line?
Thanks
JT

Oozie can't validate a workflow directly from blob storage; it can only validate a file on the local drive.
For example:
oozie validate c:\ooziesample\workflow.xml
Result: valid workflow
This works for me.
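If the workflow lives in wasb, one workaround is to copy it to the local filesystem first and validate the local copy. A minimal sketch, assuming the hadoop client is on your path; the container, storage account, and paths are placeholders (note the @ separator between container and storage account in the wasb URI):
# Copy the workflow definition from Azure blob storage to local disk
hadoop fs -get wasb://container@storageaccount.blob.core.windows.net/folder/workflow.xml /tmp/workflow.xml
# Validate the local copy
oozie validate /tmp/workflow.xml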

Related

How to solve the error `org.rocksdb.RocksDBException` in NebulaGraph Exchange?

The following error message is displayed when the SST is generated in NebulaGraph Exchange:
org.rocksdb.RocksDBException: While open a file for appending: /path/sst/1-xxx.sst: No such file or directory
You need to check the following in NebulaGraph Exchange:
Check whether /path exists. If it does not, or if the path is set incorrectly, create or correct it.
Check whether the current Spark user on each machine has permission to operate on /path. If not, grant the permission (see the sketch below).
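A minimal shell sketch of those two checks; /path and the spark user name are placeholders, and the commands must be run on each machine where Spark executors run:
# Create the SST output directory if it is missing
mkdir -p /path/sst
# Give the Spark user ownership so it can write the generated SST files
# ("spark" is an assumed user name; substitute whoever runs the executors)
chown -R spark:spark /path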

Using R and paws: How to set credentials using profile in config file?

I use SSO and a profile as defined in ~/.aws/config (MacOS) to access AWS services, for instance:
aws s3 ls --profile myprofilename
I would like to access AWS services from within R, using the paws package. To do this, I need to set my credentials in the R code. I want to do this by referencing the profile in the ~/.aws/config file (as opposed to listing access keys in the code), but I haven't been able to figure out how.
I looked at the extensive documentation here, but it doesn't seem to cover my use case.
The best I've been able to come up with is:
x = s3(config = list(credentials = list(profile = "myprofilename")))
x$list_objects()
... which throws an error: "Error in f(): No credentials provided", suggesting that the first line of code above does not connect to my profile as stored in ~/.aws/config.
An alternative is to generate a user/key with programmatic access to your S3 data. Then, assuming that ~/.aws/env contains the values of the generated key:
AWS_ACCESS_KEY_ID=abc
AWS_SECRET_ACCESS_KEY=123
AWS_REGION=us-east-1
insert the following line at the beginning of your R script:
readRenviron("~/.aws/env")
This AWS blog provides details about how to get temporary credentials for programmatic access. If you can get the credentials and set the appropriate environment variables, the code should work fine without the profile name.
Alternatively, you can try the following if you can get temporary credentials using the AWS CLI.
First, check that you can generate temporary credentials:
aws sts assume-role --role-arn <value> --role-session-name <some-meaningful-session-name> --profile myprofilename
If you can execute the above successfully, you can use this method to automate generating credentials before your code runs.
Put the above command in a bash script named get-temp-credentials.sh and have it emit a JSON document containing the temporary credentials, as per the documentation.
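A minimal sketch of get-temp-credentials.sh, assuming jq is installed; the role ARN is a placeholder. The credential_process contract expects a JSON document with Version, AccessKeyId, SecretAccessKey, SessionToken, and Expiration fields:
#!/bin/bash
# Assume the role and reshape the CLI output into the JSON shape
# that credential_process expects
aws sts assume-role \
  --role-arn "arn:aws:iam::123456789012:role/my-role" \
  --role-session-name paws-session \
  --profile myprofilename |
jq '{Version: 1, AccessKeyId: .Credentials.AccessKeyId, SecretAccessKey: .Credentials.SecretAccessKey, SessionToken: .Credentials.SessionToken, Expiration: .Credentials.Expiration}'
Remember to make the script executable (chmod +x get-temp-credentials.sh).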
Add a new profile named programmatic-access to ~/.aws/config:
[profile programmatic-access]
credential_process = "/path/to/get-temp-credentials.sh"
Finally, update the R code to use programmatic-access as the profile name.
If you have AWS CLI credentials set up as a named profile, e.g. in ~/.aws/config:
[profile myprof]
region=eu-west-2
output=json
... and credentials, e.g. in ~/.aws/credentials:
[myprof]
aws_access_key_id = XXX
aws_secret_access_key = xxx
... paws will use these if you add a line to ~/.Renviron:
AWS_PROFILE=myprof
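To check that paws picks the profile up, a quick sketch run from a shell (R reads ~/.Renviron on startup; sts get-caller-identity is a lightweight call that succeeds only with valid credentials):
Rscript -e 'paws::sts()$get_caller_identity()'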

How to copy files from local to hdfs using oozie

I am trying to copy files from my edge node to HDFS using Oozie. Many have suggested setting up passwordless SSH to get this done.
I am unable to log in as the oozie user because it is a service user.
Is there any way to do this other than passwordless SSH?
Thanks in advance.
Other than passwordless SSH, there are two more options:
1. My preferred option: use the JSch Java library to create a Java application that accepts, as an argument, a shell script to be executed. Using JSch, it performs SSH to the configured edge node and executes the shell script there. In JSch you can configure the edge node username and password; use a 'JCEKS' file to store the password.
Then add a Java action in Oozie to run the Java application built with JSch.
2. Use the "/usr/bin/expect" utility to create a shell script that performs SSH to the edge node and then runs the configured shell script (see the sketch below). More details are here: Use expect in bash script to provide password to SSH command

oozie to load file with passwords from HDFS

I have a file with DB connection properties (including passwords) in HDFS that needs to be accessible to all Oozie jobs.
I am looking for a strategy to use this file from Oozie actions in order to connect to the DB.
I wonder whether creating a Hive table to load that file and querying it via an Oozie Hive action is a good strategy to consider.

Where is the Alfresco audit log?

curl -u id:pw "http://localhost:8080/alfresco/service/api/audit/query/testapp?verbose=true&limit=200&forward=false"
Where is the Alfresco audit data queried by the above command actually stored?
I thought it was somewhere in the database but couldn't find it.
As mentioned in http://wiki.alfresco.com/wiki/Auditing_(from_V3.4):
alf_audit_model: Audit configuration files are recorded here.
alf_audit_application: An entry for each logical application. There may be several audit applications defined in a single audit model.
alf_audit_entry: Each call to AuditComponent.recordAuditValues will result in an entry here. There is a reference to a property.
