I tried to run a Docker container application that accesses Cloud Datastore from Cloud Shell, but the access was refused. I suspect that Cloud Shell doesn't have the scope to access Cloud Datastore.
Is it possible to add an appropriate scope to the Cloud Shell instance?
There was a bug in Cloud Shell credential handling where using newer versions of the Python oauth2client package (either directly or indirectly) would fail with an error like:
File "/usr/local/lib/python2.7/dist-packages/oauth2client/contrib/gce.py", line 117, in _retrieve_info
self.service_account_email = info['email']
TypeError: string indices must be integers
This should be fixed in the newer image release. New sessions of Cloud Shell should not have this issue. Here is a working example of using Cloud Datastore API in a container, running in Cloud Shell:
$ cat Dockerfile
FROM python
RUN pip install gcloud
COPY test.py .
CMD ["python", "test.py"]
$ cat test.py
from gcloud import datastore
client = datastore.Client(project='your-project-id-23242')
query = datastore.Query(client, kind='EntityKind')
print(list(query.fetch()))
$ docker build -t test .
... docker output ...
$ docker run -ti test
[]
The example prints out just an empty list because I don't have any entities of "EntityKind" kind in my project's datastore, but you get the idea.
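The gcloud pip package used above has since been superseded by google-cloud-datastore; if you hit install issues with it today, a roughly equivalent sketch with the newer library (the project ID is still a placeholder) would be:
from google.cloud import datastore

# The client picks up credentials from the environment (Cloud Shell / service account)
client = datastore.Client(project='your-project-id-23242')

# Fetch and print all entities of kind 'EntityKind'
query = client.query(kind='EntityKind')
print(list(query.fetch()))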
P.S. I work at Google.
Problem statement: I have Google Cloud Storage with some buckets. I need to import data from those buckets into:
a local Jupyter instance running on my local computer
a Google Colab notebook
a JupyterLab notebook in Vertex AI (and/or AI Platform)
Any reference code covering these cases would be appreciated.
Kind Regards
Local Jupyter instance: First authenticate your local environment using gcloud auth login, then use gsutil to copy the content to the local environment.
# Authenticate with your account
!gcloud auth login --no-browser
# Copy from your bucket to local path (note -r is for recursive call)
!gsutil cp -r gs://BUCKET/DIR_PATH ./TARGET_DIR
Colab: First authenticate your Colab session to get access to the cloud APIs. Then you can use gsutil to copy the content to the local environment.
# Authenticate with your account
from google.colab import auth as google_auth
google_auth.authenticate_user()
# Copy from your bucket to local path (note -r is for recursive call)
!gsutil cp -r gs://BUCKET/DIR_PATH ./TARGET_DIR
JupyterLab notebook in Vertex AI: Your environment is already authenticated. Use gsutil to copy the content to the local environment.
# Copy from your bucket to local path (note -r is for recursive call)
!gsutil cp -r gs://BUCKET/DIR_PATH ./TARGET_DIR
You can also directly access the files in your Google Cloud Storage buckets via Python using the Cloud Storage client libraries. You will need to authenticate your environment first, as mentioned above. For example, to create a new bucket:
# Imports the Google Cloud client library
from google.cloud import storage
# Instantiates a client
storage_client = storage.Client()
# The name for the new bucket
bucket_name = "my-new-bucket"
# Creates the new bucket
bucket = storage_client.create_bucket(bucket_name)
print(f"Bucket {bucket.name} created.")
I am using Google AI Platform, which provides JupyterLab notebooks. I have two notebook instances set up to run R, of which only one now opens. The first notebook will not open regardless of the number of stops and resets I have performed. The notebook overview can be seen in this image, and circled is a difference (it is 'started'):
The only reason I can imagine for this difficulty is that I changed the machine type for the notebook, decreasing the number of CPUs from 4 to 2 and the RAM from 15 GB to 7.5 GB. Now I cannot open it, and the environment field is blank where it should say R 3.6. I would not mind deleting it and starting over if there were not un-backed-up work on it.
What can be done to bring the notebook back to operation and if it cannot be done, how can I download it or extract some key files?
As commented before, there are two ways to inspect the Notebook instance using the Cloud Console:
GCP Console > AI Platform > Notebooks
GCP Console > Compute Engine > VM Instances. The name of the GCE VM Instance will be the same as the Notebook Instance name.
It looks like you were able to connect to your Notebook instance via the SSH button. Additionally, you can use the gcloud command to connect via SSH to any instance that you have permission to access:
gcloud compute ssh --project <PROJECT> --zone <ZONE> <INSTANCE>
After you connect, use the terminal to verify the status of the jupyter service and inspect the service logs by running:
sudo service jupyter status
sudo journalctl -u jupyter.service --no-pager
You can restart the jupyter service to try to recover it:
sudo service jupyter restart
If you want to use other methods or third parties to create an SSH connection to your Notebook instance, you can follow this.
If you were not able to recover your jupyter service, you can copy your files off the VM by clicking the gear icon in the upper right of the SSH-from-the-Browser window and selecting Download file.
As mentioned before, the gsutil cp command allows you to copy data between your local file system and the cloud, within the cloud, and between cloud storage providers. For example, to upload all files from a local directory to a bucket, you can run:
gsutil cp -r dir gs://my-bucket
Use the -r option to copy an entire directory tree.
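For instance, assuming the notebooks live under the default /home/jupyter directory on the instance and you have a bucket to copy them to (the bucket name below is a placeholder), you could run this from the VM's terminal:
# Back up the notebook home directory from the VM to a bucket
gsutil cp -r /home/jupyter gs://my-backup-bucket/notebook-backup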
I am trying to run a single task within a DAG on a GCP Cloud Composer Airflow instance and mark all other tasks in the DAG, both upstream and downstream, as successful. However, the following Airflow command does not seem to work for me on Cloud Composer.
Does anyone know what is wrong with the following gcloud CLI command?
dag_id: "airflow_monitoring"
task_id: "echo1"
execution_date: "2020-07-03"
gcloud composer environments run my-composer --location us-centra1 \
-- "airflow_monitoring" "echo1" "2020-07-03"
Thanks for your help.
If you aim just to correctly compose the above-mentioned gcloud command, triggering the specific DAG, then after fixing some typos and propagating the Airflow CLI sub-command parameters, I got this to work:
gcloud composer environments run my-composer --location=us-central1 \
--project=<project-id> trigger_dag -- airflow_monitoring --run_id=echo1 --exec_date="2020-07-03"
I would also encourage you to check out the full Airflow CLI sub-command list.
In case you expect to get some different functional result, then feel free to expand the initial question, adding more essential content.
I have two Meteor applications running locally (admin and client). The applications run on different ports, 3000 and 3003. I want both apps to use the same DB. Would export MONGO_URL=mongodb://127.0.0.1:3001/meteor be okay? I would also like to know what argument to pass with the meteor command to set this environment variable so both apps use the same DB.
If you are looking for a start script you could do the following:
In the root of your app, create a file called start.sh:
#!/usr/bin/env bash
MONGO_URL=mongodb://127.0.0.1:3001/meteor meteor --port 3000
Then run chmod +x start.sh
You can then start your app just by typing ./start.sh
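To have the second (admin) app share the same database, you could create a similar script for it, assuming it should run on port 3003:
#!/usr/bin/env bash
# Admin app: same MONGO_URL as the client app, different port
MONGO_URL=mongodb://127.0.0.1:3001/meteor meteor --port 3003
Note that mongodb://127.0.0.1:3001 is the development MongoDB started by the app running on port 3000, so that app needs to be running first.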
Environment : Hortonworks Sandbox HDP 2.2.4
Issue: Unable to run the hadoop commands present in the shell scripts as the root user. The Oozie job is triggered as the root user, but when hadoop fs or any MapReduce command is executed, it runs as the yarn user. Since yarn doesn't have access to some parts of the file system, the shell script fails to execute. Let me know what changes I need to make so that the hadoop commands run as the root user.
It is expected behaviour for the yarn user to be used whenever shell actions are invoked in Oozie; only the yarn user has the capability to run shell actions. One thing we can do is grant the yarn user access permissions on the file system.
This is more of a shell script question than an Oozie question. In theory, an Oozie job runs as the user who submits it. In a Kerberos environment, the user is whoever signed in with a keytab/password.
Once the job is running on the Hadoop cluster, you can use "sudo" within your shell script to change which user a command runs as. In your case, you may also want to make sure the "yarn" user is allowed to sudo to the commands you want to execute.
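For example, a minimal sketch of that setup, where the sudoers entry and the hadoop binary path are illustrative and need to be adjusted for your HDP install, might be:
# /etc/sudoers.d/yarn-hadoop -- let the yarn user run hadoop as root without a password (illustrative)
yarn ALL=(root) NOPASSWD: /usr/bin/hadoop
Then, inside the shell script that the Oozie action calls:
# Runs as root because of the sudoers entry above
sudo hadoop fs -ls /path/only/root/can/read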
Add the following property to the workflow:
HADOOP_USER_NAME=${wf:user()}
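For an Oozie shell action, this is typically passed as an env-var element in the workflow definition; a minimal sketch, where the action name and script name are placeholders, could look like:
<action name="shell-node">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>myscript.sh</exec>
        <!-- run the hadoop commands in the script as the submitting user -->
        <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
        <file>myscript.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>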