Schedule Rscript on a server - r

Currently, I am scheduling a daily .bat file from my laptop with Task Scheduler on Windows, and I would like to do it automatically on a server. The situation is:
I read some SQLite databases that I have stored locally.
Once I read them, I do some web scraping based on that information.
I store this new information in the databases mentioned above.
I've read that it is possible to do this with Amazon EC2 (REF: http://www.louisaslett.com/RStudio_AMI/), but it doesn't mention anything related to a local SQLite DB.
Could you give me some recommendations about how to manage this, and which tool would be the best approach (Azure, AWS, Google BigQuery)?

You could build a Cloud Run task to do this, then schedule it using Cloud Scheduler.
https://cloud.google.com/run/docs/triggering/using-scheduler
You can find samples showing how to build a Cloud Run task in the docs:
https://cloud.google.com/run/docs
To write to BigQuery, you can use the BigQuery API client libraries: https://cloud.google.com/bigquery/docs/reference/libraries
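As a rough illustration, here is a minimal sketch of the write-to-BigQuery step in Python, assuming the google-cloud-bigquery client library and a hypothetical project/dataset/table; something like this would live inside the container that Cloud Run executes on the schedule:

from google.cloud import bigquery  # pip install google-cloud-bigquery

def write_rows(rows):
    # On Cloud Run, credentials come from the attached service account.
    client = bigquery.Client()
    table_id = "my-project.scraping.results"  # hypothetical project.dataset.table
    errors = client.insert_rows_json(table_id, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")

write_rows([{"url": "https://example.com/item", "price": 9.99}])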

Related

How to export data from Firebase Firestore emulator to actual Firebase Firestore database

Scenario:
I am working on a POC locally where I am using Firestore as my database. As it is a local setup, I am using the Firestore Emulator. Now my POC is successful and I want to move the local database from the emulator to actual Firestore.
Query:
Is it possible to achieve what I am trying to do?
So far I have not been able to find any relevant content on the internet about this. I did find a couple of examples demonstrating how to export data from Firestore and import it into the local emulator, but I was not able to find the vice-versa option!
Firebase does not provide any sort of tool or service to do this. Your easiest alternative will be to write a program that queries the data out of the emulator and writes it into your cloud-hosted instance. You might find the Firebase Admin SDK helpful for writing to the cloud in a program that you run locally.
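As a sketch of that approach, the following Python snippet copies one collection from the emulator into the cloud project using the google-cloud-firestore client (which the Python Admin SDK wraps underneath), assuming the emulator's default host/port and hypothetical project and collection names:

import os
from google.cloud import firestore  # pip install google-cloud-firestore

# The client reads FIRESTORE_EMULATOR_HOST when it is constructed, so build the
# emulator client first, then drop the variable and build the cloud client.
os.environ["FIRESTORE_EMULATOR_HOST"] = "localhost:8080"  # default emulator port
emulator_db = firestore.Client(project="demo-project")    # hypothetical emulator project id

del os.environ["FIRESTORE_EMULATOR_HOST"]
cloud_db = firestore.Client(project="my-real-project")    # hypothetical cloud project id

# Copy every document from one collection (hypothetical name) into the cloud instance.
for doc in emulator_db.collection("users").stream():
    cloud_db.collection("users").document(doc.id).set(doc.to_dict())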

How can I use Firebase to query an external API and store data in Firestore?

I'm teaching myself to use Flutter and I'm making an app that queries The Movie Database API. Currently, I have the client query the API on launch, but I think this is not the most efficient way of doing it, and I would rather have the client query a backend service like Firebase to get the same data.
I would appreciate some guidance on where to start in order to set up a periodic process that queries the API and uses the results as entries in a Firestore DB. I've looked online, but I might be using suboptimal keywords, since I haven't found a good tutorial or example for this.
Thanks.
You can use Firebase Cloud Functions to build code that runs on Firebase servers to fill your Firebase database, but you can only make HTTP requests to non-Google addresses if you use a paid plan.
https://firebase.googleblog.com/2017/03/how-to-schedule-cron-jobs-with-cloud.html explains how to invoke periodic tasks with Cloud Functions. It uses Google App Engine for that, because Cloud Functions doesn't provide scheduling out of the box.
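Independent of how the job is triggered, the body of the periodic task is small. Below is a rough sketch in Python (not the Cloud Functions code from the linked post), assuming the requests and google-cloud-firestore libraries, a placeholder TMDB API key, and the popular-movies endpoint chosen purely as an example:

import requests                      # pip install requests
from google.cloud import firestore   # pip install google-cloud-firestore

TMDB_URL = "https://api.themoviedb.org/3/movie/popular"  # example endpoint
API_KEY = "your-tmdb-api-key"                            # placeholder

def refresh_movies():
    # Fetch the latest data from the external API.
    resp = requests.get(TMDB_URL, params={"api_key": API_KEY})
    resp.raise_for_status()
    db = firestore.Client()
    # Upsert each movie into a collection that the Flutter client reads instead of TMDB.
    for movie in resp.json().get("results", []):
        db.collection("movies").document(str(movie["id"])).set(movie)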

Need python files stored in Google Database to compile in Google Cloud Engine and return data to an IOS App

My Current Plan:
I'm currently creating an iOS app that will access/change Java/Python files that are stored in Google Cloud Storage. Once confirmed, the app will talk to App Engine, which will have a Compute Engine VM receive the files and compile them. Once compiled, the result is returned to the iOS app.
Is there any better or easier method to achieve this task? Should I use Firebase or Google Cloud Functions? Would that be any help?
Currently, I'm lost on how to design this and have requests sent between so many platforms.
It would also depend on what type of data processing you are doing to the files in Cloud Storage. Ideally you would want to avoid as many "hops" between services as possible. You could do everything via Cloud Functions and listen on GCS triggers. You can think of Cloud Functions as a pseudo App Engine backend to use for quick request handling.
Use Cloud Functions to respond to events from Cloud Storage or Firebase Storage to process files immediately after upload
If you are already using Firebase, it would be better to stay within their ecosystem as much as possible. If you are doing bigger or more intensive data processing you might want to look at different options.
With more information and current pain points, we may be able to offer more insight.
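As a rough sketch of that trigger-based pattern, here is a minimal Python background Cloud Function reacting to a Cloud Storage finalize event, with the actual compile/processing step left as a hypothetical placeholder:

from google.cloud import storage  # pip install google-cloud-storage

def handle_upload(event, context):
    """Runs whenever a new object is finalized in the trigger bucket."""
    bucket_name = event["bucket"]
    file_name = event["name"]
    blob = storage.Client().bucket(bucket_name).blob(file_name)
    source = blob.download_as_text()
    # Hypothetical: hand the source off to whatever compiles/validates it,
    # then write the result somewhere the iOS app can fetch it (e.g. Firestore).
    print(f"Processing {file_name} ({len(source)} chars) from {bucket_name}")

Deployed with a google.storage.object.finalize trigger on the bucket, something like this could remove the App Engine and Compute Engine hops for lightweight processing.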

Post to Azure Cosmos Db from NiFi

I created Azure CosmosDb database and container for my documents.
I use NiFi as a main data ingestion tool and want to feed my container with documents from NiFi flow files.
Can anybody please share a way to post flowfile content to Azure Cosmos Db from NiFi?
Thanks in advance
UPDATE(2019.05.26):
In the end I used a Python script and called it from NiFi to post messages, passing the message as a parameter. The reason I chose Python is that the official Microsoft site has examples with all the required connection settings and libraries, so it was easy to connect to Cosmos.
I tried the Mongo components, but couldn't connect to Azure (the security config didn't work); I didn't go too far with that, as the Python script worked just fine.
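A minimal sketch of that kind of script, assuming the azure-cosmos (v4) client library, the Core (SQL) API, and hypothetical account, database, and container names; NiFi would invoke it (for example via ExecuteStreamCommand) and pass the flow file content as the argument:

import json
import sys
from azure.cosmos import CosmosClient  # pip install azure-cosmos

ENDPOINT = "https://my-account.documents.azure.com:443/"  # placeholder account URI
KEY = "primary-key-from-the-portal"                       # placeholder key

def post_message(message: str):
    client = CosmosClient(ENDPOINT, credential=KEY)
    container = client.get_database_client("mydb").get_container_client("mycontainer")
    # Upsert the flow file content as a JSON document (it must contain the
    # container's partition key field and, ideally, an "id").
    container.upsert_item(json.loads(message))

if __name__ == "__main__":
    post_message(sys.argv[1])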
Azure Cosmos DB exposes a MongoDB API, so you can use the following MongoDB processors, which are available in NiFi, to read/query/write to and from Azure Cosmos DB:
DeleteMongo
GetMongo
PutMongo
PutMongoRecord
RunMongoAggregation
Useful Links
https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb-introduction
https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb-feature-support
Valeria, according to the list of Azure-related components supported by Apache NiFi, you can only get Azure Blob Storage, Queue Storage, Event Hub, etc., not including Cosmos DB.
So I suggest using PutAzureBlobStorage to feed an Azure Blob container with documents from NiFi flow files, then creating a Copy Activity pipeline in Azure Data Factory to transfer the data from Azure Blob Storage into Azure Cosmos DB.

How to establish a connection to DynamoDB using python using boto3

I am a bit new to AWS and DynamoDB.
My aim is to embed a small piece of code.
The problem I am facing is how to make the connection in Python code. I made a connection using the AWS CLI by entering the access ID and key.
But how do I do it in my code, as I wish to deploy my code on other systems?
Thanks in advance !!
First of all, read the documentation for boto3 DynamoDB; it's pretty simple:
http://boto3.readthedocs.io/en/latest/reference/services/dynamodb.html
If you want to provide access keys while connecting to dynamo, you can do the following:
import boto3
client = boto3.client('dynamodb', aws_access_key_id='yyyy', aws_secret_access_key='xxxx', region_name='***')
But remember, it is against security best practices to store such keys in the code.
For the best security, use IAM roles.
The boto3 driver will automatically use the IAM role if it is attached to the instance.
Link to the docs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
Also, if IAM roles are too complicated, you can install the AWS CLI and run aws configure on your server, and boto3 will use the keys from there (less secure than the previous approach).
After implementing one of these options, you can connect to DynamoDB without keys in the code:
client = boto3.client('dynamodb', region_name='***')
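And, as a short follow-up sketch of actually reading and writing items once the connection works, using the higher-level resource interface and a hypothetical table named "users" with partition key "user_id":

import boto3

# Credentials come from the IAM role or the aws configure profile, as described above.
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')  # example region
table = dynamodb.Table('users')  # hypothetical table name

table.put_item(Item={'user_id': '42', 'name': 'Alice'})
response = table.get_item(Key={'user_id': '42'})
print(response.get('Item'))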
