I need to bulk import, say, 100 records into Cosmos DB.
I found dt.exe, but it doesn't help: it throws an error when importing a CSV into Cosmos DB with the Table API.
I'm not able to find any reliable way to automate this process.
The command-line Azure Cosmos DB Data Migration tool (dt.exe) can be
used to import your existing Azure Table storage data to a Table API
GA account, or migrate data from a Table API (preview) account into a
Table API GA account. Other sources are not currently supported. The
UI based Data Migration tool (dtui.exe) is not currently supported for
Table API accounts.
According to the official statement above, other sources (e.g. a CSV file) are not supported for migration into an Azure Table API account. You could adopt a workaround: read the CSV file in a program, then import the data into Azure Table storage.
Please refer to the sample Python code I wrote for this thread.
from azure.cosmosdb.table.tableservice import TableService
from azure.cosmosdb.table.models import Entity, EntityProperty, EdmType
import csv
import sys
import codecs
from datetime import datetime

table_service = TableService(connection_string='***')

# Python 2 only: force UTF-8 as the default encoding
reload(sys)
sys.setdefaultencoding('utf-8')

filename = "E:/jay.csv"
with codecs.open(filename, 'rb', encoding="utf-8") as f_input:
    csv_reader = csv.reader(f_input)
    for row in csv_reader:
        task = Entity()
        task.PartitionKey = row[0]
        task.RowKey = row[1]
        task.description = row[2]
        # typed properties need native values, not strings
        task.priority = EntityProperty(EdmType.INT32, int(row[3]))
        # EdmType.DATETIME needs a datetime object; adjust the format to match your CSV
        task.logtime = EntityProperty(EdmType.DATETIME, datetime.strptime(row[4], '%Y-%m-%d %H:%M:%S'))
        table_service.insert_entity('tasktable', task)
Or you could submit feedback here.
Hope it helps you.
Just a minor update: if you use Python 3, there is no need for reload(sys) or sys.setdefaultencoding('utf-8'); open the file in text mode ('r') and use a raw string for the path, e.g. filename = r"E:/jay.csv".
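In Python 3 only the file handling changes; the loop body stays the same. A minimal sketch:
# Python 3: open in text mode with an explicit encoding; newline='' lets
# the csv module handle line endings itself.
filename = r"E:/jay.csv"
with open(filename, 'r', newline='', encoding='utf-8') as f_input:
    csv_reader = csv.reader(f_input)
    for row in csv_reader:
        ...  # build the Entity and call insert_entity exactly as above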
Related
Does the Google Firestore database service provide a backup?
If so, how do I back up the database, and how do I restore it in case of an error?
Update: it is now possible to back up and restore Firebase Firestore using the Cloud Firestore managed export and import service.
You do it by:
Create a Cloud Storage bucket for your project - make sure it's a Regional bucket in us-central1 or us-central2, or a Multi-Regional bucket
Set up gcloud for your project using gcloud config set project [PROJECT_ID]
EXPORT
Export all by calling
gcloud firestore export gs://[BUCKET_NAME]
Or export a specific collection using
gcloud firestore export gs://[BUCKET_NAME] --collection-ids='[COLLECTION_ID_1]','[COLLECTION_ID_2]'
IMPORT
Import all by calling
gcloud firestore import gs://[BUCKET_NAME]/[EXPORT_PREFIX]/
where [BUCKET_NAME] and [EXPORT_PREFIX] point to the location of your export files. For example: gcloud firestore import gs://exports-bucket/2017-05-25T23:54:39_76544/
Import a specific collection by calling:
gcloud firestore import --collection-ids='[COLLECTION_ID_1]','[COLLECTION_ID_2]' gs://[BUCKET_NAME]/[EXPORT_PREFIX]/
Full instructions are available here:
https://firebase.google.com/docs/firestore/manage-data/export-import
Update July 2018: Cloud Firestore now supports managed import and export of data. See the documentation for more details:
https://firebase.google.com/docs/firestore/manage-data/export-import
[Googler here] No, right now we do not offer a managed backup or import/export service. This is something we will definitely offer in the future; we just did not get it ready for the initial Beta release.
The best way to back up right now is to write your own script using our Java/Python/Node.js/Go server SDKs; it should be fairly straightforward to download all documents from each collection and write them back if you need to.
https://www.npmjs.com/package/firestore-backup is a tool that has been created to do just this.
(I did not create it; I'm just adding it here since people will find this question.)
Local backups
firestore-import-export
This is the one I use for one-off, local backups, and what I generally recommend (most straightforward if you want a single JSON file).
firestore-backup-restore
Drawbacks:
Hasn't been updated in a long time.
Additional options: (not recommended)
python-firebase-admin-firestore-backup
Drawbacks:
Backup only; cannot restore from the backups it creates.
Hasn't been updated in a long time.
firestore-backup
Drawbacks:
Backup only; cannot restore from the backups it creates.
Cloud backups
The official gcloud backup commands.
Drawbacks:
The backup files are difficult/infeasible to parse. (update: how to convert to a json file)
You have to set up the gcloud cli. (update: or use the cloud shell to run the commands)
It doesn't backup locally; instead, it backs up to the cloud, which you can then download. (could also be considered an advantage, depending on what you want)
Note that for the gcloud backup commands, you have multiple options on how to schedule them to run automatically. A few options are shown here.
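For instance, a crontab entry along these lines would run a nightly export (the schedule and the gcloud path are just examples, and it assumes gcloud is installed and already authenticated for your project):
# nightly Firestore export at 03:00; [BUCKET_NAME] as in the commands above
0 3 * * * /usr/bin/gcloud firestore export gs://[BUCKET_NAME]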
I am using the following workaround in order to have daily Firestore backups:
I installed this globally: https://www.npmjs.com/package/firestore-backup-restore
I have a cron job that looks like this:
0 12 * * * cd ~/my/backup/script/folder && ./backup-script.sh
And my backup-script.sh looks like this:
#!/bin/sh
. ~/.bash_profile
# prepend rather than overwrite PATH so system commands keep working
export PATH=/usr/local/bin:$PATH
dt=$(/bin/date '+%d-%m-%Y %H:%M:%S');
echo "starting backup for $dt"
firestore-backup-restore -a ~/path/to/account/credentials/file.json -B ./backups/"$dt"
I've written a tool that traverses the collections/documents of the database and exports everything into a single JSON file. Plus, it will import that same structure as well (helpful for cloning/moving Firestore databases). It's published as an NPM package; feel free to try it and give some feedback.
https://www.npmjs.com/package/node-firestore-import-export
I had the same issue and created a small npm package which allows you to create a scheduled backup with Cloud Functions. It uses the new import/export feature of Firestore.
const functions = require('firebase-functions')
const firestoreBackup = require('simple-firestore-backup')

exports.firestore_backup = functions.pubsub.schedule('every 24 hours').onRun(firestoreBackup.createBackupHandler())
Check out the full README for how to set it up; it's super simple!
A solution using Python 2.
Fork it on https://github.com/RobinManoli/python-firebase-admin-firestore-backup
First install and setup Firebase Admin Python SDK: https://firebase.google.com/docs/admin/setup
Then install it in your Python environment:
pip install firebase-admin
Install the Firestore module:
pip install google-cloud-core
pip install google-cloud-firestore
(from ImportError: Failed to import the Cloud Firestore library for Python)
Python Code
# -*- coding: UTF-8 -*-
import firebase_admin
from firebase_admin import credentials, firestore
import json

cred = credentials.Certificate('xxxxx-adminsdk-xxxxx-xxxxxxx.json') # from firebase project settings
default_app = firebase_admin.initialize_app(cred, {
    'databaseURL': 'https://xxxxx.firebaseio.com'
})

db = firebase_admin.firestore.client()

# add your collections manually
collection_names = ['myFirstCollection', 'mySecondCollection']

collections = dict()
dict4json = dict()
n_documents = 0

for collection in collection_names:
    collections[collection] = db.collection(collection).get()
    dict4json[collection] = {}
    for document in collections[collection]:
        docdict = document.to_dict()
        dict4json[collection][document.id] = docdict
        n_documents += 1

jsonfromdict = json.dumps(dict4json)

path_filename = "/mypath/databases/firestore.json"
print "Downloaded %d collections, %d documents and now writing %d json characters to %s" % ( len(collection_names), n_documents, len(jsonfromdict), path_filename )
with open(path_filename, 'w') as the_file:
    the_file.write(jsonfromdict)
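If you later need to restore from that JSON file, a minimal sketch along these lines should work (it reuses db and path_filename from the script above, and assumes the same collection/document layout):
with open(path_filename) as json_file:
    dict4json = json.load(json_file)

for collection, documents in dict4json.items():
    for doc_id, docdict in documents.items():
        # set() creates the document, or overwrites an existing one with the same id
        db.collection(collection).document(doc_id).set(docdict)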
Here is my Android Java code to easily back up any Firestore data collection.
First, use this method to read the collection data and store it in a serialized file in the device's storage:
private void readCollection() {
    ServerSide.db.collection("Collection_name")
            .get()
            .addOnCompleteListener(new OnCompleteListener<QuerySnapshot>() {
                @Override
                public void onComplete(@NonNull Task<QuerySnapshot> task) {
                    if (task.isSuccessful()) {
                        HashMap alldata = new HashMap();
                        for (QueryDocumentSnapshot document : task.getResult()) {
                            alldata.put(document.getId(), document.getData());
                            // ServerSide.db.collection("A_Sentences_test").document(document.getId())
                            //         .set(document.getData());
                        }
                        try {
                            // serialize the whole map to a private file
                            FileOutputStream fos = openFileOutput("filename.txt", Context.MODE_PRIVATE);
                            ObjectOutputStream os = new ObjectOutputStream(fos);
                            os.writeObject(alldata);
                            os.close();
                            fos.close();
                            Toast.makeText(MainActivity.this, "Stored", Toast.LENGTH_SHORT).show();

                            // read it back immediately to verify the data round-trips
                            FileInputStream fis = openFileInput("filename.txt");
                            ObjectInputStream is = new ObjectInputStream(fis);
                            HashMap ad = (HashMap) is.readObject();
                            is.close();
                            fis.close();
                            Log.w("All data", ad + " ");
                        } catch (Exception e) {
                            Log.w("error", e + "");
                        }
                    } else {
                        Log.d("Collection", "Error getting documents: ", task.getException());
                    }
                }
            });
}
After that, you can check Logcat to verify the data was serialized correctly. Here is the restore code:
private void writeData() {
    try {
        FileInputStream fis = openFileInput("filename.txt");
        ObjectInputStream is = new ObjectInputStream(fis);
        HashMap ad = (HashMap) is.readObject();
        is.close();
        fis.close();
        // write every saved document back to the collection
        for (Object s : ad.keySet()) {
            ServerSide.db.collection("Collection_name").document(s.toString())
                    .set(ad.get(s));
        }
        Log.w("restored", ad + " ");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Hope this helps.
The question is old and the projects are nice, but I have some concerns about the backups:
1 - For free (non-Blaze) plan users, the official solution is off-limits.
2 - Since free users have a 50k read quota per day, that limit could be a problem on live, large databases.
3 - As far as I examined, most of the projects have no time filter or similar, so they download the same data every time they run.
4 - Wouldn't it be better to save collections as folders and every document as a separate file, and fetch only updated documents, replacing the files directly?
I will probably implement my own solution, but I'm just wondering about your thoughts :)
I am fairly new to programming, with a large capacity for self-learning. After taking Harvard's CS50 program, I have found myself unable to use a database in Flask (using Python).
I have created the database with SQLite and have it saved in my work environment, with a copy in external folders.
I would not like to use Flask-SQLAlchemy, as I am 100% comfortable in SQL and would not like to start losing familiarity with basic SQL usage.
I am using Visual Studio and have Flask properly installed; it is already being used to define working routes.
The database is named trial.db, and it can be assumed to have been properly set up.
import os

from cs50 import SQL
from flask import Flask, flash, redirect, render_template, request, session, jsonify
from flask_session import Session
from tempfile import mkdtemp
from werkzeug.exceptions import default_exceptions, HTTPException, InternalServerError
from werkzeug.security import check_password_hash, generate_password_hash

from helpers import apology, login_required, lookup, usd

app = Flask(__name__)
app.config["TEMPLATES_AUTO_RELOAD"] = True

@app.after_request
def after_request(response):
    response.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"
    response.headers["Expires"] = 0
    response.headers["Pragma"] = "no-cache"
    return response

app.jinja_env.filters["usd"] = usd

app.config["SESSION_FILE_DIR"] = mkdtemp()
app.config["SESSION_PERMANENT"] = False
app.config["SESSION_TYPE"] = "filesystem"
Session(app)

db = SQL("sqlite:///finance.db")
The above code is what I am used to calling, based on CS50's library, which is excessively generous.
SQL, as above, is used like so:
cs50.SQL(url)
Parameters
url – a str that indicates database dialect and connection arguments
Returns
a cs50.SQL object that represents a connection to a database
Example usage:
db = cs50.SQL("sqlite:///file.db") # For SQLite, file.db must exist
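Queries then go through the object's execute method, for example (the table and column names here are purely illustrative):
rows = db.execute("SELECT * FROM users WHERE username = ?", username)
db.execute("INSERT INTO users (username, hash) VALUES (?, ?)", username, pw_hash)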
Please help, and thank you!
I am working on an Ionic 3 project with TypeScript, integrating Firebase into my app.
The code below is what I used to integrate Firebase with the Ionic project:
constructor(angFire: AngularFireDatabase){
}
books: FirebaseListObservable<any>;
To send data from my app to Firebase, I used the push method, and to update entries I used update($key). Now I have all the data in the Firebase backend.
Now, how can I sync the Firebase database with Google Sheets, so that each and every entry added to the Firebase backend also gets written to the sheet? I used a third-party service, Zapier, for this integration, but it would be nice to learn how to do this sync on my own.
Searching around, there are many tutorials for getting data from Google Sheets into Firebase, but I didn't come across any tutorials for the reverse.
I followed the tutorial below, but it doesn't cover spreadsheets:
https://sites.google.com/site/scriptsexamples/new-connectors-to-google-services/firebase
Any help would be greatly appreciated!
I looked into importing Firebase right into Google Scripts, either through the JavaScript SDK or the REST API. Both have requirements/steps that Google Scripts cannot satisfy, or that are extremely difficult to satisfy:
There is no foreseeable method of downloading the JavaScript SDK inside a Google Script because almost every method requires a DOM, which you don't have with a Google Sheet.
The REST API requires GoogleCredentials which, at a glance, appear very difficult to get inside Google Scripts as well.
So, the other option is to interact with Firebase in a true server-side environment. This would be a lot of code, but here are the steps that I would take:
1) Set up a Pyrebase project so you can interact with your Firebase project via Python.
import pyrebase

config = {
    "apiKey": "apiKey",
    "authDomain": "projectId.firebaseapp.com",
    "databaseURL": "https://databaseName.firebaseio.com",
    "storageBucket": "projectId.appspot.com",
    "serviceAccount": "path/to/serviceAccountCredentials.json"
}

firebase = pyrebase.initialize_app(config)
...
db = firebase.database()
all_users = db.child("users").get()
2) Set up a Google Scripts/Sheets project as a class that can interact with your Google Sheet.
from __future__ import print_function
import httplib2
import os
import time

from apiclient import discovery
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/sheets.googleapis.com-python-quickstart.json
# (write access needs the full scope, not the .readonly one from the quickstart)
SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google Sheets API Python Quickstart'

class GoogleSheets:
    ...
    # The rest of the functions from that link can go here
    ...

    def write(self, sheet, sheet_name, row, col):
        """
        Write data to the specified google sheet
        """
        if sheet is None or sheet == "":
            print("Sheet not specified.")
            return

        day = time.strftime("%m/%d/%Y")
        clock = time.strftime("%H:%M:%S")
        datetime = day + " - " + clock
        values = [[datetime]]

        spreadsheetId = sheet
        rangeName = sheet_name + "!" + str(row) + ":" + str(col)
        body = {
            'values': values
        }

        credentials = self.get_credentials()
        http = credentials.authorize(httplib2.Http())
        discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?'
                        'version=v4')
        service = discovery.build('sheets', 'v4', http=http,
                                  discoveryServiceUrl=discoveryUrl)
        result = service.spreadsheets().values().update(
            spreadsheetId=spreadsheetId, range=rangeName,
            valueInputOption="RAW", body=body).execute()
3) Call the Google Sheets class somewhere inside your Pyrebase project:
from GoogleSheets import GoogleSheets
...
g = GoogleSheets()
g.write(<project-id>, <sheet-name>, <row>, <col>)
...
4) Set up a cron job to run the Python script every so often:
# every 2 minutes
*/2 * * * * /root/my_projects/file_example.py
You will need a basic server (Heroku, DigitalOcean) to run this.
This is not exhaustive, since there is a lot of code to be written, but it should get the basics done. Makes me want to make a package now.
You can go for Zapier, a third-party service through which you can easily integrate Firebase and Google Spreadsheets, and vice versa. It also has some support for Google Docs and other features.
https://zapier.com/zapbook/firebase/google-sheets/
Firebase can't be used as a trigger in Zapier, only as an action, so you can't send data from it to Google Sheets.
This might be a dumb question, but I'm kind of confused about how SQLAlchemy works with the actual database being used by my Flask application. I have a Python file, models.py, that defines a SQLAlchemy database schema, and then I have this part of my code that creates the database for it:
if __name__ == '__main__':
    from datetime import timedelta
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    engine = create_engine('sqlite://', echo=True)
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    # Add a sample user
    user = User(name='Philip House', password="test")
    session.add(user)
    session.commit()
I run that file and it works fine, but now I'm confused as to what happens with the database. How can I access it in another application? I've also heard that it might just be in memory; if that is the case, how do I make it a permanent database file I can use with my application?
Also, in my application, this is how I refer to my SQLite database in the config file:
PWD = os.path.abspath(os.curdir)
DEBUG=True
SQLALCHEMY_DATABASE_URI = 'sqlite:///{}/arkaios.db'.format(PWD)
I don't know if that might be of any help.
Thanks!!
Here are the docs for connecting SQLAlchemy to SQLite.
As you guessed, you are in fact creating a SQLite database in memory when you use sqlite:// as your connection string. If you were to use 'sqlite:///{}/arkaios.db'.format(PWD) you would create a new database file in your current directory. If that is what you intend, so that you can access the database from other applications, then you should import your connection string from your configuration file and use it instead of sqlite://.
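For example, a minimal sketch (it assumes Base and User come from your models.py):
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from models import Base, User  # assumes models.py defines these

PWD = os.path.abspath(os.curdir)
engine = create_engine('sqlite:///{}/arkaios.db'.format(PWD), echo=True)

Base.metadata.create_all(engine)  # creates arkaios.db on disk if it doesn't exist
Session = sessionmaker(bind=engine)
session = Session()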