Currently using the google cloud vision api for pulling text from images of documents.
Current situation - the API works great, and returns tons of data including the bounding boxes of where the words are located.
Desired outcome - to query only the words pulled from the image and not all the metadata about the bounding boxes and vertices of the words (that metadata is roughly 99% of the response and comes to about 250 KB, which is a huge waste when all I want are the words).
const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient();

// Performs document text detection on the image file
client
  .documentTextDetection('../assets/images_to_ocr/IMG_0942-min.jpg')
  .then(results => {
    console.log('results:', results);
  })
  .catch(err => {
    console.error('ERROR:', err);
  });
For now, the Google Cloud Vision client library for Node.js does not have an option for requesting partial responses like the one you are asking for.
Anyway, if you just want to show the text and not any of the other metadata, you can filter the response like this:
const fullTextAnnotation = results[0].fullTextAnnotation;
console.log(`Full text: ${fullTextAnnotation.text}`);
You will get the full response in fullTextAnnotation; from there, fullTextAnnotation.text gives you only the text, with '\n' characters separating the text blocks and no metadata.
In case you are interested in using something other than Node.js, the Java client library has the setFields() method for the Annotate class, and from the API Explorer you can apply a partial fields mask to see the effect.
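If you are willing to call the REST endpoint directly instead of going through the client library, the standard fields query parameter lets the server strip the response down before sending it. The following is only a sketch of that idea (node-fetch, the API key placeholder and the base64 image are my own assumptions, not part of the client library):
// Sketch: POST to the Vision REST API with a fields mask so only the plain text comes back.
const fetch = require('node-fetch');

const url = 'https://vision.googleapis.com/v1/images:annotate'
  + '?fields=responses.fullTextAnnotation.text'
  + '&key=YOUR_API_KEY'; // placeholder key

const body = {
  requests: [{
    image: { content: '<BASE64_ENCODED_IMAGE>' },
    features: [{ type: 'DOCUMENT_TEXT_DETECTION' }]
  }]
};

fetch(url, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(body)
})
  .then(res => res.json())
  .then(json => console.log(json.responses[0].fullTextAnnotation.text))
  .catch(err => console.error('ERROR:', err));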
Background
I'm trying to upload images to Firebase Storage manually (using the upload file button on the web page), but I have no clue how to later link them to a Firestore document. What I have come up with (I'm unsure if it works) is copying the URL of the image in the storage bucket and adding it to a string field in the document called profilePicture. The reason I'm unable to get this to work is that I'm really new to React Native and I don't know how to properly require the images other than typing in a specific local path. Note also that the way I'm getting user data such as the profile name is that, after logging in with email/password auth, I pass the data as a param to React Navigation and read it as extraData.
What I have tried
Once I've copied the image url and pasted it in the firestore document I'm doing this:
const profilePicture = props.extraData.profilePicture;
<Image source={require({profilePicture})}/>
I have also tried using backticks but that isn't working either. The error message I'm getting is:
TransformError src\screens\Profile\ProfileScreen.js: src\screens\Profile\ProfileScreen.js:Invalid call at line 27: require({
profilePicture: profilePicture
})
Note: this is an expo managed project.
Question
Is the problem in the code or in the way I'm linking both images? Maybe both? Should I require the document rather than relying on the data passed previously?
Thanks a lot in advance!
Edit 1:
I'm trying to get all the info for the currently signed-in user. After a little research I've learned about loading images in this manner:
const ref = firebase.storage().ref('path/to/image.jpg');
const url = await ref.getDownloadURL();
and then I'd require the image as in <Image source={{uri: url}}/>
I get that this could be useful for something static, but I don't get how to update the ref for every single different user.
Edit 2:
I tried using the method mentioned in Edit 1, just to see what would happen; however, it doesn't seem to work, the image just does not show up.
Maybe it's because my component is a function component rather than a class component?
I understand that your goal is to generate, for each image that is uploaded to Cloud Storage, a Firestore document which contains a download URL.
If this is correct, one way is to use a Cloud Function that is triggered each time a new file is added to Cloud Storage. The following Cloud Function code does exactly that. You may adapt it to your exact requirements.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.generateFileURL = functions.storage.object().onFinalize(async object => {
  try {
    const bucket = admin.storage().bucket(object.bucket);
    const file = bucket.file(object.name);
    // You can check here that the file is an image
    const signedURLconfig = { action: 'read', expires: '08-12-2025' }; // Adapt as needed
    const signedURLArray = await file.getSignedUrl(signedURLconfig);
    const url = signedURLArray[0];
    await admin.firestore().collection('profilePictures').add({ fileName: object.name, signedURL: url }); // Adapt the fields list as desired
    return null;
  } catch (error) {
    console.log(error);
    return null;
  }
});
More info on the getSignedUrl() method of the Admin SDK here.
Also note that you could assign the Firestore document ID yourself, instead of having Firestore generate it as shown in the above code (with the add() method). For example, you can add the user's uid to the image metadata and, in the Cloud Function, read this value and use it as the document ID.
Another possibility is to name the profile image with the user's uid.
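To then load the picture for whichever user is signed in, you can build the Storage ref from that user's uid on the client. The component below is only a sketch (the profilePictures/<uid>.jpg path convention and the component itself are my own assumptions; adapt the imports to your Firebase SDK version):
// Sketch: fetch the signed-in user's picture by uid and render it with a remote URI.
import React, { useEffect, useState } from 'react';
import { Image } from 'react-native';
import firebase from 'firebase/app';
import 'firebase/storage';

export default function ProfilePicture({ uid }) {
  const [url, setUrl] = useState(null);

  useEffect(() => {
    firebase
      .storage()
      .ref(`profilePictures/${uid}.jpg`) // assumed path convention
      .getDownloadURL()
      .then(setUrl)
      .catch(err => console.warn('Could not load profile picture', err));
  }, [uid]);

  // Remote images use the { uri } form, not require()
  return url ? <Image source={{ uri: url }} style={{ width: 96, height: 96 }} /> : null;
}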
My application uses keywords extensively; everything is tagged with keywords, so whenever a user wants to search data or add data I have to show keywords in an autocomplete box.
As of now I am storing keywords in another collection as below
export interface IKeyword {
Id:string;
Name:string;
CreatedBy:IUserMin;
CreatedOn:firestore.Timestamp;
}
export interface IUserMin {
UserId:string;
DisplayName:string;
}
export interface IKeywordMin {
Id:string;
Name:string;
}
My main document holds an array of keywords:
export interface MainDocument{
Field1:string;
Field2:string;
........
other fields
........
Keywords:IKeywordMin[];
}
But the problem is that autocomplete reads data frequently, so my document read quota increases very fast.
Is there a way to implement this without increasing reads for keywords? The keywords are not the real data we need to fetch.
Below is my query to get main documents
query = query.where("Keywords", "array-contains-any", keywords)
I use the query below to get keywords in the autocomplete text box:
query = query.orderBy("Name").startAt(searchTerm).endAt(searchTerm+ '\uf8ff').limit(20)
This query runs many times while the user types in the autocomplete box, which causes more document reads.
Does this answer your question?
https://fireship.io/lessons/typeahead-autocomplete-with-firestore/
Though the recommended solution is to use a third-party tool:
https://firebase.google.com/docs/firestore/solutions/search
To reduce document reads:
A solution that comes to mind, though I'm not sure it suits your use case, is Firestore's caching feature. By default, the Firestore client always tries to reach the server to get the latest changes to your documents, and only falls back to the cached data on the client device when it cannot reach the server. You can take advantage of this feature by reading from the cache first and reaching the server only when you want. For web applications this feature is disabled by default; you can enable it as described in
https://firebase.google.com/docs/firestore/manage-data/enable-offline
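A rough sketch of that idea, assuming the v8 web SDK is already initialized and the keywords live in a collection named keywords (that collection name is my assumption):
// Enable offline persistence once at startup; failures (multiple tabs, unsupported browser) are non-fatal.
firebase.firestore().enablePersistence()
  .catch(err => console.warn('Persistence not available:', err.code));

// Serve autocomplete from the local cache first; cache hits are not billed document reads.
async function getKeywordsCacheFirst(searchTerm) {
  const query = firebase.firestore()
    .collection('keywords')
    .orderBy('Name')
    .startAt(searchTerm)
    .endAt(searchTerm + '\uf8ff')
    .limit(20);

  try {
    const cached = await query.get({ source: 'cache' });
    if (!cached.empty) return cached.docs.map(d => d.data());
  } catch (e) {
    // Cache miss or persistence unavailable; fall through to the server.
  }
  return (await query.get()).docs.map(d => d.data());
}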
I found a solution; thought I would share it here.
Create a new collection named typeaheads in the format below:
export interface ITypeAHead {
Prefix:string;
CollectionName:string;
FieldName:string;
MatchingValues:ILookupItem[]
}
export interface ILookupItem {
Key:string;
Value:string;
}
Depending on the minimum number of letters, add either 2 or 3 letters to Prefix, and search based on the prefix, collection and field. That way you will most probably end up with only 2 or 3 document reads per search; see the sketch below.
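A minimal sketch of such a lookup (field names taken from the interfaces above; the 3-letter prefix, the collection/field values and the web SDK usage are my assumptions):
// One document read per typed prefix; the rest of the filtering happens client-side.
async function suggestKeywords(searchTerm) {
  const prefix = searchTerm.substring(0, 3).toLowerCase(); // assumed 3-letter prefixes
  const snap = await firebase.firestore()
    .collection('typeaheads')
    .where('Prefix', '==', prefix)
    .where('CollectionName', '==', 'MainDocument')
    .where('FieldName', '==', 'Keywords')
    .limit(1)
    .get();

  if (snap.empty) return [];
  const items = snap.docs[0].data().MatchingValues || [];
  // Narrow the suggestions down locally, with no extra reads.
  return items.filter(i => i.Value.toLowerCase().startsWith(searchTerm.toLowerCase()));
}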
Hope this helps someone else.
I am using firebase for data storage. The data structure is like this:
products: {
  product1: {
    name: "chocolate"
  },
  product2: {
    name: "chochocho"
  }
}
I want to perform an autocomplete operation on this data, and normally I would write the query like this:
"select name from PRODUCTS where productname LIKE '%" + keyword + "%'";
So, for my situation, for example, if the user types "cho", I need to return both "chocolate" and "chochocho" as results. I thought about pulling all the data under the "products" block and then doing the query on the client, but this may need a lot of memory for a big database. So, how can I perform an SQL LIKE operation?
Thanks
Update: With the release of Cloud Functions for Firebase, there's another elegant way to do this as well by linking Firebase to Algolia via Functions. The tradeoff here is that the Functions/Algolia is pretty much zero maintenance, but probably at increased cost over roll-your-own in Node.
There are no content searches in Firebase at present. Many of the more common search scenarios, such as searching by attribute will be baked into Firebase as the API continues to expand.
In the meantime, it's certainly possible to grow your own. However, searching is a vast topic (think creating a real-time data store vast), greatly underestimated, and a critical feature of your application--not one you want to ad hoc or even depend on someone like Firebase to provide on your behalf. So it's typically simpler to employ a scalable third party tool to handle indexing, searching, tag/pattern matching, fuzzy logic, weighted rankings, et al.
The Firebase blog features a blog post on indexing with ElasticSearch which outlines a straightforward approach to integrating a quick, but extremely powerful, search engine into your Firebase backend.
Essentially, it's done in two steps. Monitor the data and index it:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient');

// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });

// listen for changes to Firebase data
var fb = new Firebase('<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);

function createOrUpdateIndex(snap) {
  client.index(this.index, this.type, snap.val(), snap.name())
    .on('data', function(data) { console.log('indexed ', snap.name()); })
    .on('error', function(err) { /* handle errors */ });
}

function removeIndex(snap) {
  client.deleteDocument(this.index, this.type, snap.name(), function(error, data) {
    if (error) console.error('failed to delete', snap.name(), error);
    else console.log('deleted', snap.name());
  });
}
Query the index when you want to do a search:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
ejs.client = ejs.jQueryClient('http://localhost:9200');
client.search({
index: 'firebase',
type: 'widget',
body: ejs.Request().query(ejs.MatchQuery('title', 'foo'))
}, function (error, response) {
// handle response
});
</script>
There's an example, and a third party lib to simplify integration, here.
I believe you can do:
admin
  .database()
  .ref('/vals')
  .orderByChild('name')
  .startAt('cho')
  .endAt('cho\uf8ff')
  .once('value')
  .then(c => res.send(c.val()));
This will find vals whose names start with cho.
source
The Elasticsearch solution basically binds to add, set and del, and offers a get by which you can accomplish text searches.
It then saves the contents in MongoDB.
While I love and recommend Elasticsearch for the maturity of the project, the same can be done without another server, using only the Firebase database.
That's what I mean:
(https://github.com/metaschema/oxyzen)
For the indexing part, the function basically:
1. JSON stringifies a document
2. removes all the property names and JSON punctuation to leave only the data (regex)
3. removes all XML tags (therefore also HTML) and attributes (remember the old guidance, "data should not be in xml attributes") to leave only the pure text, if XML or HTML was present
4. removes all special chars and substitutes them with a space (regex)
5. substitutes all instances of multiple spaces with one space (regex)
6. splits on spaces and cycles: for each word it adds refs to the document in some index structure in your db that basically contains children named with words, with children named with an escaped version of "ref/inthedatabase/dockey"
7. then inserts the document as a normal Firebase application would do
In the oxyzen implementation, subsequent updates of the document actually read the index and update it, removing the words that don't match anymore and adding the new ones.
Subsequent searches for words can directly find documents under the words child. Multiple-word searches are implemented using hits. A rough sketch of the resulting structure is shown below.
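Purely as an illustration, such a word index might be laid out like this in the database (the names and escaping are my own sketch, not oxyzen's actual schema):
// Illustrative layout: one child per word, each holding escaped refs to the documents containing it.
var exampleData = {
  index: {
    words: {
      chocolate: { 'products%2Fproduct1': true },
      chochocho: { 'products%2Fproduct2': true }
    }
  },
  products: {
    product1: { name: 'chocolate' },
    product2: { name: 'chochocho' }
  }
};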
SQL"LIKE" operation on firebase is possible
let node = await db.ref('yourPath').orderByChild('yourKey').startAt('!').endAt('SUBSTRING\uf8ff').once('value');
This query works for me; it behaves like a prefix match in MySQL, as in the statement below:
select * from StoreAds where University LIKE 'ps%';
query = database.getReference().child("StoreAds").orderByChild("University").startAt("ps").endAt("ps\uf8ff");
I have a need to create a copy of a Google Doc with a specific ID - not the "friendly" name like MyDocument, but the name that makes it unique in the GoogleSphere - the one like 1x_tfTiA9-b5UwAf3k2fg6y6hyZSYQIvhSNn-saaDs4c.
Here's the scenario why I would like to do this:
I have a newsletter which is in the form of a Google Doc. The newsletter is published on a website by embedding the document in a web page inside an <iframe> element. Also published in the same way is a "large print" version of the newsletter that is the same, apart from the fact that the default font size is 24pt, rather than 11pt.
I am trying to automate the production of the large print version, but in such a way that the unique ID of the large print document doesn't change, so that the embedded <iframe> for it still works.
I have experimented in the past with Google Apps Scripts routines for creating a deep copy of a document but the deep copy functions don't play nicely with images and tables, so I could never get a complete copy. If I could implement a "Save As" function, where the operand was an existing unique ID, I think this would do what I want.
Anyone know how I might do this?
I delved into this, attempting to set the id of the "large print" version of the file in a variety of ways:
via copy(): var copiedFile = Drive.Files.copy(lpFile, spFile.id, options);
which yields the error:
Generated IDs are not currently supported for copy requests
via insert(): var newFile = Drive.Files.insert(lpFile, doc.getBlob(), options);
which yields the error:
Generated IDs are not supported for Google Docs formats
via update(): Drive.Files.update(lpFile, lpFile.id, doc.getBlob(), options);
This method successfully updates the "large print" file from the small print file. This particular line, however, uses the Document#getBlob() method, which has issues with formatting and rich content from the Document. In particular, as you mention, images and tables are not preserved (among other things, like changes to the font, etc.). Compare pre with post.
It seems that - if the appropriate method of exporting formatted byte content from the document can be found - the update() method has the most promise. Note that the update() method in the Apps Script client library requires a Blob input (i.e. doc.getBlob().getBytes() will not work), so the fundamental limitation may be the (lack of) support for rich format information in the produced Blob data. With this in mind, I tried a couple methods for obtaining "formatted" Blob data from the "small print" file:
via Document#getAs(mimetype): Drive.Files.export(lpFile, lpFile.id, doc.getAs(<type>), options);
which fails for seemingly sensible types with the errors:
MimeType.GOOGLE_DOCS: We're sorry, a server error occurred. Please wait a bit and try again.
MimeType.MICROSOFT_WORD: Converting from application/vnd.google-apps.document to application/vnd.openxmlformats-officedocument.wordprocessingml.document is not supported.
These errors do make sense, since the internal Google Docs MimeType is not exportable (you can't "download as" this filetype since the data is kept however Google wants to keep it), and the documentation for Document#getAs(mimeType) indicates that only PDF export is supported by the Document Service. Indeed, attempting to coerce the Blob from doc.getBlob() with getAs(mimeType) fails, with the error:
Converting from application/pdf to application/vnd.openxmlformats-officedocument.wordprocessingml.document is not supported.
using DriveApp to get the Blob, rather than the Document Service:
Drive.Files.update(lpFile, lpFile.id, DriveApp.getFileById(smallPrintId).getBlob(), options);
This has the same issues as doc.getBlob(), and likely uses the same internal methods.
using DriveApp#getAs has the same errors as Document#getAs
Considering the limitation of the native Apps Script implementations, I then used the advanced service to obtain the Blob data. This is a bit trickier, since the File resource returned is not actually the file, but metadata about the file. Obtaining the Blob with the REST API requires exporting the file to a desired MimeType. We know from above that the PDF-formatted Blob fails to be properly imported, since that is the format used by the above attempts. We also know that the Google Docs format is not exportable, so the only one left is MS Word's .docx.
var blob = getBlobViaURL_(smallPrintId, MimeType.MICROSOFT_WORD);
Drive.Files.update(lpFile, lpFile.id, blob, options);
where getBlobViaURL_ implements the workaround from this SO question for the (still-broken) Drive.Files.export() Apps Script method.
This method successfully updates the existing "large print" file with the exact content from the "small print" file - at least for my test document. Given that it involves downloading content instead of using the internal, already-present data available to the export methods, it will likely fail for larger files.
Testing Script:
function copyContentFromAtoB() {
  var smallPrintId = "some id";
  var largePrintId = "some other id";

  // You must first enable the Drive "Advanced Service" before this will work.
  // Get the file metadata of the to-be-updated file.
  var lpFile = Drive.Files.get(largePrintId);

  // View available options on the relevant Drive REST API pages.
  var options = {
    updateViewedDate: false,
  };

  // Ideally this would use Drive.Files.export, but there is a bug in the Apps Script
  // client library's implementation: https://issuetracker.google.com/issues/36765129
  var blob = getBlobViaURL_(smallPrintId, MimeType.MICROSOFT_WORD);

  // Replace the contents of the large print version with that of the small print version.
  Drive.Files.update(lpFile, lpFile.id, blob, options);
}

// Below function derived from https://stackoverflow.com/a/42925916/9337071
function getBlobViaURL_(id, mimeType) {
  var url = "https://www.googleapis.com/drive/v2/files/" + id + "/export?mimeType=" + mimeType;
  var resp = UrlFetchApp.fetch(url, {
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() }
  });
  return resp.getBlob();
}
Is it possible to use the JavaScript API without depending entirely on the browser DOM to render the map? Since React Native uses View, I feel it should be possible to use the API somehow. The approach of making the API available by attaching it to the global window might also be possible using fetch, but I don't know how to initiate the callback. If anyone has an idea how this could work, please share some thoughts.
<script async defer
src="https://maps.googleapis.com/maps/api/js?key=YOUR_API_KEY&callback=initMap">
</script>
The googlemaps node module @sunilp mentioned only gives you static maps, which are easy to build without relying on a lib, just search for google maps api static maps (I don't have enough reputation to post more than two links).
For Android you have this:
https://github.com/teamrota/react-native-gmaps
For iOS there's this, but it cautions it's a work in progress (it also hasn't been updated in months): https://github.com/peterprokop/ReactNativeGoogleMaps
Looking around, it seems your best (supported) bet for iOS today is to use React Native's own MapView, if you search their docs. Then you can make fetch calls to the Google Maps API services for the data and populate MapView through its interface. Make sure you pass in your key and use https in your URL, otherwise the service will deny your call:
fetch('https://maps.googleapis.com/maps/api/directions/json?origin=41.13694,-73.359778&destination=41.13546,-73.35997&mode=driving&sensor=true&key=[YOUR_KEY]')
  .then((response) => response.text())
  .then((responseText) => {
    console.log(responseText);
  })
  .catch((error) => {
    console.warn(error);
  });
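For completeness, here is a rough sketch of wiring the parsed response into React Native's built-in MapView (the MapView props and annotation shape are assumptions based on its documentation at the time, so treat this as illustrative only):
// Illustrative sketch: fetch directions, then show the start and end points as map annotations.
import React from 'react';
import { MapView } from 'react-native';

class DirectionsMap extends React.Component {
  state = { annotations: [] };

  componentDidMount() {
    fetch('https://maps.googleapis.com/maps/api/directions/json?origin=41.13694,-73.359778&destination=41.13546,-73.35997&mode=driving&key=[YOUR_KEY]')
      .then((response) => response.json())
      .then((json) => {
        const leg = json.routes[0].legs[0];
        this.setState({
          annotations: [
            { latitude: leg.start_location.lat, longitude: leg.start_location.lng, title: 'Start' },
            { latitude: leg.end_location.lat, longitude: leg.end_location.lng, title: 'End' },
          ],
        });
      })
      .catch((error) => console.warn(error));
  }

  render() {
    return (
      <MapView
        style={{ flex: 1 }}
        region={{ latitude: 41.1362, longitude: -73.3599, latitudeDelta: 0.01, longitudeDelta: 0.01 }}
        annotations={this.state.annotations}
      />
    );
  }
}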
Use a Node approach to rendering. I have not tried it yet, but it looks like it can be done with node-googlemaps.
npm install googlemaps
usage
var GoogleMapsAPI = require('googlemaps');

var publicConfig = {
  key: '<YOUR-KEY>',
  stagger_time: 1000,            // for elevationPath
  encode_polylines: false,
  secure: true,                  // use https
  proxy: 'http://127.0.0.1:9999' // optional, set a proxy for HTTP requests
};

var gmAPI = new GoogleMapsAPI(publicConfig);
You can refer to https://github.com/moshen/node-googlemaps
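For example, a geocoding call might look roughly like the sketch below (the method name and parameters are recalled from the library's README, so double-check them against the repo above):
// Sketch: geocode an address with node-googlemaps and log the first result's location.
var params = {
  address: '1600 Amphitheatre Parkway, Mountain View, CA',
  language: 'en'
};

gmAPI.geocode(params, function(err, result) {
  if (err) return console.error(err);
  console.log(result.results[0].geometry.location);
});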