Programmatic access to Amazon Wishlist? [duplicate] - web-scraping

Amazon recently changed their APIs, and it seems there is now no way to access my Wish List on Amazon programmatically through them. Does anybody know a way to do it besides screen scraping? Maybe some third-party service (I don't mind working with only public data)?

For screen scraping, the compact layout style might be helpful: http://bililite.com/blog/2010/10/31/hacking-my-way-to-an-amazon-wishlist-widget/
Update
I did some hacking of my own in Google Spreadsheets and managed to get two basic implementations working.
Using Google Apps Script:
Type your wishlist ID into cell A1. Copy and paste the following into a Google Apps Script project (Tools > Scripts > Scripts Editor), and run the getWishlist function:
function getWishlist() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0];
  var wishlistId = sheet.getRange('a1').getValue();
  var response = UrlFetchApp.fetch("http://www.amazon.co.uk/registry/wishlist/" + wishlistId + "?layout=compact").getContentText();
  // Captures the item index (group 1) and the ASIN (group 2) from each item name.
  var asinRegex = /name="item.([\d]+)\.(?:[A-Z0-9]+).([A-Z0-9]+).*/g;
  var match; // declared locally so it does not leak into the global scope
  while ((match = asinRegex.exec(response)) !== null) {
    var rowIndex = Number(match[1]) + 2; // offset so rows land below the wishlist ID in A1
    var asin = match[2];
    setRow(sheet, rowIndex, asin);
    var offers = UrlFetchApp.fetch("http://www.amazon.co.uk/gp/offer-listing/" + asin).getContentText();
    // No "g" flag and no greedy ".+": each lookup starts at the beginning of the
    // text and the capture stops at the first "<" after the value.
    setRow(sheet, rowIndex, asin,
      getFirstMatch(/class="producttitle">([^<]+)</, offers),
      getFirstMatch(/class="price">([^<]+)</, offers));
  }
  Browser.msgBox("Finished");
}
// Returns the first capture group, or "Unknown" when the pattern does not match.
function getFirstMatch(regex, text) {
  var match = regex.exec(text);
  return (match == null) ? "Unknown" : match[1];
}
// Writes up to three values into columns A, B and C of the given row.
function setRow(sheet, index, a, b, c) {
  sheet.getRange('a' + index).setValue(a);
  sheet.getRange('b' + index).setValue(b);
  sheet.getRange('c' + index).setValue(c);
}
NB: I'm having some problems with the regex matching for the title and price. I'm not sure why, but this shows the basic idea.
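The ASIN extraction step can be tried on its own, outside Apps Script. This is a minimal sketch, assuming the compact layout names each item input as name="item.<index>.<id>.<ASIN>" with a 10-character ASIN (inferred from the regex in the script, not confirmed against Amazon's current markup). The function name extractAsins and the sample markup are made up for illustration.

```javascript
// Extract {index, asin} pairs from wishlist HTML.
// Assumption: item inputs are named "item.<index>.<id>.<ASIN>".
function extractAsins(html) {
  var itemRegex = /name="item\.(\d+)\.[A-Z0-9]+\.([A-Z0-9]{10})"/g;
  var items = [];
  var match;
  while ((match = itemRegex.exec(html)) !== null) {
    items.push({ index: Number(match[1]), asin: match[2] });
  }
  return items;
}

// Made-up sample markup, for illustration only.
var sampleHtml =
  '<input name="item.0.I1ABC.B000000001" value="1">' +
  '<input name="item.1.I2DEF.B000000002" value="1">';
var found = extractAsins(sampleHtml);
```

Anchoring the pattern on the closing quote keeps the capture from running past the end of the attribute, which the open-ended pattern in the script does not guarantee.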
Using Google Spreadsheet Functions
Type your wishlist ID into cell A1.
Type the following function into A2. It will populate the cell and all below it with the id strings for each item in your wishlist:
=importXML("http://www.amazon.co.uk/registry/wishlist/"&A1&"?layout=compact", "//*[starts-with(@name, 'item.')]/@name")
Type the following function into B2, which will extract the asin from the id string:
=right(A2, 10)
Type the following function into B3, which will fetch the offer listing for the asin in B2 and display the title:
=importXML("http://www.amazon.co.uk/gp/offer-listing/"&B2, "//h1")
Type the following function into B4, which will fetch the offer listing for the asin in B2 and display all the prices:
=concatenate(importXML("http://www.amazon.co.uk/gp/offer-listing/"&B2, "//span[@class='price']"))

A guy called Justin Scarpetti has created a really neat "api" which scrapes your wishlist and returns the data in json format.
This is a little API to retrieve Amazon Wish List data. There is no
official API, as Amazon shut it down a couple years ago. The only way
around that... screen scraping.
Amazon Wish Lister uses phpQuery (server-side CSS3 selector driven DOM
API based on jQuery) to scrape Amazon's Wish List page and exports to
JSON, XML, or PHP Array Object.
Perfect if you want to host display your wish list on your own
website.
Source: Amazon Wish Lister

Related

How to efficiently update a field in Firestore from Google Sheets

I am working with Google Sheets, and I am trying to send data to my Firestore database. I have been able to write to Firestore from Google Sheets, but I can't seem to update a field without completely messing things up.
This is my current testing code:
function getFireStore() {
const email = "your@email.gserviceaccount.com"
const key = "-----BEGIN PRIVATE KEY-----\n your key here \n-----END PRIVATE KEY-----\n";
const id = "project_id";
var firestore = FirestoreApp.getFirestore(email, key, id);
var spreadsheet = SpreadsheetApp.getActive()
var sheet = spreadsheet.getActiveSheet()
var data = {
numIndividuals: sheet.getRange(23, individuals).getValue(),
numTeams: sheet.getRange(23, teams).getValue(),
schoolID: sheet.getRange(23, schoolID).getValue(),
uid: sheet.getRange(23, uid).getValue(),
};
firestore.createDocument("competitions/" + sheet.getRange(23, compId).getValue() + "/registration/abcdefg", data)
}
I understand after playing around with this that it will create a new subcollection titled "registration" with the document "abcdefg." The same thing happens when I use the updateDocument function, as well.
On the website that reads and writes to this particular Firestore database, I use a similar function, .update(), to update the document with the correct information. In Google Sheets it would work the same way, but doing so is much more convoluted and tedious.
The way that I came up with for trying to update the document was basically copying everything and adding in the new data.
However, this is seriously tedious and messy. Just copying the data that isn't changed looks like this:
var data = {
compDate: competitions.fields.compDate.stringValue,
contact: competitions.fields.contact.stringValue,
email: competitions.fields.email.stringValue,
grade: competitions.fields.grade.stringValue,
id: competitions.fields.id.integerValue,
maxTeams: competitions.fields.maxTeams.integerValue,
regDate: competitions.fields.regDate.stringValue,
schTeams: competitions.fields.schTeams.integerValue,
schedule: competitions.fields.schedule.stringValue,
site: competitions.fields.site.stringValue,
status: competitions.fields.status.stringValue,
timestamp: competitions.fields.timestamp.integerValue,
user: competitions.fields.user.stringValue,
year: competitions.fields.year.stringValue,
}
The data I want to change is a .mapValue with multiple fields where one of the fields can have multiple fields, which also have multiple fields.
Here's the hierarchy for the field I need to update:
first registration and first team
I know I could do multiple for-loops and whatnot on this, but my question is: is there a simpler way to do this, or do I have to go through and loop over everything to extract only what I want?
As a side note: if I send the data I read from Firestore straight back using the spread operator, without any editing, it includes every wrapper level from the hierarchy above. That is, I end up with registration -> mapValue -> fields -> 0 -> mapValue -> fields -> etc. I don't want those mapValue and fields levels included, just the actual data (i.e. registration -> 0 -> {schoolID, uid, names, etc.}).
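One way to avoid the nested loops is a small recursive helper that strips the typed wrappers. This is a minimal sketch, assuming the document comes back in the Firestore REST representation (stringValue, integerValue, mapValue.fields, arrayValue.values), which matches the field paths shown in the question. The names unwrapValue and unwrapFields and the sample data are hypothetical. Whether the library in use offers a built-in equivalent is not something the question states.

```javascript
// Convert one typed Firestore value into a plain JavaScript value.
function unwrapValue(value) {
  if ('mapValue' in value) return unwrapFields(value.mapValue.fields || {});
  if ('arrayValue' in value) return (value.arrayValue.values || []).map(unwrapValue);
  if ('integerValue' in value) return Number(value.integerValue); // sent as a string
  if ('doubleValue' in value) return Number(value.doubleValue);
  if ('nullValue' in value) return null;
  // stringValue, booleanValue, timestampValue, ...: return the wrapped value as-is.
  return value[Object.keys(value)[0]];
}

// Convert a "fields" object ({name: typedValue, ...}) into a plain object.
function unwrapFields(fields) {
  var plain = {};
  Object.keys(fields).forEach(function (name) {
    plain[name] = unwrapValue(fields[name]);
  });
  return plain;
}

// Made-up sample shaped like the hierarchy described above.
var sample = {
  status: { stringValue: 'open' },
  maxTeams: { integerValue: '4' },
  registration: { mapValue: { fields: {
    '0': { mapValue: { fields: {
      schoolID: { stringValue: 'abc' },
      uid: { stringValue: 'u1' }
    } } }
  } } }
};
var plainData = unwrapFields(sample);
```

With the wrappers removed, plainData.registration['0'] is a plain object that can be edited and sent back. Note that Number() loses precision for integers beyond 2^53, so very large IDs would need to stay as strings.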

Is there a way to extract the XYZ geometry data from a converted Revit model?

I'm creating a solution that converts a Revit model to the IFC file format using the Autodesk Forge Model Derivative API. This API hands me a JSON file with the hierarchy of the converted model, and a JSON file with all the separate objects and their properties.
After converting the model I need to analyze specific properties from parts of the model. But not all information I need is stored in objects' properties. I also need to use XYZ coordinates of objects to get real results, but I believe the model derivative API doesn't generate XYZ data.
I've already searched all the properties of the objects to see if they contain any kind of data about their location in comparison to other objects, but they don't contain that information. I've searched for other ways to extract geometry/coordinates from Revit, but haven't found a real solution.
https://forge.autodesk.com/en/docs/model-derivative/v2/tutorials/extract-metadata-from-source-file/
In step 5 of this tutorial you can see the data that I have (the properties of each object).
There is no way to get the XYZ data from the Model Derivative API the way that you are hoping.
If you only need to convert to IFC, the Model Derivative API already has a conversion service for that. If you really need XYZ data, for example for a custom file format, there are two other options you can consider.
One is to use the Design Automation for Revit API. You could write an add-in that pulls the needed data from the headless Revit environment.
Another option is to launch a headless Forge Viewer and get the XYZ data of the model from there.
The headless viewer is a tutorial in the Viewer API documentation that you can check out. Here is the code from it (v6) for reference.
var viewerApp;
var options = {
env: 'AutodeskProduction',
accessToken: ''
};
var documentId = 'urn:<YOUR_URN_ID>';
Autodesk.Viewing.Initializer(options, onInitialized);
function onInitialized() {
viewerApp = new Autodesk.Viewing.ViewingApplication('MyViewerDiv');
viewerApp.registerViewer(viewerApp.k3D, Autodesk.Viewing.Viewer3D);
viewerApp.loadDocument(documentId, onDocumentLoaded);
}
function onDocumentLoaded(lmvDoc) {
var modelNodes = viewerApp.bubble.search(Autodesk.Viewing.BubbleNode.MODEL_NODE); // 3D designs
var sheetNodes = viewerApp.bubble.search(Autodesk.Viewing.BubbleNode.SHEET_NODE); // 2D designs
var allNodes = modelNodes.concat(sheetNodes);
if (allNodes.length) {
viewerApp.selectItem(allNodes[0].data);
if (allNodes.length === 1){
alert('This tutorial works best with documents with more than one viewable!');
}
} else {
alert('There are no viewables for the provided URN!');
}
}
Once you have access to the viewer, the following code, which I've used successfully, gets the bounding box of one or more elements by their dbIds.
/**
* Uses dbId element fragments to build boundingbox of element
* @param {Array<number>} dbIds dbIds of element to find boundingBox
* @return {THREE.Box3} dbId elements bounding box
*/
getBoundingBox(dbIds) {
const totalBox = new THREE.Box3();
dbIds.forEach((dbId) => {
const fragBox = new THREE.Box3();
const fragIds = [];
const instanceTree = viewer3D.model.getInstanceTree();
instanceTree.enumNodeFragments(dbId, function(fragId) {
fragIds.push(fragId);
});
const fragList = viewer3D.model.getFragmentList();
fragIds.forEach(function(fragId) {
fragList.getWorldBounds(fragId, fragBox);
totalBox.union(fragBox);
});
});
return totalBox;
}
From this bounding box, which is a THREE.Box3 object, you can get some XYZ information about the elements. There is also code here that uses the 'fragments' to get the geometry of individual elements more precisely, if that better suits the XYZ data you need.
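As an illustration of what can be read off the box, here is a minimal sketch that derives a center point and a size from the min and max corners, which are the two properties a THREE.Box3 exposes. The helper name describeBox is hypothetical, and the example uses a plain-object stand-in with made-up values so that it runs without the viewer loaded.

```javascript
// Derive center and size from a box exposing min/max points ({x, y, z}),
// which is the shape of a THREE.Box3.
function describeBox(box) {
  return {
    center: {
      x: (box.min.x + box.max.x) / 2,
      y: (box.min.y + box.max.y) / 2,
      z: (box.min.z + box.max.z) / 2
    },
    size: {
      x: box.max.x - box.min.x,
      y: box.max.y - box.min.y,
      z: box.max.z - box.min.z
    }
  };
}

// Plain-object stand-in for a THREE.Box3; the values are made up.
var exampleBox = { min: { x: 0, y: 0, z: 0 }, max: { x: 4, y: 2, z: 10 } };
var described = describeBox(exampleBox);
```

The coordinates are in the viewer's coordinate system, which may be offset from the coordinates used inside Revit, so treat them as relative positions unless you account for that offset.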

How to export a table as google sheet in Google app maker using a button

I've looked extensively and tried to modify multiple code samples found in different posts on Stack Overflow, as well as template documents in Google App Maker, but cannot for the life of me get an export function and an email function to work.
UserRecords table:
This is the area where the data is collected and reviewed, the populated table:
These are the data fields I am working with:
This is what the exported Sheet looks like when I go through the motions and do an export through the Deployment tab:
Lastly, this is the email page that I've built based on tutorials and examples I've seen:
What I've learned so far (based on the circles I'm going round in):
Emails seem mostly straightforward, but I don't need to send a message, just an attachment with a subject, similar to using the code:
function sendEmail_(to, subject, body) {
var emailObj = {
to: to,
subject: subject,
htmlBody: body,
noReply: true
};
MailApp.sendEmail(emailObj);
}
I'm not sure how to change the "body" to the exported document.
To straight up export and view the Sheet from a button click, the closest I've found to a solution is in Document Sample but the references in the code speak to components on the page only. I'm not sure how to modify this to use the table, and also what to change to get it as a sheet instead of a doc.
This may seem trivial to some but I'm a beginner and am struggling to wrap my head around what I'm doing wrong. I've been looking at this for nearly a week. Any help will be greatly appreciated.
In its simplest form, you can do a Google Sheets export with the following server script (this is based on a model called Employees):
function exportEmployeeTable() {
//if only certain roles or individuals can perform this action include proper validation here
var query = app.models.Employees.newQuery();
var results = query.run();
var fields = app.metadata.models.Employees.fields;
var data = [];
var header = [];
for (var i in fields) {
header.push(fields[i].displayName);
}
data.push(header);
for (var j in results) {
var rows = [];
for (var k in fields) {
rows.push(results[j][fields[k].name]);
}
data.push(rows);
}
if (data.length > 1) {
var ss = SpreadsheetApp.create('Employee Export');
var sheet = ss.getActiveSheet();
sheet.getRange(1,1,data.length,header.length).setValues(data);
//here you could return the URL for your spreadsheet back to your client by setting up a successhandler and failure handler
return ss.getUrl();
} else {
throw new app.ManagedError('No Data to export!');
}
}
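For the email half of the question, one possibility is to attach a PDF rendering of the exported spreadsheet rather than putting it in the body. This is a rough sketch, assuming the server script may use DriveApp and MailApp and that the spreadsheet ID is available (for example from ss.getId() in the export function). The helper name emailSpreadsheetAsPdf_ and the body text are hypothetical.

```javascript
// Email a spreadsheet as a PDF attachment.
// Assumption: runs as a server script with DriveApp and MailApp available.
function emailSpreadsheetAsPdf_(to, subject, spreadsheetId) {
  var file = DriveApp.getFileById(spreadsheetId);
  // Ask Drive for a PDF rendering of the sheet and give the blob a file name.
  var pdf = file.getAs('application/pdf').setName(file.getName() + '.pdf');
  MailApp.sendEmail({
    to: to,
    subject: subject,
    body: 'The export is attached.',
    attachments: [pdf],
    noReply: true
  });
}
```

If recipients need an editable sheet rather than a PDF, sending the URL returned by ss.getUrl() in the body is the simpler route, provided the file is shared with them.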

Populating Array Data from Data Layer into GTM

I have a Data Layer that is giving me information like this from Drupal
dataLayer = [{
"entityType":"node",
"entityBundle":"article",
"entityTaxonomy":
{"funnel_path":{"2":"Find a Park"},
"byline":{"4":"Name1","5":"Name2"}},"drupalLanguage":"en",
"userUid":"1"}
];
</script>
I can easily use GTM's Data Layer variable to pull in entityBundle. How do I set it to pull in the information in byline? I tried entityTaxonomy.byline, but that gives me an array. I can use entityTaxonomy.byline.4 to get Name1, but that would be silly since the editors regularly add new entries.
I am planning to add the byline, ultimately, into Custom Dimension 2 in Google Analytics.
I am looking to have the data that goes to Custom Dimension 2 to be Name1, Name2 . Sometimes this will be just one value. Sometimes it can be up to 20 values.
What do I need to do in GTM to get it to register that information?
entityTaxonomy.byline actually gives you an object, not an array. You need a bit of processing to get an array that you can join into a string. One possible way, assuming the first dataLayer entry from the question, would be
var byline = dataLayer[0].entityTaxonomy.byline;
var temp = [];
Object.keys(byline).forEach(function(key) {
  temp.push(byline[key]);
});
var bylines = temp.join(",");
(I'm sure that could be done much more concisely.) In GTM you would need to create a Data Layer variable that contains the object with the bylines; you can then do the processing in a Custom JavaScript variable (which is by definition an anonymous function with a return value):
function() {
  var byLineObject = {{bylines}}; // Data Layer variable created beforehand
  var temp = []; // declared with var so it does not leak into the global scope
  Object.keys(byLineObject).forEach(function(key) {
    temp.push(byLineObject[key]);
  });
  return temp.join(",");
}
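A more compact variant of the same idea, written as a standalone function so it can be tried outside GTM. This is a minimal sketch: joinBylineNames is a hypothetical name, the separator ", " follows the "Name1, Name2" format asked for in the question, and the guard for a missing byline is an added assumption about pages that have none. It sticks to ES5 syntax, which is the safe choice for GTM Custom JavaScript variables.

```javascript
// Join the values of a keyed byline object into one comma-separated string.
function joinBylineNames(bylineObject) {
  if (!bylineObject || typeof bylineObject !== 'object') {
    return undefined; // no byline on this page
  }
  return Object.keys(bylineObject).map(function (key) {
    return bylineObject[key];
  }).join(', ');
}

var exampleBylines = joinBylineNames({ '4': 'Name1', '5': 'Name2' });
```

Because it iterates over whatever keys are present, it handles one value or twenty without changes when editors add names.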

pagination in alfresco

I am working on an application that lists and searches documents from Alfresco. The issue is that Alfresco can return up to 5000 records per query, but I don't want my application to list all the documents. Instead, I would like to implement pagination so that Alfresco returns only X results per page. I am using Alfresco 4 Enterprise Edition.
Any help or suggestion please.
UPDATE (Example)
I have written a web script that executes the query and returns all the documents that satisfy the condition. Let's say 5000 entries are found. I want to modify my web script so that it returns 100 documents for the first page, the next 100 for the second page, and so on.
It would work something like the LIMIT and OFFSET keywords.
There are two ways to query on the SearchService (excluding the selectNodes/selectProperties calls). One way is to specify all your arguments directly to the query method. This has the advantage of being concise, but the disadvantage is that you don't get all the options.
Alternately, you can query with a SearchParameters object. This lets you do everything the simple query does, and more. Included in that more are setLimit, setSkipCount and setMaxItems, which will allow you to do your paging.
If your query used to be something like:
searchService.query(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE, "lucene", myQuery);
You'd instead do something like:
SearchParameters sp = new SearchParameters();
sp.addStore(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE);
sp.setLanguage("lucene");
sp.setQuery(myQuery);
sp.setMaxItems(100);
sp.setSkipCount(900);
searchService.query(sp);
Assuming you have written your web script in JavaScript, you can use the search.query() function and add the page property to the search definition as shown below:
var sort1 = {
column: "@{http://www.alfresco.org/model/content/1.0}modified",
ascending: false
};
var sort2 = {
column: "@{http://www.alfresco.org/model/content/1.0}created",
ascending: false
};
var paging = {
maxItems: 100,
skipCount: 0
};
var def = {
query: "cm:name:test*",
store: "workspace://SpacesStore",
language: "fts-alfresco",
sort: [sort1, sort2],
page: paging
};
var results = search.query(def);
You can find more information here: http://wiki.alfresco.com/wiki/4.0_JavaScript_API#Search_API
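To turn a page number into the paging object, the skip count is the number of results on all earlier pages. This is a minimal sketch, assuming 1-based page numbers. The helper name pagingFor is hypothetical, and in a web script the page number would typically come from a request argument.

```javascript
// Build the paging object for a 1-based page number.
function pagingFor(pageNumber, pageSize) {
  return {
    maxItems: pageSize,
    skipCount: (pageNumber - 1) * pageSize
  };
}

var firstPage = pagingFor(1, 100);
var tenthPage = pagingFor(10, 100);
```

The result can be assigned to the page property of the search definition. Page 10 with 100 items per page gives a skip count of 900, the same values used in the Java example with setMaxItems and setSkipCount.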
