Get all leads programmatically in Marketo v1 - marketo

I would like to get all the leads that a customer has in Marketo.
I understand that you can use the Get Multiple Leads by Filter Type REST API endpoint.
If I do not have access to their Marketo UI, how should I get all the leads?
I was thinking about querying 300 ids at a time until there were no more results. But I am unsure how to handle the case where all leads in a batch of 300 have been deleted while there are still leads after that batch. Are deleted leads returned?

I'll describe a workaround you can use to determine the last lead that was created in Marketo using the REST API. You can use this lead as the upper bound for lead id, and then query leads 300 at a time until you reach this upper bound as you described.
The workaround is to use the Get Lead Activities API to return activities for the most recent leads that were created. By calling this API, you can determine the last lead that was created in Marketo, and then use it as your upper bound.
Here are some tips for calling the Get Lead Activities API:
Specify an activityTypeIds=12 parameter to return activities for new leads.
Include a paging token parameter as the start date to look for the most recent leads that were created. To generate a paging token, you will need to use the Get Paging Token API.
To optimize this, start with a time range that is close to the current date. For example, first query the Get Lead Activities API for leads created in the past hour. Then if there are no results, query for the past day, and so on.
Iterate through the results from the Get Lead Activities API until the moreResult attribute in the response is false. The last lead returned will be the upper bound for lead ids.
For example, a call to the Get Lead Activities API will look like:
/rest/v1/activities.json?nextPageToken=GIYDAOBNGEYS2MBWKQYDAORQGA5DAMBOGAYDAKZQGAYDALBQ&activityTypeIds=12
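Once you have the upper bound, the batching side can be sketched roughly as below. This is only an illustration: the function names are hypothetical, and the filterType=id / filterValues parameters are what I would expect to pass to Get Multiple Leads by Filter Type, so check them against the endpoint reference. Because the loop runs to a known upper bound, a batch that comes back empty does not end the scan.

```python
def id_batches(upper_bound, batch_size=300, start=1):
    """Yield comma-separated id strings covering start..upper_bound inclusive."""
    for lo in range(start, upper_bound + 1, batch_size):
        hi = min(lo + batch_size - 1, upper_bound)
        yield ",".join(str(i) for i in range(lo, hi + 1))


def lead_query_params(upper_bound):
    """Build one set of query parameters per batch (parameter names are assumed)."""
    return [
        {"filterType": "id", "filterValues": batch}
        for batch in id_batches(upper_bound)
    ]
```

Each dictionary would be sent as the query string of one request. Keep going through every batch even if some return no leads.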

Azure Cosmos DB: is it always necessary to check for HasMoreResults?

All examples for querying Azure Cosmos DB with .NET C# check FeedIterator<T>.HasMoreResults before calling FeedIterator<T>.ReadNextAsync().
Considering that the default MaxItemCount is 100, and knowing for a fact that the query will return fewer items than 100, is it necessary to check for HasMoreResults?
Consider this example, which returns an integer:
var query = container.GetItemQueryIterator<int>("SELECT VALUE COUNT(1) FROM c");
int count = (await query.ReadNextAsync()).SingleOrDefault();
Is it necessary to check for HasMoreResults?
If your query can yield or terminate sooner, as aggregations do, then there are probably no more pages and HasMoreResults = false.
But the reason to always check HasMoreResults is that, in most cases, the SDK prefetches the next pages in memory while you are consuming the current one. If you don't drain all the pages, these objects stay in memory. Over time, the memory footprint might increase (until they eventually get garbage collected, but that can also consume CPU).
In cross-partition queries, it is common to see users make wrong assumptions, such as assuming that all results of the query will be in one page (the first). That can be true, depending on which physical partition the data is stored in. It is very common in such cases for users to complain that their code ran perfectly fine for some time and then suddenly stopped working (their data is now on another partition and is no longer returned in the first page).
In some cases, the service might need to yield because the execution time went over the maximum time.
So, to avoid all these pitfalls (and others), the general recommendation is to loop until HasMoreResults = false. You won't iterate more than is required for each query: sometimes it will be one page, sometimes it might be more.
Source: https://learn.microsoft.com/azure/cosmos-db/nosql/query/pagination#understanding-query-executions
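To make the difference concrete, here is a minimal sketch in Python. FakeFeedIterator is a hypothetical stand-in that only mimics the HasMoreResults / ReadNextAsync shape; it is not the Cosmos DB SDK. It shows a case where the first page comes back empty and the value only arrives on the second page.

```python
class FakeFeedIterator:
    """Hypothetical stand-in for a paged query iterator; not the Cosmos DB SDK."""

    def __init__(self, pages):
        self._pages = list(pages)
        self._index = 0

    @property
    def has_more_results(self):
        # True while there are pages that have not been read yet.
        return self._index < len(self._pages)

    def read_next(self):
        page = self._pages[self._index]
        self._index += 1
        return page


def drain(iterator):
    """Collect items from every page until the iterator reports no more results."""
    items = []
    while iterator.has_more_results:
        items.extend(iterator.read_next())
    return items


def first_page_only(iterator):
    """What a single read sees: only the first page, which may be empty."""
    return list(iterator.read_next())
```

With pages [[], [42]], first_page_only returns an empty list while drain returns [42], which is the failure mode described above.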
With the older SDK built around DocumentClient, you perform the HasMoreResults Boolean check on the query object. If HasMoreResults is true, you can get more records by calling the ExecuteNextAsync method.
As the comment by Mark Brown also supports, it is best practice to check for HasMoreResults.
To manage the results returned from queries, Cosmos DB uses a continuation strategy. Each query submitted to Cosmos DB has a MaxItemCount limit, and the default value is 100.
A response that exceeds MaxItemCount is paginated, and a continuation token is present in the response header, which indicates that only the first partial page was returned and more records are available. The next pages can be retrieved by passing the continuation token to subsequent calls.

EmberFire Relationship Persistence

Using EmberFire, I'm trying to work with related sets of data. In this case, a Campaign has many Players and Players have many Campaigns.
When I want to add a player to campaign, I understand that I can push the player object to campaign.players, save the player, and then save the campaign. This will update both records so that the relationship is cemented. This works fine.
My question is a hypothetical on how to handle failures when saving one or both records.
For instance, how would you handle an occasion whereby saving the player record succeeded (thus adding a corresponding campaign ID to its campaigns field), but saving the campaign then failed (thus failing to add the player to its players field). It seems like in this case you'd open yourself up to the potential of some very messy data.
I was considering taking a "snapshot" of both records in question and then resetting them to their previous states if one update fails, but this seems like it's going to create some semi-nightmarish code.
Thoughts?
I guess you are using the Realtime Database. If you use the update() method with different paths, "you can perform simultaneous updates to multiple locations in the JSON tree with a single call to update()".
Simultaneous updates made this way are atomic: either all updates succeed or all updates fail.
From the documentation: https://firebase.google.com/docs/database/web/read-and-write#update_specific_fields
So, in your case, you should do something like the following, though many variations are possible as long as the updates object holds all the simultaneous updates:
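// Assumes each record keeps its related ids as { id: true } entries
var updates = {};
// Add the player to the campaign's players
updates['/campaigns/' + campaignKey + '/players/' + newPlayerKey] = true;
// Add the campaign to the player's campaigns
updates['/players/' + newPlayerKey + '/campaigns/' + campaignKey] = true;
return firebase.database().ref().update(updates);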
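The same idea can cover removing the relationship, since writing null to a path in the Realtime Database deletes it. Below is a small language-neutral sketch in Python that only builds the multi-path map. The function name is hypothetical, and the path layout (campaigns/<id>/players/<id> and players/<id>/campaigns/<id>, each holding true) is an assumption about how your data is stored.

```python
def relationship_updates(campaign_key, player_key, linked=True):
    """Build one multi-path update touching both sides of the relationship.

    linked=True writes true on both sides; linked=False writes None, which
    serialises to JSON null and removes both entries.
    """
    value = True if linked else None
    return {
        f"/campaigns/{campaign_key}/players/{player_key}": value,
        f"/players/{player_key}/campaigns/{campaign_key}": value,
    }
```

Because both paths travel in one update call, either both sides change or neither does, so there is no half-saved state to roll back.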

How to get status of only 1 lead in a program with GetLeadsByProgramID endpoint?

I am using the GetLeadsByProgramID REST API endpoint to get the Leads with Status under a Program in Marketo. But is there any way where I can get the status of only 1 lead for a program?
First, a piece of advice:
As Marketo applies some limits on API access (most importantly: Daily Quota, Rate Limit, Concurrency Limit), it is considered good practice to fetch as many records as you can with one API call and cache the results. You can always loop through the result set and filter it as needed.
A solution:
With that said, you can still fetch the program status of one particular lead, but not with the GetLeadsByProgramID endpoint. Unfortunately, that endpoint does not allow filtering based on lead Id.
The program status change of a lead is also an activity, and luckily there is an endpoint, Get Lead Activities, to query just that. You need to have four things before making the call:
A paging token, obtained from the Get Paging Token endpoint, which also defines the earliest datetime to retrieve activities from.
The Id of the “Change Status in Progression” activity type, which can be gathered from the Get Activity Types endpoint. It is 104 in my case, but this is not guaranteed to be the same in all instances.
The Id of your Lead object in question. I assume you have that on record.
The Id of the Program you are checking statuses for. I guess you have that on record too. It can be fetched via the API, but it is also present in the URL when you click on the program in your instance. E.g.: if your link is https://app-abc01.marketo.com/#ME1234A1, the Program Id is 1234.
So, having all that info at hand you can make the call as described at the Activities Endpoint Reference page. In essence this is the url you have to call:
GET /rest/v1/activities.json?nextPageToken=<YOUR_NEXTPAGE_TOKEN>&activityTypeIds=104&leadIds=<LEAD_ID>&assetIds=<PROGRAM_ID>
The response will contain all the program status changes of the Lead in the given Program after the given datetime. So you still might need to perform a loop in case there are multiple status changes.
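As a minimal sketch of assembling that URL, assuming Python: the function name and the example host are hypothetical, authentication is left out, and the query parameter names are taken from the call shown above.

```python
from urllib.parse import urlencode


def activities_url(base_url, paging_token, activity_type_id,
                   lead_id=None, program_id=None):
    """Build a Get Lead Activities URL; the lead and program filters are optional."""
    params = {
        "nextPageToken": paging_token,
        "activityTypeIds": activity_type_id,
    }
    if lead_id is not None:
        params["leadIds"] = lead_id
    if program_id is not None:
        params["assetIds"] = program_id
    # urlencode escapes the values and keeps the insertion order of the dict.
    return f"{base_url}/rest/v1/activities.json?{urlencode(params)}"
```

For each page of the response you would rebuild the URL with the paging token returned by the previous call, until there are no more results.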
You can decide whether all this is worth the effort.

Graphite null json data responses, and more frequent datapoints than expected

I am having trouble interpreting the data coming back from the render API for some metrics I have set up in Graphite. I have a Splunk query populating the results, and Graphite is configured to poll this query, which covers a 1 minute range, every minute. The results I get back from a render API call with format=json do not make much sense to me. If I add from=-1min to the API call, I get 2 datapoints, 4 for -2min, and so on. In addition, the datapoints are often just given a null value, which should never happen given how populated this Splunk query is when I run it manually for the same time ranges.
I am wondering if there is any additional documentation on the render api other than what is listed here: https://graphite.readthedocs.io/en/0.9.15/render_api.html because it really isn't giving me much on why the data is appearing the way it is. Does anyone know of additional docs, or have any insight into what is going on here?

Alfresco CMIS different result with same query

We have a bit of a problem.
We've built a GWT application on top of our two Alfresco instances. The application should work like this:
The user searches for a document.
Our web app sends the same query to both repositories, waits for both results and exposes a merged result set.
This works when the search is for a specific document (by number id, for example) or for 10, 20 or 50 documents (we don't know at what point it begins to act strangely).
If the query is a large one (like all documents from last month, which should be about 30-60k per month), the CMIS query limit (500) obviously cuts the results off first.
BUT, if the user hits "search" the first time, the result set comes back after a while and is composed of 2 documents. If the user hits "search" again right after that, with the same query, the result set comes back almost immediately and lists 500 documents.
What the heck is wrong? Does CMIS cache results in some way? How do big CMIS queries work?
Thanks
A.
As you mentioned, you're using Apache Chemistry. Chemistry has a client-side caching mechanism:
http://chemistry.apache.org/java/how-to/how-to-tune-perfomance.html
I suspect this is not CMIS related at all but is instead due to the Alfresco Lucene "max permission check" problem. At a high-level, there is a config setting for the maximum number of permission checks that Alfresco will do against a search result set. There is also a limit to the total amount of time it will spend performing such checks. These limits are configured in the repository properties file as:
# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=10000
# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=1000
The first time you run a search, the server begins performing these checks and hits the limit. It then returns the search results it was able to filter. Now the permission cache is populated, so the next time you run the search, the results come back much faster and the result set is larger.
Searches in Alfresco are non-deterministic--you cannot guarantee that, for large result sets, you will get back the exact same result set every time, regardless of how big you make those settings.
If you are able to upgrade at some point you may find that configuring Alfresco to use Solr rather than Lucene could help alleviate this, but I'm not 100% sure it will.
To disable security checks, use the searchService bean instead of the public SearchService bean. Public services enforce security, so with searchService you can avoid the security checks.
