URL dwell time in any programming language? - information-retrieval

Could you please give me some hints, websites, books or research papers that would explain how to calculate the URL dwell time.
in case you don't know what is dwell time : dwell time denotes the time which a user spends viewing a document after clicking a link on a search engine results page.
Thanks in advance

One crude way to do this on a page would be to use a small GET request on a timer, going to a server - an "I'm still here". The frequency of this would be a trade off. This would be relatively easy to do with jquery or a similar framework.
You would not know if it is actually in an abandoned tab or that it is open but not actually being looked at.
A sample for the client end (using jquery):
$session = Math.floor((1 + Math.random()) * 0x10000);
function still_alive() {
$url = $server_url + "/still_alive";
$.get($url, {location: location.href, session: $session});
}
// call it once to prime it
still_alive();
// Set it up on a timer
window.setTimeout(function() {
still_alive();
}, 1000);
1000 is the interval in milliseconds - so this is on a 1 second interval. $server_url is the server to register this at - I am adding "/still_alive" as an endpoint to register this at. $session - this can be some way of identifying the current session - set to something once when the page loads - it could be the result of a uuid function.
The next line is a Jquery GET request to that whole url. It is being passed a plain object - with the key location holding the url of the current location. It may be more appropriate to be a POST instead of a GET - but the principle is still the same.

Related

How to create an alert to notify an user when some amount % of threshold reached DailyAsyncApex Executions

On 2 occasions in the past month, we have managed to hit our daily limit on asynchronous apex executions. Salesforce temporarily increased our limit to 425000 but it will be scaled down to 250000 in a week's time. Once we reach the limit, a lot of the SF functions will fail and this has tremendously impacted both internal staff and external customers.
So to prevent this from happening in the future, we need to create some kind of alert in Salesforce to monitor our daily asynchronous apex method executions. Our maximum daily limit is 250000. The alert will need to create a P3 helpdesk ticket and notify couple of users say USER A and USER B once it reaches 70% threshold.
Kindly advise what is possible to achieve the same
Thanks & Regards,
Harjeet
There's a promising Limits method but it doesn't seem to work currently ("reserved for future use"): System.debug(Limits.getAsyncCalls() + ' / ' + Limits.getLimitAsyncCalls());
There's an idea you can upvote: https://success.salesforce.com/ideaView?id=0873A0000003VIFQA2 ;)
You could query SELECT COUNT() FROM AsyncApexJob WHERE ... but that sounds like a bad idea ;)
I think your best course of action is to use SF REST API. There's a "limits" resource you can fetch. You could do it from SF itself (bad idea because if you'd schedule it to run every hour then well, of course it will contribute to the limit consumption too ;)) or from some external app that'd connect to your SF...
https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/dome_limits.htm
You can quickly try it out for example in workbench.developerforce.com before you decide you do want to deep dive into coding it.
Of course if you have control over your batch jobs, queuable, schedulable & #future calls you could implement some rough counter of executions in a helper object for example... won't help you much if most of the jobs are coming from managed packages though...
Got 1 more idea but it's pretty hardcore - you should be able to make a REST API call from javascript. so you could create a simple VF page (even without any apex controller), put JS callout on it, have it check every 5 mins and do something if threshold is hit... But that means IT person would have to have this page open all the time (perhaps as a home page component)... Messy :)
I was having the exact same issue so I created a simple JsForce script in NodeJS to monitor the call to the /limits endpoint.
You can connect a Free Monitoring service like UpTimerobot.com or PingDom.com and get an email when you find the Word "Warning" >50% or "Error" > 80%.
async function getSfLimits() {
try {
//Let's login into salesforce
const login = await conn.login(SF_USERNAME, SF_PASSWORD+SF_SECURITY_TOKEN);
//Call the API
const sfLimits = await conn.requestGet('/services/data/v51.0/limits');
return sfLimits;
} catch(err) {
console.log(err);
}
}
https://github.com/carlosdevia/salesforcelimits

Google reCAPTCHA response success: false, no error codes

UPDATE: Google has recently updated their error message with an additional error code possibility: "timeout-or-duplicate".
This new error code seems to cover 99% of our previously mentioned mysterious
cases.
We are still left wondering why we get that many validation requests that are either timeouts or duplicates. Determinining this with certainty is likely to be impossible, but now I am just hoping that someone else has experienced something like it.
Disclaimer: I cross posted this to Google Groups, so apologies for spamming the ether for the ones of you who frequent both sites.
I am currently working on a page as part of a ASP.Net MVC application with a form that uses reCAPTCHA validation. The page currently has many daily users.
In my server side validation** of a reCAPTCHA response, for a while now, I have seen the case of the reCAPTCHA response having its success property set to false, but with an accompanying empty error code array.
Most of the requests pass validation, but some keep exhibiting this pattern.
So after doing some research online, I explored the two possible scenarios I could think of:
The validation has timed out and is no longer valid.
The user has already been validated using the response value, so they are rejected the second time.
After collecting data for a while, I have found that all cases of "Success: false, error codes: []" have either had the validation be rather old (ranging from 5 minutes to 10 days(!)), or it has been a case of a re-used response value, or sometimes a combination of the two.
Even after implementing client side prevention of double-clicking my submit-form button, a lot of double submits still seem to get through to the server side Google reCAPTCHA validation logic.
My data tells me that 1.6% (28) of all requests (1760) have failed with at least one of the above scenarios being true ("timeout" or "double submission").
Meanwhile, not a single request of the 1760 has failed where the error code array was not empty.
I just have a hard time imagining a practical use case where a ChallengeTimeStamp gets issued, and then after 10 days validation is attempted, server side.
My question is:
What could be the reason for a non-negligible percentage of all Google reCAPTCHA server side validation attempts to be either very old or a case of double submission?
**By "server side validation" I mean logic that looks like this:
public bool IsVerifiedUser(string captchaResponse, string endUserIp)
{
string apiUrl = ConfigurationManager.AppSettings["Google_Captcha_API"];
string secret = ConfigurationManager.AppSettings["Google_Captcha_SecretKey"];
using (var client = new HttpClient())
{
var parameters = new Dictionary<string, string>
{
{ "secret", secret },
{ "response", captchaResponse },
{ "remoteip", endUserIp },
};
var content = new FormUrlEncodedContent(parameters);
var response = client.PostAsync(apiUrl, content).Result;
var responseContent = response.Content.ReadAsStringAsync().Result;
GoogleCaptchaResponse googleCaptchaResponse = JsonConvert.DeserializeObject<GoogleCaptchaResponse>(responseContent);
if (googleCaptchaResponse.Success)
{
_dal.LogGoogleRecaptchaResponse(endUserIp, captchaResponse);
return true;
}
else
{
//Actual code ommitted
//Try to determine the cause of failure
//Look at googleCaptchaResponse.ErrorCodes array (this has been empty in all of the 28 cases of "success: false")
//Measure time between googleCaptchaResponse.ChallengeTimeStamp (which is UTC) and DateTime.UtcNow
//Check reCAPTCHAresponse against local database of previously used reCAPTCHAresponses to detect cases of double submission
return false;
}
}
}
Thank you in advance to anyone who has a clue and can perhaps shed some light on the subject.
You will get timeout-or-duplicate problem if your captcha is validated twice.
Save logs in a file in append mode and check if you are validating a Captcha twice.
Here is an example
$verifyResponse = file_get_contents('https://www.google.com/recaptcha/api/siteverify?secret='.$secret.'&response='.$_POST['g-recaptcha-response'])
file_put_contents( "logfile", $verifyResponse, FILE_APPEND );
Now read the content of logfile created above and check if captcha is verified twice
This is an interesting question, but it's going to be impossible to answer with any sort of certainly. I can give an educated guess about what's occurring.
As far as the old submissions go, that could simply be users leaving the page open in the browser and coming back later to finally submit. You can handle this scenario in a few different ways:
Set a meta refresh for the page, such that it will update itself after a defined period of time, and hopefully either get a new ReCAPTCHA validation code or at least prompt the user to verify the CAPTCHA again. However, this is less than ideal as it increases requests to your server and will blow out any work the user has done on the form. It's also very brute-force: it will simply refresh after a certain amount of time, regardless of whether the user is currently actively using the page or not.
Use a JavaScript timer to notify the user about the page timing out and then refresh. This is like #1, but with much more finesse. You can pop a warning dialog telling the user that they've left the page sitting too long and it will soon need to be refreshed, giving them time to finish up if they're actively using it. You can also check for user activity via events like onmousemove. If the user's not moving the mouse, it's very likely they aren't on the page.
Handle it server-side, by catching this scenario. I actually prefer this method the most as it's the most fluid, and honestly the easiest to achieve. When you get back success: false with no error codes, simply send the user back to the page, as if they had made a validation error in the form. Provide a message telling them that their CAPTCHA validation expired and they need to verify again. Then, all they have to do is verify and resubmit.
The double-submit issue is a perennial one that plagues all web developers. User behavior studies have shown that the vast majority occur because users have been trained to double-click icons, and as a result, think they need to double-click submit buttons as well. Some of it is impatience if something doesn't happen immediately on click. Regardless, the best thing you can do is implement JavaScript that disables the button on click, preventing a second click.

Record audio into mutil-small block each duration with getUserMedia

I think I should begin with an example of what I want:
User allow the web page to access mic, then start recording.
Every 3 seconds (for example) capture what the user say (maybe into an Blob).
Repeat until user want to stop.
I've found many example that use AudioContext.createScriptProcessor but it work by given a buffer size, I love to have similar thing but given a duration.
you can simply use recorderjs, and use it in the below mentioned fashion:
var rec = new Recorder(source);
rec.record();
var recInterval = setInterval(function(){
rec.exportWAV(function(blob){
rec.clear();
// do something with blob...
});
}, 3000); // 3000 - to get blob for every three seconds.
later on some button click, add rec.stop() to end .

Google Chrome restores session cookies after a crash, how to avoid?

On Google Chrome (I saw this with version 35 on Windows 8.1, so far I didn't try other versions) when browser crashes (or you simply unplug power cable...) you'll be asked to recover previous session when you'll open it again. Good feature but it will restore session cookies too.
I don't want to discuss here if it's a bug or not anyway IMO it's a moderate security bug because a user with physical access to that machine may "provoke" a crash to stole unclosed sessions with all their content (you won't be asked to login again).
Finally my question is: how a web-site can avoid this? If I'm using plain ASP.NET authentication with session cookies I do not want they survive to a browser crash (even if computer is restarted!).
There is not something similar to a process ID in the User Agent string and JavaScript variables are all restored (so I can't store a random seed, generated - for example - server side). Is there anything else viable? Session timeout will handle this but usually it's pretty long and there will be an unsafe window I would eliminate.
I didn't find anything I can use as process id to be sure Chrome has not been restarted but there is a dirty workaround: if I setup a timer (let's say with an interval of five seconds) I can check how much time elapsed from last tick. If elapsed time is too long then session has been recovered and logout performed. Roughly something like this (for each page):
var lastTickTime = new Date();
setInterval(function () {
var currentTickTime = new Date();
// Difference is arbitrary and shouldn't be too small, here I suppose
// a 5 seconds timer with a maximum delay of 10 seconds.
if ((currentTickTime - lastTickTime) / 1000 > 10) {
// Perform logout
}
lastTickTime = currentTickTime;
}, 5000);
Of course it's not a perfect solution (because a malicious attacker may handle this and/or disable JavaScript) but so far it's better than nothing.
New answers with a better solution are more than welcome.
Adriano's suggestion makes is a good idea but the implementation is flawed. We need to remember the time from before the crash so we can compare it to the time after the crash. The easiest way to do that is to use sessionStorage.
const CRASH_DETECT_THRESHOLD_IN_MILLISECONDS = 10000;
const marker = parseInt(sessionStorage.getItem('crashDetectMarker') || new Date().valueOf());
const diff = new Date().valueOf() - marker;
console.log('diff', diff)
if (diff > CRASH_DETECT_THRESHOLD_IN_MILLISECONDS) {
alert('log out');
} else {
alert ('ok');
}
setInterval(() => {
sessionStorage.setItem('crashDetectMarker', new Date().valueOf());
}, 1000)
To test, you can simulate a Chrome crash by entering chrome://crash in the location bar.
Don't forget to clear out the crashDetectMarker when the user logs out.

WordPress Write Cache Issue with Multiple Sessions

I'm working on a content dripper custom plugin in WordPress that my client asked me to build. He says he wants it to catch a page view event, and if it's the right time of day (24 hours since last post), to pull from a resource file and output another post. He needed it to also raise a flag and prevent other sessions from firing that same snippet of code. So, raise some kind of flag saying, "I'm posting that post, go away other process," and then it makes that post and releases the flag again.
However, the strangest thing is occurring when placed under load with multiple sessions hitting the site with page views. It's firing instead of one post -- it's randomly doing like 1, 2, or 3 extra posts, with each one thinking that it was the right time to post because it was 24 hours past the time of the last post. Because it's somewhat random, I'm guessing that the problem is some kind of write caching where the other sessions don't see the raised flag just yet until a couple microseconds pass.
The plugin was raising the "flag" by simply writing to the wp_options table with the update_option() API in WordPress. The other user sessions were supposed to read that value with get_option() and see the flag, and then not run that piece of code that creates the post because a given session was already doing it. Then, when done, I lower the flag and the other sessions continue as normal.
But what it's doing is letting those other sessions in.
To make this work, I was using add_action('loop_start','checkToAddContent'). The odd thing about that function though is that it's called more than once on a page, and in fact some plugins may call it. I don't know if there's a better event to hook. Even still, even if I find an event to hook that only runs once on a page view, I still have multiple sessions to contend with (different users who may view the page at the same time) and I want only one given session to trigger the content post when the post is due on the schedule.
I'm wondering if there are any WordPress plugin devs out there who could suggest another event hook to latch on to, and to figure out another way to raise a flag that all sessions would see. I mean, I could use the shared memory API in PHP, but many hosting plans have that disabled. Can't use a cookie or session var because that's only one single session. About the only thing that might work across hosting plans would be to drop a file as a flag, instead. If the file is present, then one session has the flag. If the file is not present, then other sessions can attempt to get the flag. Sure, I could use the file route, but it's kind of immature in my opinion and I was wondering if there's something in WordPress I could do.
The key may be to create a semaphore record in the database for the "drip" event.
Warning - consider the following pseudocode - I'm not looking up the functions.
When the post is queried, use a SQL statement like
$ts = get_time_now(); // or whatever the function is
$sid = session_id();
INSERT INTO table (postcategory, timestamp, sessionid)
VALUES ("$category", $ts, "$sid")
WHERE NOT EXISTS (SELECT 1 FROM table WHERE postcategory = "$category"
AND timestamp < $ts - 24 hours)
Database integrity will make this atomic so only one record can be inserted.
and the insertion will only take place if the timespan has been exceeded.
Then immediately check to see if the current session_id() and timestamp are yours. If they are, drip.
SELECT sessionid FROM table
WHERE postcategory = "$postcategory"
AND timestamp = $ts
AND sessionid = "$sid"
The problem goes like this with page requests even from the same session (same visitor), but also can occur with page requests from separate visitors. It works like this:
If you are doing content dripping, then a page request is probably what you intercept with add_action('wp','myPageRequest'). From there, if a scheduled post is due, then you create the new post.
The post takes a little bit of time to write to the database. In that time, a query on get_posts() may not see that new record yet. It may actually trigger your piece of code to create a new post when one has already been placed.
The fix is to force WordPress to flush the write cache appears to be this:
try {
$asPosts = array();
$asPosts = # wp_get_recent_posts(1);
foreach($asPosts as $asPost) {break;}
# delete_post_meta($asPost['ID'], '_thwart');
# add_post_meta($asPost['ID'], '_thwart', '' . date('Y-m-d H:i:s'));
} catch (Exception $e) {}
$asPosts = array();
$asPosts = # wp_get_recent_posts(1);
foreach($asPosts as $asPost) {break;}
$sLastPostDate = '';
# $sLastPostDate = $asPost['post_date'];
$sLastPostDate = substr($sLastPostDate, 0, strpos($sLastPostDate, ' '));
$sNow = date('Y-m-d H:i:s');
$sNow = substr($sNow, 0, strpos($sNow, ' '));
if ($sLastPostDate != $sNow) {
// No post today, so go ahead and post your new blog post.
// Place that code here.
}
The first thing we do is get the most recent post. But we don't really care if it's not the most recent post or not. All we're getting it for is to get a single Post ID, and then we add a hidden custom field (thus the underscore it begins with) called
_thwart
...as in, thwart the write cache by posting some data to the database that's not too CPU heavy.
Once that is in place, we then also use wp_get_recent_posts(1) yet again so that we can see if the most recent post is not today's date. If not, then we are clear to drip some content in. (Or, if you want to only drip in like every 72 hours, etc., you can change this a little here.)

Resources