Repeating email notifications in Bosun - bosun

In the notification section of my Bosun configuration, I set the timeout to 5m, i.e. 5 minutes.
I am receiving emails at intervals of either 5 or 10 minutes.
I have not been able to work out why this is happening.
Please help.
notification default {
email =jon#rohit.com
print = true
next = default
timeout = 5m
}
template tags {
subject =`Testing Emails Sample`
body = `<p><strong>Tags</strong>
<table>
{{range $k, $v := .Group}}
{{if eq $k "api"}}
<tr><td>{{$k}} : {{$v}}</td></tr>
{{end}}
{{end}}
</table>`
}
alert testSampleAlert5 {
template = tags
$notes = This alert monitors the percentage of 5XX on arm APIs
crit = 1
warn = 0
warnNotification = default
critNotification = default
}
alert testSampleAlert4 {
template = tags
$notes = This alert monitors the percentage of 5XX on arm APIs
crit = 0
warn = 1
warnNotification = default
critNotification = default
}

What you are encountering is bosun's feature of "chained notifications". The next and timeout parameters specify that the default notification will be triggered again after 5 minutes as you have configured it. Since it references itself, it will trigger every 5 minutes until the alert is acknowledged or closed.
You have a few options if this is not what you want.
Acknowledge the alert on the dashboard. This will stop all notification chains from repeating.
Remove the next parameter if you do not want to be re-notified every 5 minutes, or increase the timeout if you only want to be re-notified less often.
What is your desired behaviour?
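To make the options above concrete, here is a rough sketch of the timing in TypeScript. It is a simplified model and not Bosun's actual implementation: the function name `repeatTimes` and its parameters are made up for illustration, and it ignores the check frequency and any scheduling delay, so real emails may arrive somewhat later than the minutes it prints.

```typescript
// Simplified model (not Bosun's code): minutes after the alert fires at which
// a notification whose `next` points at itself would be sent.
function repeatTimes(
  timeoutMin: number,      // value of `timeout`, in minutes
  hasNext: boolean,        // whether `next = <itself>` is set
  ackAtMin: number | null, // minute the alert is acknowledged; null = never
  horizonMin: number       // how far ahead to simulate, in minutes
): number[] {
  if (timeoutMin <= 0) throw new Error("timeoutMin must be positive");
  const times: number[] = [];
  for (let t = 0; t <= horizonMin; t += timeoutMin) {
    if (ackAtMin !== null && t >= ackAtMin) break; // acknowledged: chain stops
    times.push(t);
    if (!hasNext) break; // no `next`: only the first notification is sent
  }
  return times;
}

console.log(repeatTimes(5, true, null, 20));   // config from the question: [0, 5, 10, 15, 20]
console.log(repeatTimes(5, true, 12, 20));     // acknowledged at minute 12: [0, 5, 10]
console.log(repeatTimes(5, false, null, 20));  // `next` removed: [0]
console.log(repeatTimes(60, true, null, 120)); // timeout raised to 1h: [0, 60, 120]
```

In this model the configuration from the question keeps firing every 5 minutes until someone acknowledges the alert, which matches the behaviour described.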


How to use SignalR to notify app users that the system is being deployed right now

Is there a way to notify the system's users in real time that the system is being deployed (published to production)? The purpose is to prevent them from starting atomic operations.
The system is ASP.NET-based and already has the SignalR DLLs, but I do not know how to find the "source" in the application that tells me the system is deploying right now.
This is highly dependent on your deployment process, but I achieved something similar in the following way:
I created a method in one of my controllers called AnnounceUpdate:
[HttpPost("announce-update")]
public async Task<IActionResult> AnnounceUpdate([FromQuery] int secondsUntilUpdate, string updateToken)
{
await _tenantService.AnnounceUpdate(secondsUntilUpdate, updateToken);
return Ok();
}
The controller method takes the number of seconds until the update, as well as a secret token to ensure that not just anyone can call this endpoint.
The idea is that we call this endpoint just before we deploy, to announce the pending deployment. I deploy using Azure DevOps, so I was able to create a release task that automatically runs the following PowerShell code to call my endpoint:
$domain = $env:LOCALURL;
$updateToken = $env:UPDATETOKEN;
$minutesTillUpdate = 5;
$secondsUntilUpdate = $minutesTillUpdate * 60;
$len = $secondsUntilUpdate / 10;
#notify users every 10 seconds about update
for($num =1; $num -le $len; $num++)
{
$url = "$domain/api/v1/Tenant/announce-update?secondsUntilUpdate=$secondsUntilUpdate&updateToken=$updateToken";
$r = Invoke-WebRequest $url -Method Post -UseBasicParsing;
$minsLeft = [math]::Floor($secondsUntilUpdate/60);
$secsLeft = $secondsUntilUpdate - $minsLeft * 60;
$timeLeft;
if($minsLeft -eq 0){
$timeLeft = "$secsLeft seconds";
}else{
if($secsLeft -eq 0){
$timeLeft = "$minsLeft minute(s)";
}else{
$timeLeft = "$minsLeft minute(s) $secsLeft seconds";
}
};
$code = $r.StatusCode;
Write-Output "";
Write-Output "Notified users $num/$len times.";
Write-Output "Response: $code.";
Write-Output "$timeLeft remaining."
Write-Output "_________________________________"
Start-Sleep -Seconds 10;
$secondsUntilUpdate = $secondsUntilUpdate - 10;
}
Write-Output "Allowing users to log out.";
Write-Output "";
Start-Sleep -Seconds 1;
Write-Output "Users notified! Proceeding with update.";
As you can see, the script sets the time until the update to 5 minutes and then calls my AnnounceUpdate endpoint every 10 seconds for those 5 minutes. I do this because if I announced the update only once, someone who connects 2 minutes later would never see the message. On the client side I set a variable called updatePending to true when the client receives the update notification, so that a client does not show a new message every 10 seconds. Only clients that have not yet seen the update message will show it.
In the tenant service I then have this code:
public async Task AnnounceUpdate(int secondsUntilUpdate, string updateToken)
{
if (updateToken != _apiSettings.UpdateToken) throw new ApiException("Invalid update token");
await _realTimeHubWrapper.AnnouncePendingUpdate(secondsUntilUpdate);
}
I simply check that the token is valid and then go on to call my hub wrapper.
The hub wrapper wraps SignalR's hub context, which allows us to invoke SignalR methods from within our own code. More info can be read here
In the HUB wrapper, I have the following method:
public Task AnnouncePendingUpdate(int secondsUntilUpdate) =>
_hubContext.Clients.All.SendAsync("UpdatePending", secondsUntilUpdate);
On the client side I have set up this handler:
// When an update is on the way, clients will be notified every 10 seconds.
private listenForUpdateAnnouncements() {
this.hubConnection.on(
// The event name must match the one the server passes to SendAsync.
'UpdatePending', (secondsUntilUpdate: number) => {
if (!this.updatePending) {
const updateTime = currentTimeString(true, secondsUntilUpdate);
const msToUpdate = secondsUntilUpdate * 1000;
const message =
secondsUntilUpdate < 60
? `The LMS will update in ${secondsUntilUpdate} seconds.
\n\nPlease save your work and close this page to avoid any loss of data.`
: `The LMS is ready for an update.
\n\nThe update will start at ${updateTime}.
\n\nPlease save your work and close this page to avoid any loss of data.`;
this.toastService.showWarning(message, msToUpdate);
this.updatePending = true;
setTimeout(() => {
this.authService.logout(true, null, true);
this.stopConnection();
}, msToUpdate);
}
}
);
}
I show a toast message to the client, notifying them of the update. I then set a timeout (using the value of secondsUntilUpdate) which logs the user out and stops the connection. This was specific to my use case; you can do whatever you want at this point.
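The handler above calls a currentTimeString helper that is not shown in the answer. A minimal sketch of what such a helper could look like follows. The parameter meanings (a flag for including seconds, an offset in seconds) are guesses based on how it is called, and the optional `now` parameter is added here only so the function can be tried with a fixed time.

```typescript
// Hypothetical stand-in for the `currentTimeString` helper used in the handler.
// The original is not shown, so the parameter meanings are assumptions.
function currentTimeString(
  includeSeconds: boolean, // assumed: whether to append ":SS"
  offsetSeconds = 0,       // assumed: seconds to add to the current time
  now: Date = new Date()   // added for this sketch so the result is reproducible
): string {
  const target = new Date(now.getTime() + offsetSeconds * 1000);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  const hhmm = `${pad(target.getHours())}:${pad(target.getMinutes())}`;
  return includeSeconds ? `${hhmm}:${pad(target.getSeconds())}` : hhmm;
}

// Local time 11:58:30 plus 300 seconds.
const fixedNow = new Date(2024, 0, 1, 11, 58, 30);
console.log(currentTimeString(true, 300, fixedNow));  // "12:03:30"
console.log(currentTimeString(false, 300, fixedNow)); // "12:03"
```

The result is formatted in the browser's local time zone, which is probably what you want for a message shown to the user.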
To sum it up, the logical flow is:
PowerShell Script -> Controller -> Service -> Hub Wrapper -> Client
The main takeaway is that something still has to trigger the call to the endpoint that announces the update. I am lucky enough to have it run automatically during my release process. If you publish manually and copy the published code over, perhaps you can run the PowerShell script by hand and then deploy when it has finished?

Blazor Connection Disconnected

I have a web app with Client running on Blazor Server. I have set a custom Blazor reconnect modal (<div id="components-reconnect-modal"...) as the documentation says here - Microsoft Docs
Also I have these settings for SignalR and Blazor circuits:
services.AddServerSideBlazor
(options =>
{
options.DisconnectedCircuitMaxRetained = 100;
options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromMinutes(5);
options.JSInteropDefaultCallTimeout = TimeSpan.FromMinutes(1);
options.MaxBufferedUnacknowledgedRenderBatches = 10;
})
.AddHubOptions(options =>
{
options.ClientTimeoutInterval = TimeSpan.FromSeconds(30);
options.EnableDetailedErrors = false;
options.HandshakeTimeout = TimeSpan.FromSeconds(15);
options.KeepAliveInterval = TimeSpan.FromSeconds(15);
options.MaximumReceiveMessageSize = 32 * 1024;
options.StreamBufferCapacity = 10;
});
But I have an annoying problem: whenever the app is open in a browser tab and left idle, it disconnects. It happens very inconsistently, and I can't find the configuration for these time periods, which I need to increase. Example:
Initially loaded the app at 11:34:24 AM
Leave it like that for a while
In the Console: "Information: Connection disconnected." at 11:55:48AM and my reconnect-modal appears.
How can I extend the lifetime of the connection so that it is always longer than my session timeout? I checked the Private Memory Limit of my app pool, but it is unlimited. The time to disconnect varies a lot with the same steps to reproduce. Test 1 - 16 mins 20 sec; Test 2 - 21 mins 58 sec; Test 3 - 34 mins 56 sec... and then after iisreset... Test 4 - 6 mins 28 sec
Please help.
Apparently this is by design. Once the client has been idle for some time, it essentially stops sending pings to the server. The server then only knows that the client has disconnected and drops its state so the resources can be reused. You can change the retention times and the number of "inactive" connections, as you have done in the CircuitOptions, but at some point an idle client will be disconnected.
Have a look at:
https://learn.microsoft.com/en-us/aspnet/core/blazor/state-management?view=aspnetcore-3.1&pivots=server
I think you can work around this error by using this code in _Host.cshtml:
<script>
Blazor.start().then(() => {
Blazor.defaultReconnectionHandler._reconnectionDisplay = {
show: () => {},
update: (d) => {},
rejected: (d) => document.location.reload()
};
});
</script>
Please read also about Modify the reconnection handler (Blazor Server)
See also this answer
Are you hosting on IIS?
Is the IIS app pool's idle time-out still at its default of 20 minutes?
You could try setting it to 0.

MOAR process ballooning while running Perl6 socket server

I have a socket server using IO::Socket::Async, with Redis::Async for message publishing. Whenever the server receives a message, the script translates it and generates an acknowledgement that is sent back to the sender, so that the sender will send subsequent messages. Since translating the message is quite expensive, the script runs that portion under 'start'.
However, I noticed that the Moar process eats up my RAM while the script is running. Any thoughts on where I should look to solve this issue? Thanks!
https://pastebin.com/ySsQsMFH
use v6;
use Data::Dump;
use experimental :pack;
use JSON::Tiny;
use Redis::Async;
constant $SOCKET_PORT = 7000;
constant $SOCKET_ADDR = '0.0.0.0';
constant $REDIS_PORT = 6379;
constant $REDIS_ADDR = '127.0.0.1';
constant $REDIS_AUTH = 'xxxxxxxx';
constant $IDLING_PERIOD_MIN = 180 - 2; # 3 minutes - 2 secs
constant $CACHE_EXPIRE_IN = 86400; # 24 hours
# create socket
my $socket = IO::Socket::Async.listen($SOCKET_ADDR, $SOCKET_PORT);
# connect to Redis ...
my $redis;
try {
my $error-code = "110";
$redis = Redis::Async.new("$REDIS_ADDR:$REDIS_PORT"); # use the Redis address, not the socket server's
$redis.auth($REDIS_AUTH);
CATCH {
default {
say "Error $error-code ", .^name, ': Failed to initiate connection to Redis';
exit;
}
}
}
# react whenever there is connection
react {
whenever $socket -> $conn {
# do something when the connection wants to talk
whenever $conn.Supply(:bin) {
# only process if data length is either 108 or 116
if $_.decode('utf-8').chars == 108 or $_.decode('utf-8').chars == 116 {
say "Received --> "~$_.decode('utf-8');
my $ack = generateAck($_.decode('utf-8')); # generate ack based on received data
if $ack {
$conn.print: $ack;
}else{
say "No ack. Received data maybe corrupted. Closing connection";
$conn.close;
}
}
}
}
CATCH {
default {
say .^name, ': ', .Str;
say "handled in $?LINE";
}
}
}
### other subroutines down here ###
The issue was caused by the Redis::Async module. Jonathon Stowe has since fixed the Redis module, so I'm now using the Redis module with no issues.

How to properly connect to FCM?

I wrote the code as it appears on the Firebase tutorial page, and on app start it seems to connect without any problem. I get the message:
2016-10-15 19:27:16.945596 engTrain iOS[910:444452] Connected to FCM.
and I receive both messages when the app is in the foreground and notifications when the app is in the background.
But when I test an in-app purchase (I'm still in debug), I get
2016-10-15 19:36:38.356075 engTrain iOS[910:444452] Unable to connect to FCM. Error Domain=com.google.fcm Code=2001 "(null)"
The analytics debug is logging the following data about an _iap event but I can't see any event with this name in my console.
**2016-10-15 19:38:58.825 engTrain iOS[910:] <FIRAnalytics/DEBUG> Event is not subject to real-time event count daily limit. Marking an event as real-time. Event name, parameters: _iap, {
"_c" = 1;
"_dbg" = 1;
"_o" = auto;
"_r" = 1;
currency = JPY;
price = 600000000;
"product_id" = "pro_user";
"product_name" = Pro;
quantity = 1;
value = 600000000;
}**
Has anybody experienced something similar? Is this normal in debug, or should I fix something before production?
Thank you.

Not getting alerts but acknowledgement of the alerts on dashboard work

I have been trying to get Bosun to work with little success. Here are my problems:
1) I am able to see alerts appearing in my dashboard, but the alerts never come through to the notification mode of my choice, be it e-mail, Slack or JSON.
2) When I acknowledge the alerts on the dashboard, only one notification from the notification chain (the first one) is received. I.e. if I set up {email -> slack -> json}, only the e-mail notification is received, not Slack or JSON.
Any help will be appreciated. Below is my dev.conf
-------------- dev.conf ---------------
tsdbHost = qa1-sjc005-031:4242
emailFrom = bosun-alert#noreply.com
smtpHost = stmp.somedomain.com:25
checkFrequency = 1m
httpListen = :8070
# Post to an endpoint
notification json {
post = http://somedomain.com/HealthCheck/bosunAlert
body = {"text": {{.|json}}}
contentType = application/json
print = true
next = json
timeout = 5m
}
# Post to a Slack channel via Incoming Webhooks integration
notification slack {
post = https://hooks.slack.com/services/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
body = payload={"channel": "#testing", "username": "webhookbot", "text" : "This is a test!"}
print = true
next = json
timeout = 5m
}
# Send out e-mail notification
notification email {
email = username#somedomain.com
print = true
next = slack
timeout = 5m
}
template test {
subject = {{.Last.Status}}: {{.Alert.Name}} on {{.Group.measurement}} for {{.Group.pod}}
body = `<p>Name: {{.Alert.Name}}
<p>Tags:
<table>
{{range $k, $v := .Group}}
<tr><td>{{$k}}</td><td>{{$v}}</td></tr>
{{end}}
</table>`
}
alert test {
template = test
crit = avg(q("avg:mq1{measurement=*,pod=pod3}", "1h", ""))
warn = avg(q("avg:mq1{measurement=*,pod=pod3}", "30m", ""))
critNotification = email
warnNotification = email
}
Acknowledgements stop notification chains. Their purpose is really the escalation of incidents; escalation stops once an incident has been acknowledged. As per the documentation on notifications:
notification
A notification is a chained action to perform. The chaining continues until the chain ends or the alert is acknowledged.
It seems like you might want to send multiple notifications at the same time. To do this, list them in warnNotification and/or critNotification as appropriate:
critNotification: comma-separated list of notifications to trigger on critical.
Both quotes are from the configuration documentation (the documentation isn't that well organized currently, so I don't blame you for missing either of these).
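The difference between a chain and a comma-separated list can be sketched with a small TypeScript model. This is only an illustration of the behaviour quoted above, not Bosun's implementation: the `schedule` function and its types are invented for the example, a missing timeout is treated as zero, and check frequency is ignored.

```typescript
// Simplified model (not Bosun's code) of chained vs. listed notifications.
interface Notification { next?: string; timeoutMin?: number }
interface Sent { atMin: number; name: string }

function schedule(
  startList: string,                           // e.g. "email" or "email,slack,json"
  notifications: Record<string, Notification>, // notification name -> settings
  ackAtMin: number | null,                     // minute of acknowledgement; null = never
  horizonMin: number                           // how far ahead to simulate
): Sent[] {
  const sent: Sent[] = [];
  for (const start of startList.split(",").map((s) => s.trim())) {
    let name: string | undefined = start;
    let t = 0;
    // Bounded loop so a zero-timeout self-reference cannot spin forever.
    for (let i = 0; i < 1000 && name !== undefined && t <= horizonMin; i++) {
      if (ackAtMin !== null && t >= ackAtMin) break; // acknowledged: stop escalating
      const n = notifications[name];
      if (n === undefined) throw new Error(`unknown notification: ${name}`);
      sent.push({ atMin: t, name });
      t += n.timeoutMin ?? 0;
      name = n.next;
    }
  }
  return sent.sort((a, b) => a.atMin - b.atMin);
}

// Chain from the question: email -> slack -> json -> json, 5 minutes apart.
const chained: Record<string, Notification> = {
  email: { next: "slack", timeoutMin: 5 },
  slack: { next: "json", timeoutMin: 5 },
  json: { next: "json", timeoutMin: 5 },
};
console.log(schedule("email", chained, null, 15)); // email@0, slack@5, json@10, json@15
console.log(schedule("email", chained, 3, 15));    // acknowledged at minute 3: only email@0

// Same three notifications without `next`, listed together on the alert.
const flat: Record<string, Notification> = { email: {}, slack: {}, json: {} };
console.log(schedule("email,slack,json", flat, null, 15)); // all three at minute 0
```

In the model, acknowledging within the first 5 minutes means only the e-mail goes out, which is what the question describes, while listing the three notifications together sends all of them immediately.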
