I have an asynchronous F# function in an AWS Lambda. However, when testing it, parts of the function are never called. The function is always terminated after 10 seconds, even though the aws-lambda-tools-defaults.json file sets the timeout to 30 seconds.
Here is the code:
namespace MyProject

open FSharp.Data

type SecretsJson = JsonProvider<"./Resources/Secrets.json", RootName="Secret">

module ApiClient =

    let testGet (secrets: SecretsJson.Secret) =
        async {
            printfn "%s" "3. This is sometimes printed out and sometimes not."
            let! test = Http.AsyncRequestString("https://postman-echo.com/get?foo1=bar1&foo2=bar2")
            printfn "%s" "4. This is never printed."
            return ""
        }
namespace MyProject

open Amazon.Lambda.Core
open Amazon
open Amazon.S3
open Amazon.S3.Util
open System.IO
open Amazon.S3.Model
open Amazon.SecretsManager.Extensions.Caching

[<assembly: LambdaSerializer(typeof<Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer>)>]
()

type Function() =

    member __.FunctionHandler (input: S3EventNotification) (_: ILambdaContext) : System.Threading.Tasks.Task<unit> =
        printfn "%s" "1. This gets printed always."
        async {
            use client = new AmazonS3Client(RegionEndpoint.EUWest1)
            use secretsCache = new SecretsManagerCache()
            let! secretsString = secretsCache.GetSecretString "secretsKey" |> Async.AwaitTask
            printfn "%s" "2. Also this is always printed."
            let secrets = SecretsJson.Parse(secretsString)
            let! response = ApiClient.testGet secrets
            printfn "%s" "5. This is never printed."
        } |> Async.StartAsTask
I don't understand this behaviour. I thought that the |> Async.StartAsTask expression would ensure that the whole async block in Function.FunctionHandler is executed as a standard C# Task<T>. Or do I have to convert each of my Async<'a> functions into a Task<T>? Or is there a different error in my code that I don't see?
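For reference, let!-binding composes Async<'a> values directly, so no per-function Task conversion should be required; only the outermost boundary needs Async.StartAsTask. A minimal sketch of that composition (plain F#, no Lambda dependencies, illustrative only):

// Inner Async values are awaited with let!; only the boundary converts
// to a Task for a .NET host that expects one.
let inner : Async<int> =
    async { return 42 }

let outer () : System.Threading.Tasks.Task<int> =
    async {
        let! x = inner  // no Async.AwaitTask / StartAsTask needed here
        return x + 1
    }
    |> Async.StartAsTask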
The async code shouldn't be the source of the problem. I tried the following:
open System
open Amazon.Lambda.Core
open FSharp.Data

let loadStuff (context: ILambdaContext) (i: int) =
    async {
        let! test = Http.AsyncRequestString("https://www.google.com")
        sprintf "i:%i Length: %i" i test.Length
        |> context.Logger.LogLine
    }
and
type Functions() =

    member __.Get (request: APIGatewayProxyRequest) (context: ILambdaContext) =
        async {
            do! [1..10]
                |> Seq.map (Loader.loadStuff context)
                |> Async.Parallel
                |> Async.Ignore
            do! Async.Sleep(10000)
            do! [11..15]
                |> Seq.map (Loader.loadStuff context)
                |> Async.Parallel
                |> Async.Ignore
        }
        |> Async.StartAsTask
This takes 12 seconds and outputs:
i:6 Length: 47974
i:3 Length: 47201
i:4 Length: 47200
i:8 Length: 47255
i:5 Length: 47183
i:1 Length: 47145
i:7 Length: 47203
i:9 Length: 47177
i:10 Length: 47202
i:2 Length: 47198
i:14 Length: 47201
i:12 Length: 47155
i:11 Length: 47250
i:13 Length: 47162
i:15 Length: 47130
Consider adding try/catch blocks to your code, and use context.Logger.LogLine instead of printfn. Also, if you have very long-running pieces of work (e.g. longer than 30 seconds), consider breaking them down into smaller functions and coordinating them with something like serverless workflows.
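For instance, a minimal sketch of both suggestions combined (assuming a simple string input and an FSharp.Data dependency; this is not the asker's full handler):

open Amazon.Lambda.Core
open FSharp.Data

type Function() =

    member __.FunctionHandler (input: string) (context: ILambdaContext) =
        async {
            try
                let! body = Http.AsyncRequestString("https://postman-echo.com/get?foo1=bar1")
                // ILambdaContext.Logger writes straight to the function's log stream.
                context.Logger.LogLine(sprintf "Got %d characters" body.Length)
            with ex ->
                // Without an explicit handler, a faulted task can look like a silent hang.
                context.Logger.LogLine(sprintf "Handler failed: %O" ex)
        }
        |> Async.StartAsTask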
I have a strange problem. A process gets frozen/stuck while reading data with System.Data.SqlClient.SqlDataReader, in the GetValue() function. I am analyzing the process dump using WinDbg. I used debugger commands like !dlk, !syncblk, and !analyze -v -hang, but none of them indicates any deadlock.
The last call on the call stack is:
000000001a98e8a8 0000000076febd7a [InlinedCallFrame: 000000001a98e8a8] .SNIReadSyncOverAsync(SNI_ConnWrapper*, SNI_Packet**, Int32)
000000001a98e8a8 000007fee9e8bca1 [InlinedCallFrame: 000000001a98e8a8] .SNIReadSyncOverAsync(SNI_ConnWrapper*, SNI_Packet**, Int32)
000000001a98e880 000007fee9e8bca1 DomainBoundILStubClass.IL_STUB_PInvoke(SNI_ConnWrapper*, SNI_Packet**, Int32)
000000001a98e950 000007fee9e7254f SNINativeMethodWrapper.SNIReadSyncOverAsync(System.Runtime.InteropServices.SafeHandle, IntPtr ByRef, Int32)
000000001a98e9c0 000007fee9e7226e System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
000000001a98ea70 000007fee9e72180 System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
000000001a98eab0 000007fee9e72950 System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
000000001a98eae0 000007fee9e728d8 System.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Byte[], Int32, Int32, Int32 ByRef)
000000001a98eb50 000007feea4c0adc System.Data.SqlClient.TdsParser.TryReadSqlValue(System.Data.SqlClient.SqlBuffer, System.Data.SqlClient.SqlMetaDataPriv, Int32, System.Data.SqlClient.TdsParserStateObject)
000000001a98ec10 000007fee9e85cb5 System.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32, Boolean)
000000001a98ecb0 000007fee9e85af0 System.Data.SqlClient.SqlDataReader.TryReadColumn(Int32, Boolean, Boolean)
000000001a98ed20 000007fee9e85a0b System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32)
000000001a98ed60 000007fee9e859b7 System.Data.SqlClient.SqlDataReader.GetValue(Int32)
What are other avenues for further debugging here? Has anyone faced the same issue with SqlDataReader?
I recently found out that whenever this kind of stack trace appears, there is always a Win32 error in the dump, like this:
0:009> !gle
LastErrorValue: (Win32) 0x3e5 (997) - Overlapped I/O operation is in progress.
LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.
Does this mean it is blocked on some underlying I/O operation?
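One avenue that might be worth ruling out (my suggestion, an assumption about application code that isn't shown here): SqlCommand.CommandTimeout set to 0 means wait indefinitely, and that turns a stalled network read into exactly this kind of apparent hang rather than a timeout exception. A sketch of reading with a finite timeout:

open System.Data.SqlClient

// Illustrative only: with a finite CommandTimeout, a stalled read throws
// a timeout SqlException instead of blocking in SNIReadSyncOverAsync forever.
let readFirstColumn (connectionString: string) (sql: string) =
    use conn = new SqlConnection(connectionString)
    conn.Open()
    use cmd = new SqlCommand(sql, conn, CommandTimeout = 30)  // seconds; 0 = infinite
    use reader = cmd.ExecuteReader()
    [ while reader.Read() do yield reader.GetValue(0) ]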
I'm trying to make a recursive call to handle_call/3 before replying.
But it seems not to be possible, because a timeout exit exception is thrown.
Below you can see the code and the error.
The code:
-module(test).
-behavior(gen_server).

%% API
-export([start_link/0, init/1, handle_info/2, first_call/1, second_call/1,
         handle_call/3, terminate/2]).

-record(state, {whatever}).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init(_Args) ->
    {ok, #state{}}.

handle_info(Data, State) ->
    {noreply, State}.

%% synchronous messages
handle_call(_Request, _From, State) ->
    case _Request of
        first_call ->
            {reply, data1, State};
        second_call ->
            {reply, {data2, first_call(self())}, State}
    end.

first_call(Pid) ->
    gen_server:call(Pid, first_call).

second_call(Pid) ->
    gen_server:call(Pid, second_call).

terminate(_Reason, _State) ->
    ok.
The error:
2> {_, PID} = test:start_link().
{ok,<0.64.0>}
3> test:second_call(PID).
** exception exit: {timeout,{gen_server,call,[<0.64.0>,second_call]}}
in function gen_server:call/2 (gen_server.erl, line 204)
A possible reason for this behaviour is that a gen_server is not able to handle a recursive call until the first one is done (creating a deadlock). Is this the right reason, and if so, why?
Thanks.
Yes, that is the reason. The gen_server is a single process, and while handle_call is executing it doesn't pick up any messages or respond to any gen_server:call, including calls from itself. Therefore first_call(self()) blocks until it times out.
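One way around it, sketched below (the do_first_call helper is an illustrative name, not part of the original module): factor the shared work into a plain internal function and call that directly, so no message is ever sent to self():

%% Both clauses use an ordinary function call instead of gen_server:call,
%% so there is no recursive request and no deadlock.
handle_call(first_call, _From, State) ->
    {reply, do_first_call(State), State};
handle_call(second_call, _From, State) ->
    {reply, {data2, do_first_call(State)}, State}.

%% Internal helper returning the same data1 the original clause replied with.
do_first_call(_State) ->
    data1.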
Consider the following (based on sockserv from LYSE)
%%% The supervisor in charge of all the socket acceptors.
-module(tcpsocket_sup).
-behaviour(supervisor).

-export([start_link/0, start_socket/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, Port} = application:get_env(my_app, tcpPort),
    {ok, ListenSocket} = gen_tcp:listen(
        Port,
        [binary, {packet, 0}, {reuseaddr, true}, {active, true}]),
    lager:info(io_lib:format("Listening for TCP on port ~p", [Port])),
    spawn_link(fun empty_listeners/0),
    {ok, {{simple_one_for_one, 60, 3600},
          [{socket,
            {tcpserver, start_link, [ListenSocket]},
            temporary, 1000, worker, [tcpserver]}
          ]}}.

start_socket() ->
    supervisor:start_child(?MODULE, []).

empty_listeners() ->
    [start_socket() || _ <- lists:seq(1, 20)],
    ok.
%%%-------------------------------------------------------------------
%%% @author mylesmcdonnell
%%% @copyright (C) 2015, <COMPANY>
%%% @doc
%%%
%%% @end
%%% Created : 06. Feb 2015 07:49
%%%-------------------------------------------------------------------
-module(tcpserver).
-author("mylesmcdonnell").
-behaviour(gen_server).

-record(state, {
    next,
    socket}).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         code_change/3, terminate/2]).

-define(SOCK(Msg), {tcp, _Port, Msg}).
-define(TIME, 800).
-define(EXP, 50).

start_link(Socket) ->
    gen_server:start_link(?MODULE, Socket, []).

init(Socket) ->
    gen_server:cast(self(), accept),
    {ok, #state{socket=Socket}}.

handle_call(_E, _From, State) ->
    {noreply, State}.

handle_cast(accept, S = #state{socket=ListenSocket}) ->
    {ok, AcceptSocket} = gen_tcp:accept(ListenSocket),
    kvstore_tcpsocket_sup:start_socket(),
    receive
        {tcp, Socket, <<"store", Value/binary>>} ->
            Uid = kvstore:store(Value),
            send(Socket, Uid);
        {tcp, Socket, <<"retrieve", Key/binary>>} ->
            case kvstore:retrieve(binary_to_list(Key)) of
                [{_, Value}|_] ->
                    send(Socket, Value);
                _ ->
                    send(Socket, <<>>)
            end;
        {tcp, Socket, _} ->
            send(Socket, "INVALID_MSG")
    end,
    {noreply, S#state{socket=AcceptSocket, next=name}}.

handle_info(_, S) ->
    {noreply, S}.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

terminate(normal, _State) ->
    ok;
terminate(_Reason, _State) ->
    lager:info("terminate reason: ~p~n", [_Reason]).

send(Socket, Bin) ->
    ok = gen_tcp:send(Socket, Bin),
    ok = gen_tcp:close(Socket),
    ok.
I'm unclear on how each tcpserver process is terminated. Is this leaking processes?
I don't see any place where you terminate the owning process.
I think what you are looking for are four cases:
The client terminates the connection (you receive tcp_closed)
The connection goes wonky (you receive tcp_error)
The server receives a system message to terminate (this could, of course, just be the supervisor killing it, or a kill message)
The client sends a message telling the server it's done, and you want to do some cleanup other than just reacting to tcp_closed.
The most common case is usually the client just closes the connection, and for that you want something like:
handle_info({tcp_closed, _}, State) ->
    {stop, normal, State};
The connection getting weird is always a possibility. I can't think of any case where I'd want the owning process or the socket to stick around, so:
%% You might want to log something here. Note that the active-mode error
%% message is a 3-tuple, {tcp_error, Socket, Reason}.
handle_info({tcp_error, _Socket, _Reason}, State) ->
    {stop, normal, State};
And for any case where the client tells the server it's done and you need to do cleanup based on the client having done something successful (maybe you have resources open that should be written to first, or a pending DB transaction, or whatever), you would want to expect a success message from the client, close the connection the way your send/2 does, and return {stop, normal, State} to halt the process.
The key here is making sure you identify the cases where you want to end the connection and either have the server process killed or (better) return {stop, Reason, State}.
As written above, if you intend send/2 to be a single response and a clean exit (or really, that every accept cast should result in a single send/2 and then termination), then you want:
handle_cast(accept, S = #state{socket=ListenSocket}) ->
    {ok, AcceptSocket} = gen_tcp:accept(ListenSocket),
    kvstore_tcpsocket_sup:start_socket(),
    receive
        %% stuff that results in a call to send/2 in any case.
    end,
    {stop, normal, S}.
The case LYSE demonstrates is one where the connection is persistent and there is ongoing back-and-forth between client and server. In the case above you are handling a single request, spawning a new listener to re-fill the listener pool, and should be exiting because you have no plan for this gen_server to do any further work.
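If you did want the persistent, LYSE-style variant instead, a sketch (assuming the {active, true} option from the listen call above, so TCP traffic arrives as ordinary process messages) keeps the lifecycle entirely in handle_info rather than blocking in a receive:

%% In active mode, data, closure and errors all arrive as messages, so one
%% set of handle_info clauses covers the whole connection lifecycle.
handle_info({tcp, Socket, Data}, State) ->
    ok = gen_tcp:send(Socket, Data),  %% echo back; replace with real handling
    {noreply, State};
handle_info({tcp_closed, _Socket}, State) ->
    {stop, normal, State};
handle_info({tcp_error, _Socket, _Reason}, State) ->
    {stop, normal, State}.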
I am trying to diagnose a memory leak, and I'm getting some strange numbers; perhaps someone can point out why I'm seeing what I'm seeing.
The first step I performed was attempting to see where the leak was occurring: in managed or unmanaged space.
I profiled the process and got the following chart.
According to the various documentation on leak diagnostics, I should see either the private bytes run away while the "all heaps" counters don't (indicating an unmanaged leak), or both run away in parallel (indicating a managed one).
It appears I do have a leak (the chart shows CPU + private bytes + managed heaps).
What is puzzling me is: why do my managed heaps consume only about 30 MB between 9am and just before 5pm (while private bytes grow), and then all of a sudden jump to 3 GB consumed?
Why would this happen?
UPDATE:
MT       Count  TotalSize Class Name
654cf3d8 199671 6389472 System.Web.HttpCacheValidateHandler
719c25e8 559507 6714084 System.Object
654b82e8 95499 6875928 System.Web.HttpServerVarsCollection
05e90a24 253641 7101948 System.Web.Mvc.NameValueCollectionValueProvider+ValueProviderResultPlaceholder+<>c__DisplayClass8
654e42e4 97208 7776640 System.Web.HttpWriter
04c2a5c8 264802 8473664 Castle.MicroKernel.BurdenReleaseDelegate
04c2ab68 264813 9533268 Castle.MicroKernel.Burden
06bde0a8 507282 10145640 System.Lazy`1[[System.Web.Mvc.ValueProviderResult, System.Web.Mvc]]
6fb5348c 267697 10707880 System.Collections.Generic.HashSet`1[[System.String, mscorlib]]
654e9388 160209 11535048 System.Web.HttpHeaderCollection
654ad44c 194416 12442624 System.Web.HttpCookieCollection
6fd1abbc 170480 14202840 System.Collections.Generic.HashSet`1+Slot[[System.String, mscorlib]][]
654b2204 95203 15613292 System.Web.HttpCachePolicy
06bde010 507282 16233024 System.Func`1[[System.Web.Mvc.ValueProviderResult, System.Web.Mvc]]
719c3a6c 469961 18106904 System.Int32[]
654e87e4 97208 18275104 System.Web.Hosting.IIS7WorkerRequest
654e2590 97208 19441600 System.Web.HttpRequest
654e285c 97208 19830432 System.Web.HttpResponse
715fbc80 422170 20264160 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.Object, mscorlib]]
654e2160 97208 23329920 System.Web.HttpContext
654e9614 388836 23330160 System.Web.HttpValueCollection
719c45c8 919071 47791692 System.Collections.Hashtable
654d5220 4808083 115393992 System.Web.HttpServerVarsCollectionEntry
719bfc20 4849839 116396136 System.Collections.ArrayList
719c4584 105080 119191278 System.Byte[]
70d45bec 9064979 145039664 System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
719afe88 5391401 175028320 System.Object[]
719c5ed4 919078 237147240 System.Collections.Hashtable+bucket[]
719c2248 7055089 454532758 System.String
OK, so I've run WinDbg over the crash dump (the !dumpheap -live -stat output above), and I've found that there are a LOT of objects relating to the HTTP context still held in memory (98,000 after a typical work day, in fact).
Can anyone confirm that I shouldn't be seeing this? There are types occurring 97,208 times in the output, which means the HttpRequest, HttpResponse, etc. are being held in memory, causing a LOT of leakage.
What could be causing this? I know they're not being stored in the session: my session is set to the default timeout, and when inspecting it, it only contains 3 small string objects.
Figured it out. Running !gcroot highlighted the problem. See how there is a Castle.MicroKernel.Releasers.LifecycledComponentsReleasePolicy in the reference chain below?
Castle wasn't being told to release the controller after the request had completed. Mystery solved.
0:000> !gcroot 95963d2c
HandleTable:
00bb12f0 (pinned handle)
-> 03062490 System.Object[]
-> 021501dc Castle.Windsor.WindsorContainer
-> 02150200 Castle.MicroKernel.DefaultKernel
-> 02150304 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Castle.MicroKernel.ISubSystem, Castle.Windsor]]
-> 02150af8 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[Castle.MicroKernel.ISubSystem, Castle.Windsor]][]
-> 02150b74 Castle.Windsor.Diagnostics.DefaultDiagnosticsSubSystem
-> 02150b8c System.Collections.Generic.List`1[[Castle.Windsor.Diagnostics.IContainerDebuggerExtension, Castle.Windsor]]
-> 02150d00 System.Object[]
-> 02150d30 Castle.Windsor.Diagnostics.Extensions.ReleasePolicyTrackedObjects
-> 02150d3c Castle.Windsor.Diagnostics.TrackedComponentsDiagnostic
-> 02150e04 System.EventHandler`1[[Castle.Windsor.Diagnostics.TrackedInstancesEventArgs, Castle.Windsor]]
-> 02150d54 Castle.MicroKernel.Releasers.LifecycledComponentsReleasePolicy
-> 02150d84 System.Collections.Generic.Dictionary`2[[System.Object, mscorlib],[Castle.MicroKernel.Burden, Castle.Windsor]]
-> 038da530 System.Collections.Generic.Dictionary`2+Entry[[System.Object, mscorlib],[Castle.MicroKernel.Burden, Castle.Windsor]][]
-> 9596f3a4 WebController
-> 9596f9cc System.Web.Mvc.ControllerContext
-> 95965b5c System.Web.HttpContextWrapper
-> 95964078 System.Web.HttpContext
-> 95963d2c System.Web.Hosting.IIS7WorkerRequest
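For anyone hitting the same root chain: the usual fix is to hand controllers back to Windsor when the request ends. A minimal sketch in F# (the factory name and registration call are assumptions, not the asker's actual code):

open System.Web.Mvc
open Castle.Windsor

// Sketch of a Windsor-backed controller factory. The essential part is
// ReleaseController calling container.Release; without it,
// LifecycledComponentsReleasePolicy keeps every resolved controller (and
// its HttpContext object graph) rooted, exactly as the !gcroot output shows.
type WindsorControllerFactory(container: IWindsorContainer) =
    inherit DefaultControllerFactory()

    override __.GetControllerInstance(_requestContext, controllerType) =
        container.Resolve(controllerType) :?> IController

    override __.ReleaseController(controller) =
        container.Release(box controller)

Registering it once at startup, e.g. ControllerBuilder.Current.SetControllerFactory(WindsorControllerFactory(container)), routes every controller's lifetime through Windsor so the burdens are released per request.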