HonoJs: Best way to start a Twitter SDK connection in Hono with CloudFlare? - cloudflare-workers

In an old-school server environment, you initialize an SDK (like the Twitter SDK) when the server starts up, using dotenv to read secrets and tokens from your .env file like so:
import dotenv from 'dotenv';
import { Client } from 'twitter-api-sdk';

// dotenv loads secrets from .env into process.env
dotenv.config();
const twitterClient = new Client(process.env.TWITTER_SECRET_INFO);
And then you would use the twitterClient object to get data in one of the route handlers.
What's the best practice for initializing something like the twitter client in Hono with Cloudflare?
In the old service worker framework, I could have treated the secret info as a global environment variable much like in Node/Express, but in the new module worker code you have to access the environment variables as a parameter passed to a function call. It looks like Hono manages this by passing contexts to methods like .use/.get/.post.
Ideally, though, I wouldn't reinitialize the twitter connection on every request, especially since I'm just getting public info with a token, not dealing with any user login/password info.
Is there any way to do this in Hono/Cloudflare, or do I have to initialize the Twitter client middleware on each request? I looked at the Hono class constructor, but from what I can tell, all it does is take a router config object.
And from what I can tell of the Cloudflare docs, module workers have the same issue. Whereas constants in a service worker were declared outside the route handler, it looks like everything in a module worker is declared inside a fetch handler. Is there any way to initialize once during the life of the worker and not for each request?

In principle you could initialize the client on the first request:
let twitterClient = null;

export default {
  async fetch(req, env, ctx) {
    // Lazily construct the client on the first request this instance handles.
    if (!twitterClient) {
      twitterClient = new Client(env.TWITTER_SECRET_INFO);
    }
    // ... normal code ...
  }
}
That said, though, is creating a new client actually expensive?
Constructing the client does not "initialize a connection". The client presumably makes requests by calling fetch(). The fetch() API doesn't expose any way to control the underlying connections used; each fetch() operates effectively independently. But, the Workers Runtime will automatically reuse connections behind the scenes, when possible. It could even reuse the same connection for two completely unrelated Workers, if they are contacting the same destination host. So it may be that even creating a new client with every request, you're already getting good connection reuse.
That said, perhaps the client has to do some sort of key exchange upfront, e.g. exchanging a long-lived refresh token for an access token. That is annoying to have to repeat on every request. So in that sense, maybe caching it in a global helps.
However, note that Workers creates LOTS of instances of your Worker around the world. You may find if you curl your Worker several times in a row, each request lands on a different instance. You may find that caching in global state does not actually have much impact unless you have a large amount of traffic.
Caching may be more effective if you use the Cache API to store cached values into the colo-wide cache. Unfortunately, client libraries designed for Node environments may not provide the right hooks to do this.
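For illustration, here is a minimal sketch of that Cache API approach which bypasses the SDK and calls the public API directly with fetch; the endpoint, the 60-second lifetime, and the TWITTER_BEARER_TOKEN binding name are assumptions, not anything from the original post:
async function fetchWithColoCache(env: { TWITTER_BEARER_TOKEN: string }): Promise<Response> {
  // Illustrative endpoint; any public, cache-safe GET works the same way.
  const cacheKey = new Request('https://api.twitter.com/2/tweets/search/recent?query=cloudflare');
  const cache = caches.default;

  // Serve from the colo-wide cache when a copy is already there.
  const cached = await cache.match(cacheKey);
  if (cached) return cached;

  // Otherwise call the origin, then store a cacheable copy for later requests.
  const fresh = await fetch(cacheKey, {
    headers: { Authorization: `Bearer ${env.TWITTER_BEARER_TOKEN}` },
  });
  const response = new Response(fresh.body, fresh);
  response.headers.set('Cache-Control', 'max-age=60');
  await cache.put(cacheKey, response.clone());
  return response;
}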
One final note: Note that putting live resources (things that are not just plain data structures) into the global scope can be dangerous on Workers, because in general a Promise created on behalf of one incoming request cannot be awaited in the context of some other request. So if that twitter client does do some sort of upfront key exchange and tries to have all requests wait for that to complete, you may find that if you receive multiple requests at once before the initial key exchange finishes, all except the first request end up failing. To be honest, I would recommend creating a new client for every request unless you see a measurable performance problem from this.
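In Hono, that per-request approach could look something like the sketch below; the TWITTER_BEARER_TOKEN binding name and the route are assumptions, and it presumes twitter-api-sdk runs on the Workers fetch-based runtime:
import { Hono } from 'hono';
import { Client } from 'twitter-api-sdk';

// Binding name is an assumption; configure it as a Worker secret in your own project.
type Bindings = { TWITTER_BEARER_TOKEN: string };

const app = new Hono<{ Bindings: Bindings }>();

app.get('/tweets/:id', async (c) => {
  // A new client per request: construction is cheap, and no live resource
  // is shared across requests through global scope.
  const twitterClient = new Client(c.env.TWITTER_BEARER_TOKEN);
  const tweet = await twitterClient.tweets.findTweetById(c.req.param('id'));
  return c.json(tweet);
});

export default app;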

Related

How to access dependency injection container in Symfony 4 without actual injection?

I've got a project written in Symfony 4 (can update to the latest version if needed). In it I have a situation similar to this:
There is a controller which sends requests to an external system. It goes through records in the DB and sends a request for every row. To do that, there is a MagicApiConnector class which connects to the external system, and for every request there is an XxxRequest class (like FooRequest, BarRequest, etc.).
So, something like this in general:
foreach ($allRows as $row) {
    $request = new FooRequest($row['a'], $row['b']);
    $connector->send($request);
}
Now in order to do all the parameter filling magic, the requests need to access a service which is defined in Symfony's DI. The controller itself neither knows nor cares about this service, but the requests need it.
How can my request classes access this service? I don't want to set it as a dependency of the controller - I could, but it kinda seems awkward, as the controller really doesn't care about it and would only pass it through. It's an implementation detail of the request, and I feel like it shouldn't burden the users of the request with this boilerplate requirement.
Then again, sometimes you need to make a sacrifice in the name of the greater good, so perhaps this is one of those cases? It feels like I'm "going against the grain" and haven't grasped some ideological concept.
Added: OK, the full gory details, no simplification.
This all is happening in the context of two homebrew systems. Let's call them OldApp and NewApp. Both are APIs and NewApp is calling into the OldApp. The APIs are simple REST/JSON style. OldApp is not built on Symfony (mostly even doesn't use a framework), the NewApp is. My question is about NewApp.
The authentication for the OldApp APIs comes in three different flavors and might gain more in the future if needed (it's not yet dead!). Different API calls use different authentication methods; sometimes even the same API call can be used with different methods (depending on who is calling it). All these authentication methods are also homebrew. One uses POST fields, another uses custom HTTP headers; I don't remember about the third.
Now, NewApp is being called by an Android app which is distributed to many users. Android app actually uses both NewApp and OldApp. When it calls NewApp it passes along extra HTTP headers with authentication data for OldApp (method 1). Thus NewApp can impersonate the Android app user for OldApp. In addition, NewApp also needs to use a special command of OldApp that users themselves cannot call (a question of privilege). Therefore it uses a different authentication mechanism (method 2) for that command. The parameters for that command are stored in local configuration (environment variables).
Before me, a colleague had created the scheme of an APIConnector and an APICommand, where you get the connector as a dependency and create command instances as needed. The connector actually performs the HTTP request; the commands tell it what POST fields and what headers to send. I wish to keep this scheme.
But now how do the different authentication mechanisms fit into this? Each command should be able to pass what it needs to the connector; and the mechanisms should be reusable for multiple commands. But one needs access to the incoming request, the other needs access to configuration parameters. And neither is instantiated through DI. How to do this elegantly?
This sounds like a job for factories.
function action(MyRequestFactory $requestFactory)
{
    foreach ($allRows as $row) {
        $request = $requestFactory->createFoo($row['a'], $row['b']);
        $connector->send($request);
    }
}
The factory itself is registered as a service and injected into the controller as part of the normal Symfony design. Whatever additional services are needed will be injected into the factory. The factory in turn can provide whatever services the individual requests happen to need as it creates them.
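The same shape, sketched in TypeScript rather than PHP just to show the wiring (all names here are hypothetical, not Symfony's API): the controller depends only on the factory, and the factory carries the injected service to every request object it builds.
// Hypothetical names; a language-agnostic sketch of the injected-factory pattern.
interface ParameterFiller {
  fill(values: Record<string, string>): Record<string, string>;
}

class FooRequest {
  constructor(private filler: ParameterFiller, private a: string, private b: string) {}
  // The request uses the service internally; its creator never sees it.
  toPayload(): Record<string, string> {
    return this.filler.fill({ a: this.a, b: this.b });
  }
}

class RequestFactory {
  // The DI container injects the service here, once.
  constructor(private filler: ParameterFiller) {}
  createFoo(a: string, b: string): FooRequest {
    return new FooRequest(this.filler, a, b);
  }
}

// The controller action only knows about the factory and the connector.
function action(
  factory: RequestFactory,
  connector: { send(request: FooRequest): void },
  allRows: Array<{ a: string; b: string }>,
): void {
  for (const row of allRows) {
    connector.send(factory.createFoo(row.a, row.b));
  }
}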

About OWIN pipeline

I have a simple question regarding the OWIN pipeline. I pretty much understand the whole concept of this specification, but there is something that I haven't totally digested.
According to several online posts, there is the OWIN pipeline, which consists of several developer-defined modules (or middleware components) and which is constructed by the OWIN host. Then there is the server, which listens for requests and passes them through the pipeline of OWIN components.
The point that I don't totally understand is why we need to have a pipeline. So for example, let's imagine that in the Startup class we have something like:
public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        app.Use<CustomMiddleware>(new CustomComponent());

        var config = new HubConfiguration { EnableCrossDomain = true };
        app.MapHubs(config);

        string exeFolder = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
        string webFolder = Path.Combine(exeFolder, "Web");
        app.UseStaticFiles(webFolder);
    }
}
In the above example we ask the OWIN host to construct a pipeline of three OWIN middleware components. From what I understand, the server will forward the request (probably wrapped in a dictionary) to the first component in that pipeline, which in turn will do some task and pass it over to the next component, and so forth.
I wonder why we would need to get all the components involved in each request. For example, if we ask for a static HTML page only, why not bother only the component that deals with static files? I mean, why does such a request need the participation of the Web API, for example?
I think I've cleared that up. It turns out that the request doesn't have to move through the whole pipeline. It is the responsibility of each component in the pipeline to decide whether it can deal with the request or whether it wants to forward it to the next node.
You are correct in your answer that middlewares may opt out of handling the request. Many Katana middleware implementations favor a "soft 404" approach in which the middleware will interpret a 404 to mean "try the next middleware". The first middleware to complete the Task will halt further propagation and complete the response message. You can optimize the path by inserting middlewares in the most likely order for performance or common use.
The middleware components are executed in order, and it usually makes sense to run through all the components in the pipeline, until one of them decides to cut the pipeline short by not calling the next one.
This is possible because each middleware component has a reference to the next one, and to the environment: a dictionary of objects keyed by strings, which gives access to everything needed to check the request, create the response, and do many other things. (Using a dictionary is clever, because it's easy to add any kind of information, or executable code, that isn't known beforehand.)
Each component can do the following things, but all of them are optional:
execute code before calling the next component (usually checking and modifying the environment dictionary)
call the next middleware component of the pipeline (which will do the same)
execute some code after the next component finishes executing
throw an exception (intentionally or because of an unhandled error)
The pipeline execution will finish in one of these cases:
a middleware component doesn't call the next one
a middleware component throws an exception
NOTE: each "Middleware component" is an implementation of the Application Delegate
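To make the chaining concrete, here is a small TypeScript sketch of the same idea (not OWIN's actual API; the key names and components are illustrative): each component receives the environment dictionary and a reference to the next delegate, and decides whether to call it.
// Illustrative sketch of the pipeline idea; names mimic OWIN keys but this is not its API.
type Env = Map<string, unknown>;
type AppFunc = (env: Env) => Promise<void>;
type Middleware = (next: AppFunc) => AppFunc;

const authentication: Middleware = (next) => async (env) => {
  env.set('server.User', 'alice'); // code run before calling the next component
  await next(env);                 // call the next component in the pipeline
  // code here runs after the rest of the pipeline has finished (the unwind phase)
};

const staticFiles: Middleware = (next) => async (env) => {
  const path = env.get('owin.RequestPath') as string | undefined;
  if (path?.startsWith('/static/')) {
    env.set('owin.ResponseBody', 'file contents'); // short-circuit: next is never called
    return;
  }
  await next(env);                 // not a file: hand the request to the next component
};

// Compose the registered components, in order, into a single application delegate.
function build(middlewares: Middleware[]): AppFunc {
  const terminal: AppFunc = async (env) => {
    env.set('owin.ResponseStatusCode', 404); // nothing handled the request
  };
  return middlewares.reduceRight((next, mw) => mw(next), terminal);
}

const pipeline = build([authentication, staticFiles]);
// pipeline(new Map([['owin.RequestPath', '/static/site.css']])) would be answered by staticFiles.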
Having a bunch of components makes sense if they are registered in the right order. A pipeline could look something like this:
authentication
authorization
caching
execution of some kind of API or HTML generation
Each component has enough information to execute the code it's responsible for. For example:
authentication will always be executed, and will set the principal's information in the dictionary. Then it will call the next one: the authorization component
depending on the request, and on the result of the authentication, the authorization component will decide whether the principal has permission to execute the request, and either call the next component (caching) or reject the request
the caching component can likewise decide whether it can return cached results, or has to call the next component
the last component will execute, will probably create the response, and will not call any other component. The call stack then unwinds in reverse order, giving each component the chance to do some extra work. For example, the caching component could cache the response, and the ones before it (authorization and authentication) will simply return without executing any extra code
As each component has the request, the response, the context, and any other desired information, they all have enough information to decide whether they need to do something, modify the dictionary, call the next one, or return.
So, as you can see, registering a lot of middleware components doesn't require all of them to be executed for each request, but it sometimes makes sense.
OTOH, executing a middleware component can be really cheap. For example, authentication could simply execute the next component if the request doesn't require authorization. The same goes for the caching component, which could simply call the next component if caching is not required.
So even if a component is in the pipeline, it can have a cheap execution, or it can even not run at all, depending on the request.
In your particular question about static files and Web API, one of the components will be registered before the other. The first one, whichever it is, will execute and either create the response or simply call the next one, depending on the request. E.g. if the static files component is registered before the other one and a file is requested, it will create the response and will not call the Web API. If the request isn't for a file, it will do nothing but call the next component: the Web API component.

node.js asynchronous initialization issue

I am creating a node.js module which communicates with a program through XML-RPC. The API for this program changed recently after a certain version. For this reason, when a client is created (createClient) I want to ask the program its version (through XML-RPC) and base my API definitions on that.
The problem with this is that, because I do the above asynchronously, there exists a possibility that the work has not finished before the client is actually used. In other words:
var client = program.createClient();
client.doSomething();
doSomething() will fail because the API definitions have not been set yet, I imagine because the HTTP XML-RPC response has not returned from the program.
What are some ways to remedy this? I want to be able to have a variable named client and work with that, as later I will be calling methods on it to get information (which will be returned via a callback).
Set it up this way:
program.createClient(function (client) {
  client.doSomething()
})
Any time there is IO, it must be async. Another approach to this would be with a promise/future/coroutine type thing, but imo, just learning to love the callback is best :)
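A minimal sketch of what that callback-based createClient might look like, assuming a hypothetical getVersion XML-RPC call and made-up API shapes; the client is only handed to the caller once the version check has completed:
// Hypothetical stand-in for the real XML-RPC version request.
function getVersion(cb: (err: Error | null, version: number) => void): void {
  setTimeout(() => cb(null, 2), 10); // pretend the program reported version 2
}

type Client = { doSomething(): void };

function createClient(callback: (err: Error | null, client?: Client) => void): void {
  // Ask the program for its version first (asynchronous).
  getVersion((err, version) => {
    if (err) return callback(err);
    // Build the API surface that matches the reported version.
    const client: Client = version >= 2
      ? { doSomething: () => console.log('using the new API') }
      : { doSomething: () => console.log('using the legacy API') };
    callback(null, client);
  });
}

// Usage: the client only exists inside the callback, so it is always fully initialized.
createClient((err, client) => {
  if (err) throw err;
  client!.doSomething();
});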

Passing HTTP authenticated principal onto another worker thread

We have a web front end on our business layer server.
Certain pages in our web application instantiate very long-running tasks (could be up to 10+ minutes). The way that these requests are handled is like so:
(on the HTTP request thread)
we make a connection to the business server
we create a new thread to make the long-running call, passing in the connection object
the HTTP request then completes, passing a handle back to the browser
the browser periodically polls the web server to get updates on the long-running task's progress
All requests to the business server are authenticated - the connection's user principal must have permission to call the method on the business server.
This mechanism works fine as long as our web application is running in Classic mode.
When we run in pipeline mode, we get ObjectDisposedExceptions when the browser polls.
System.ObjectDisposedException: Safe handle has been closed
at System.StubHelpers.StubHelpers.SafeHandleC2NHelper(Object pThis, IntPtr CleanupWorkList)
at Microsoft.Win32.Win32Native.GetTokenInformation(SafeTokenHandle TokenHandle, UInt32 TokenInformationClass, SafeLocalAllocHandle TokenInformation, UInt32 TokenInformationLength, ref UInt32 ReturnLength)
at System.Security.Principal.WindowsIdentity.GetTokenInformation(SafeTokenHandle tokenHandle, TokenInformationClass tokenInformationClass, ref UInt32 dwLength)
at System.Security.Principal.WindowsIdentity.get_User()
at System.Security.Principal.WindowsIdentity.GetName()
at System.Security.Principal.WindowsIdentity.get_Name()
The problem appears to be that the Windows principal used to make the connection is disposed when the original request ends (which is understandable - in fact, I am surprised that the code worked at all!).
As a way around this problem, I was wondering whether it would be possible either to create a duplicate of the HTTP request's principal and use that to create the connection (and dispose of it when the long-running task completes), or to impersonate the HTTP request's principal on the worker thread even after the original principal is disposed?
Update
(My comment under Aliostad's question was incorrect: the test page did fail. I managed to confuse myself sufficiently that I wrote my test page so that it did not exercise the same code path as the real (faulting) code. Nevermind!)
I have written a "workaround" for this problem: -
I am in the fortunate position of knowing what roles/groups the business server logic will be querying for before the call to the business server is made. So my workaround is to create a new generic principal based upon the request's principal's membership of these roles. The long running task is run using the generic principal.
I am not 100% happy with this workaround because it is very much a "hack" - i.e. I can see that it would easily fall down if some logic did the (eminently sensible) check of verifying that the principal's identity is authenticated.
So I would still very much appreciate any help / insight into this issue.
Thanks
OK, here is my take on this.
First of all, if you create a thread, all of the current thread's security context will be copied to the new thread by default. This operation is heavy but much needed (as you can imagine, most things would not work without it). In case you need to prevent it and do not need the copying of context, there is a way to do it, and it has been explained in Richter's CLR via C#. Luckily, he has shared this very bit of the book here; it basically comes down to calling a static method to prevent the context from being flowed:
ExecutionContext.SuppressFlow();
I doubt this is being called in WCF, although using Reflector I did find a single use of it, here:
[SecuritySafeCritical]
private IAsyncResult BeginGetContext(bool startListening)
{
    Exception exception;
    do
    {
        exception = null;
        try
        {
            try
            {
                if (ExecutionContext.IsFlowSuppressed())
                {
                    return this.listener.BeginGetContext(this.onGetContext, null);
                }
                using (ExecutionContext.SuppressFlow())
                {
                    return this.listener.BeginGetContext(this.onGetContext, null);
                }
            }
            // .... the rest
Interestingly enough, this is used in three places, one of them in SharedHttpTransportManager.
Now, all this might look like we have found the issue and that it is a bug, but I very much doubt it.
My hunch is that there is a process recycle happening in between and the context is lost. The way to prove or disprove this would be to use perfmon to record all process recycles and find out whether any occurred in between.
My solution would basically be - which you might not like! - to simply insert an item into a queue (MSMQ or a simple database queue) and have a Windows service read it. With an operation this important, I would never trust IIS to carry it out to the finish.
Hope this is useful to you.

HttpSession Session ID different to FlexSession ID

I have a Flex application which is served via a JSP page. In this page I output the session ID using HttpSession when the page is loaded:
System.out.println("Session ID: " + session.getId());
In a very simple remote object hosted in BlazeDS (called from the Flex application using an AMF channel and standard RemoteObject functionality) I also output the session ID, but this time using FlexSession (which, as I understand it, is supposed to wrap HttpSession).
System.out.println("FlexSession ID: " + FlexContext.getFlexSession().getId());
I would expect both IDs to be the same but this is not the case. The session IDs differ which is causing problems as there is data stored in the HttpSession which I need to be able to access from my remote objects within BlazeDS.
I've exhausted the reading material on BlazeDS and FlexClient/FlexSession/FlexContext but can't see why the FlexSession is not being linked to the HttpSession. Any pointers greatly appreciated.
I feel I must be missing something fundamental here, am I accessing the…
I do not think that it is related to the Flash Player; it is more related to the concept of FlexSession and how BlazeDS/LCDS works. For example, you can have an active session even when not using the HTTP channels - when using NIO/RTMP you are bypassing the application server and the HTTP protocol. So it makes sense to have an abstract FlexSession class with various implementations.
However, when using BlazeDS, FlexSession wraps an HttpSession object internally, and removeAttribute/getAttribute/setAttribute in fact call the same methods on the HttpSession object, so you can access all the data from the HttpSession. If not, please provide more details.
However, it will not work when using RTMP channels (which exist only in LCDS, by the way); you would need to change your design in that case.
Thanks to both answers above I finally found the root cause and thought I'd share it on here.
The reason for the differing session IDs was the use of SSL for authentication combined with the use of an AMF channel rather than Secure AMF. Using the channel for the first time caused a new session to be created (hence the different ID), as the existing session related to the secure version of the site.
Silly configuration mistake, but worth passing on - make sure that if you are using SSL you are also using Secure AMF connecting to the secure endpoint, rather than standard AMF, or you'll run into the same session ID problems I faced.
Unfortunately this is just how the Flash player works. I have seen this same behavior many times.
The best solution I found was to establish the HTTP session and pass back the session ID. On the client side, you can then pass the session ID to the Flex application. You then send that ID from Flash to the server and use it to look up the existing session or establish a second session.
You will need to do something like this though, I have not been able to find a way to reliably get Flash to use the same session.
