Synchronizing multiple audio sources over a network - networking

The problem in short form is the following:
I'm buffering multiple audio sources (for real-time analysis) from different locations, and I want to sync them with millisecond precision once I've retrieved them on my server.
What I've discovered so far:
The reason I'm doing this is to analyze the sound the microphones are capturing. NTP is implemented for this, so that should be sufficient clock-synchronization-wise.
Right now I'm able to access an RTP stream, but is it possible to access a timestamp in that stream? How should I buffer?
Other ideas are welcome if they are better, of course ;)
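For reference, the RTP timestamp sits in bytes 4-7 of the fixed 12-byte RTP header (RFC 3550), and the matching RTCP sender reports are what tie that media clock to NTP wall-clock time. A minimal Python sketch, assuming you can read raw RTP packets off a UDP socket yourself rather than through a higher-level library:

import struct

def parse_rtp_header(packet: bytes):
    """Parse the fixed 12-byte RTP header (RFC 3550) and return its fields."""
    if len(packet) < 12:
        raise ValueError("packet too short to be RTP")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # should be 2
        "payload_type": b1 & 0x7F,   # codec-specific payload type
        "sequence": seq,             # detects loss / reordering
        "timestamp": timestamp,      # media clock ticks (e.g. 8000 Hz for G.711)
        "ssrc": ssrc,                # identifies the source
    }

# Example approach: buffer packets per source (keyed by SSRC), ordered by
# sequence number and timestamp. The RTP timestamp alone is a relative media
# clock; to align different sources with millisecond precision you still need
# each stream's RTCP sender reports (SR), which pair an RTP timestamp with an
# NTP wall-clock time.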

Related

How to access the data from a website 50-100 times a second using Raspberry Pi?

I want to fetch the data of a stock. Since the data changes very fast, is there any way to pull the data 50-100 times a second from trading websites?
And can we implement that using a Raspberry Pi 4 8 GB model?
A RasPi 4 should be more than adequate for this task. Both the Ethernet and WiFi hardware are capable of connections at these speeds (unless you’re running a bunch of other stuff on it). Consider where your bottlenecks may be (likely your ISP or other network traffic). Consider avoiding WiFi in favor of Cat5e or Cat6. Consider hanging this device off your router (edge) to keep LAN traffic lower, and consider QoS settings if you think this traffic may compete with other LAN traffic.
This appears to be a general question with no specific platform in mind. For stocks, there are lots of platforms to choose from.
APIs for trading platforms often include a method to open a stream. Instead of a full TCP conversation for each price check, a stream tells the server to just keep on sending data. There are timeout mechanisms of course, but it is good to close that stream gracefully (It’s polite since you’re consuming server resources at a different scale. I’ve seen some financial APIs monitor and throttle stream subscribers who leave sessions open.).
For some APIs/languages you can find solid classes already built on GitHub, although if you're simply pulling and reading a stream, the code snippets in the platform's API docs should be enough to get you going.
Be sure to find out what other overhead may be involved. For example, if an account or API key is needed to open a stream, then either a session must be opened first or the credentials must be passed when the stream is opened. The API docs will say. If you’re new to this sort of thing, just be a detective and try to infer what is needed. API docs usually try to be precise and technically correct with the absolute minimum word count.
Simply checking the stream should be easy. Depending on how that stream can be handled by your code/script, it may be harder to perform logic on the stream while it is being updated. That’s usually a thread issue or a variable-scope issue, depending on the script/code. For what you’re doing I would consider Python or PowerShell, depending on your skill set and other design parameters.
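To make the stream idea concrete, here is a rough Python sketch using the websockets package against a made-up endpoint; the URL, subscribe-message format, and authentication are placeholders you would replace with whatever your chosen platform's API docs specify:

import asyncio
import json
import websockets  # pip install websockets

WS_URL = "wss://example-broker.test/stream"  # hypothetical endpoint
API_KEY = "your-api-key"                     # most platforms require some credential

async def watch_quotes(symbol: str):
    async with websockets.connect(WS_URL) as ws:
        # Subscribe once; the server then pushes updates as they happen,
        # which is far cheaper than polling 50-100 times a second.
        await ws.send(json.dumps({"action": "subscribe",
                                  "symbol": symbol,
                                  "key": API_KEY}))
        async for message in ws:
            quote = json.loads(message)
            print(quote)  # react to each tick here
    # Leaving the `async with` block closes the stream gracefully.

asyncio.run(watch_quotes("AAPL"))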

Can gRPC be used for audio-video streaming over Internet?

I understand in a client-server model gRPC can do a bidirectional streaming of data.
I have not tried it yet, but I want to know: will it be possible to stream audio and video data from a source to a cloud server using gRPC and then broadcast it to multiple clients, all in real time?
TLDR: I would not recommend video over gRPC. Primarily because it wasn't designed for it, so doing it would take a lot of hacking. You should probably take a look at WebRTC + a specific video codec.
More information below:
gRPC has no video compression
When sending video, we want to send things efficiently because sending it raw could require 1GB/s connectivity.
So we use video compression / video encoding. For example, H.264, VP8 or AV1.
Understand how video compression works (e.g. it saves bandwidth by not re-sending data that is similar between consecutive frames of a video).
There is no video encoder for protobufs (the format used by gRPC).
You could then try image compression and save the images in a bytes field (e.g. bytes image_frame = 1;), but this is less efficient and definitely takes up unnecessary space for video.
It's probably possible to encode frames with a video encoder (e.g. H.264), ship them inside protobufs, and then decode them for playback in applications. However, it might take a lot of hacking/engineering effort. This use case is not what gRPC/protobufs are designed for and is not commonly done. Let me know if you hack something together, I would be curious.
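If you do want to experiment anyway, the usual shape is a bytes field plus a bidirectional streaming RPC. Here is a rough Python sketch; the service, message, and field names are invented for illustration, and the media_pb2 / media_pb2_grpc modules are assumed to have been generated with grpcio-tools from the hypothetical .proto shown in the comment:

# Hypothetical media.proto (names invented for this sketch), compiled with
# grpcio-tools into media_pb2 / media_pb2_grpc:
#
#   syntax = "proto3";
#   message Frame { bytes image_frame = 1; int64 pts_ms = 2; }
#   service Media { rpc StreamFrames (stream Frame) returns (stream Frame); }

import grpc
import media_pb2
import media_pb2_grpc

def frame_source():
    # Stand-in for a real capture/encode pipeline: yield a few dummy
    # "encoded frames" (in practice, JPEGs or pre-encoded H.264 data).
    for i in range(3):
        yield media_pb2.Frame(image_frame=b"\x00" * 1024, pts_ms=i * 33)

def run():
    with grpc.insecure_channel("cloud-server.example:50051") as channel:
        stub = media_pb2_grpc.MediaStub(channel)
        # Bidirectional streaming: send frames up, iterate whatever the
        # server streams back down.
        for reply in stub.StreamFrames(frame_source()):
            print("got frame of", len(reply.image_frame), "bytes")

if __name__ == "__main__":
    run()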
gRPC is reliable
gRPC uses TCP (not UDP), which is reliable.
At a glance, reliability might seem handy: it avoids corrupted or lost data. However, depending on the use case (real-time video or audio calls), we may prefer to skip frames if they are dropped or delayed. The losses may be unnoticeable or painless to the user.
With TCP, if a packet is delayed, the receiver waits for it before playing the rest (enforcing in-order delivery).
If a packet is dropped, it will be resent (hiding the packet loss).
Therefore, video conferencing apps usually use WebRTC/RTP (configured to be unreliable).
Having mentioned that, it looks like Zoom was able to implement video over WebSockets, which is also a reliable transport (over TCP). So it's not "game over", just strongly discouraged and a lot more effort. They have since moved over to WebRTC, though.
Data received on the WebSockets goes into a WebAssembly (WASM) based decoder. Audio is fed to an AudioWorklet in browsers that support that. From there the decoded audio is played using the WebAudio “magic” destination node.

Sending http requests with cookies using ESP8266

I wrote this API:
https://github.com/prp-e/iot-api-temp-humid
and when I tested it, I used this command :
curl -b cookies.txt http://localhost:8000/login/username/password
and each time I want to check the data in the "Enviroment" table, I use
curl -c cookies.txt http://localhost:8000/env/username
I need the cookies to be stored somewhere, or regenerated each time the ESP8266 sends data to the API. Is there any way?
If the cookie data is small (fewer than 4096 bytes), you might store it using the EEPROM class. Note that the ESP8266 doesn't really have an EEPROM (Arduinos generally do), so this is just writing the data to a reserved area of its flash storage. Be sure to call EEPROM.commit() after you write or your changes won't actually be saved. The EEPROM documentation includes links to some examples of how to use it.
If the cookie data is larger, you can store it in a file using SPIFFS. SPIFFS lets you use part of the ESP8266's flash storage as a simple filesystem.
ESP8266 boards usually have low quality flash storage which can only handle at most a few hundred thousand writes, so you don't want to write to the flash very frequently. For instance, if you updated the cookies in flash once per second, in just one day you'd write to the flash 86,400 times. Within two days you'd quite possibly wear out the sector of flash that was being used to store cookie values. So be careful with how often you change the values of the cookies and how often you write to the flash memory.
The ESP8266 also has 512 bytes of RAM associated with its real time clock (RTC). Data stored here will persist across reboots but will be lost if power is removed from the chip. Because it's normal RAM and not flash, it doesn't suffer from wear problems and can be rewritten safely. Here's an example of how to use it.
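If you happened to be running MicroPython on the ESP8266 rather than the Arduino core, the same idea (small data that survives reboots without flash wear) is exposed through machine.RTC. A rough, untested sketch of that alternative, with the cookie value itself as a placeholder:

from machine import RTC

rtc = RTC()

def save_cookie(cookie: str):
    # RTC user memory is only a few hundred bytes, survives reboots and
    # deep sleep (but not power loss), and does not wear like flash.
    rtc.memory(cookie.encode())

def load_cookie() -> str:
    data = rtc.memory()          # returns b'' if nothing has been stored
    return data.decode() if data else ""

# Example usage: reuse the stored session cookie, or fetch a new one if empty.
cookie = load_cookie()
if not cookie:
    cookie = "session=abc123"    # placeholder: obtain a fresh cookie from /login
    save_cookie(cookie)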

Windows Filtering Platform - finding byte count of TCP sessions

I am using a Windows Filtering Platform callout on Windows to track TCP connections. Filters on the ALE established and endpoint closure layers work great for detecting start and end of connection. However, I also need to know the size of traffic in each direction and preferably packet count but I have not been able to find that in the closure information.
It is possible to monitor each packet using the stream layer(s) but maintaining a session table in kernel space and constantly updating sessions for each packet is not appealing as this is going to add a lot of overhead and complexity.
Anyone know how to efficiently get byte-count for TCP sessions using WFP on Windows? Alternative suggestions would also be welcome.
I also tried to solve this issue once but ended up with the following. It is valid for IPv4 only!
On the FWPM_LAYER_ALE_FLOW_ESTABLISHED_V4 layer you can create your own context using the FwpsFlowAssociateContext0 function, and later, at the DITNO_FIREWALL_STREAM_CALLOUT_V4 and DITNO_FIREWALL_DATAGRAM_DATA_CALLOUT_V4 layers, increment byte counters and save any metadata in your context structure.
Once flowDeleteFn is called, it means the flow has ended and your counters are ready. The memory used for the context must then be released.
Any luck with in-kernel features to approach it by the way?

Is it possible to downsample an audio stream at runtime with Flash or FMS?

I'm no expert in audio, so if any of you folks are, I'd appreciate your insights on this.
My client has a handful of MP3 podcasts stored at a relatively high bit rate, and I'd like to be able to serve those files to her users at "different" bit rates depending on that user's credentials. (For example, if you're an authenticated user, you might get the full, unaltered stream, but if you're not, you'd get a lower-bit-rate version -- or at least a purposely tweaked lower-quality version than the original.)
Seems like there are two options: downsampling at the source and downsampling at the client. In this case, knowing of course that the source stream would arrive at the client at a high bit rate (and that there are considerations to be made about that, which I realize), I'd prefer to alter the stream at the client somehow, rather than on the server, for several reasons.
Is doing so possible with the Flash Player and ActionScript alone, at runtime (even with a third-party library), or does a scenario like this one require a server-based solution? If the latter, can Flash Media Server handle this requirement specifically? Again, I'd like to avoid using FMS if I can, since she doesn't really have the budget for it, but if that's the only option and it's really an option, I'm open to considering it.
Thanks in advance...
Note: Please don't question the sanity of the request -- I realize it might sound a bit strange, but the requirements are what they are. In that light, for purposes of answering the question, you can ignore the source and delivery path of the bits; all I'm really looking for is an explanation of whether (and ideally how) a Flash client can downsample an MP3 audio stream at runtime, irrespective of whether the audio's arriving over a network connection or being read directly from disk. Thanks much!
I'd prefer to alter the stream at the client somehow, rather than on the server, for several reasons.
Please elucidate the reasons, because resampling on the client end would normally be considered crazy: it wastes bandwidth sending the higher-quality version to a user who isn't supposed to hear it, and it risks a canny user ripping the higher-quality stream as it comes in over the network.
In any case the Flash Player doesn't give you the tools to process audio, only play it.
You shouldn't need FMS to process audio at the server end. You could have a server-side script that loads the newly-uploaded podcasts and saves them back out as lower-bitrate files, which could then be served to lowly users via a normal web server. For Python, see e.g. PyMedia or py-lame; even a shell script that calls lame or ffmpeg from the command line should be pretty easy to pull off.
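For example, a small Python sketch along those lines, shelling out to ffmpeg (the directory layout and target bitrate are placeholders, and it assumes ffmpeg with libmp3lame is installed on the server):

import subprocess
from pathlib import Path

SOURCE_DIR = Path("podcasts/original")  # high-bitrate uploads
OUTPUT_DIR = Path("podcasts/low")       # versions served to unauthenticated users
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

for src in SOURCE_DIR.glob("*.mp3"):
    dst = OUTPUT_DIR / src.name
    if dst.exists():
        continue  # already transcoded
    # Re-encode to a lower constant bitrate; tweak -b:a to taste.
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src),
         "-codec:a", "libmp3lame", "-b:a", "64k", str(dst)],
        check=True,
    )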
If storage is at a premium, have you looked into AAC audio? I believe Flash 9 and 10 on desktop browsers will play it. In my experience AAC takes only about half the size of a comparable MP3 (i.e. an 80 kbps AAC will sound the same as a 160 kbps MP3).
As for playback quality, if I recall correctly there are audio playback settings in the Publish Settings section of the Flash editor. Whether or not the playback bitrate can be changed at runtime is something I'm not sure of.
