How to adjust a sound clip's volume in real time using DShow.h and strmiids.lib with C++

I am trying to figure out how to set, in real time, the volume at which my sound clips play in my C++ program, and do things like make a sound's volume increase as two objects move closer to one another. Right now I am using "DShow.h" as well as "strmiids.lib", and I am using the interface provided by the following data member pointers:
IGraphBuilder* m_graphBuilder;
IMediaControl* m_mediaControl;
IMediaEvent* m_mediaEvent;
IMediaSeeking* m_mediaSeeking;
Using the interface provided by these, is there a way to alter the volume of the media stream playing?

Have a look at the IBasicAudio interface.
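The filter graph manager itself exposes IBasicAudio, so you can query it from your existing m_graphBuilder and adjust the volume while the clip is playing. A minimal sketch (volume is in hundredths of a decibel, 0 for full volume down to -10000 for silence):

IBasicAudio* basicAudio = nullptr;
HRESULT hr = m_graphBuilder->QueryInterface(IID_IBasicAudio, (void**)&basicAudio);
if (SUCCEEDED(hr))
{
    // 0 = full volume, -10000 = silence, in 1/100ths of a dB.
    // You could, for example, map the distance between your two objects to this range.
    long volume = -2000;   // illustrative value, about -20 dB
    basicAudio->put_Volume(volume);
    basicAudio->Release();
}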

Related

How are you supposed to update a texture per frame in Vulkan?

I'm trying to work with 2D in Vulkan along with 3D, so right now I'm testing out updating a texture every frame to drive whatever 2D is going on. I've gotten something of a texture updater working; the problem is that it's very slow and probably not the way it's supposed to be done. Is there a better way of getting this done? The code is based on the https://vulkan-tutorial.com/ code.
https://vulkan-tutorial.com/code/26_depth_buffering.cpp
void UpdateTexture()
{
    // Full CPU/GPU sync, then free the old image memory.
    vkDeviceWaitIdle(device);
    vkFreeMemory(device, textureImageMemory, nullptr);

    // Create a fresh staging buffer and copy the new pixels into it.
    VkBuffer stagingBuffer;
    VkDeviceMemory stagingBufferMemory;
    createBuffer(imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
                 VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
                 stagingBuffer, stagingBufferMemory);

    void* data;
    vkMapMemory(device, stagingBufferMemory, 0, imageSize, 0, &data);
    memcpy(data, pixel2.data(), static_cast<size_t>(imageSize));
    vkUnmapMemory(device, stagingBufferMemory);

    // Recreate the device-local image and copy the staging buffer into it.
    createImage(texWidth, texHeight, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_TILING_OPTIMAL,
                VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT,
                VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, textureImage, textureImageMemory);
    transitionImageLayout(textureImage, VK_FORMAT_R8G8B8A8_SRGB,
                          VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
    copyBufferToImage(stagingBuffer, textureImage,
                      static_cast<uint32_t>(texWidth), static_cast<uint32_t>(texHeight));
    transitionImageLayout(textureImage, VK_FORMAT_R8G8B8A8_SRGB,
                          VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);

    vkDestroyBuffer(device, stagingBuffer, nullptr);
    vkFreeMemory(device, stagingBufferMemory, nullptr);

    // Rebuild everything that references the new image.
    createTextureImageView();
    createDescriptorPool();
    createDescriptorSets();
    createCommandBuffers();
}
This code looks like a direct translation of some OpenGL code, and not particularly good/modern OpenGL code at that.
There's a lot wrong in this code, but most of it boils down to over-synchronization.
First, you should always view any call to vkDeviceWaitIdle as the wrong thing to do. The only exception would be when you are preparing to destroy the VkDevice itself. There is no other reason to do a full CPU/GPU sync like that.
Presumably, this synchronization exists so that you can be sure the GPU is finished using the image before modifying it. This is the wrong thing to do. You should instead employ multiple-buffering. That is, you should have two images that you use. One is currently being used in a rendering process, while the other is being transferred into.
Instead of doing a full device sync, you instead synchronize with the batch you sent two frames ago. That is, if you're wanting to transfer data for use by frame 10, then you must first do a fence-sync operation with the batch you sent in frame 8. Frame 9 is still being processed, but frame 8 is probably done by now. So the synchronization shouldn't hurt too much.
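A minimal sketch of that per-frame fence synchronization, assuming two frames in flight and hypothetical fences created signaled at startup (names are illustrative, not from the question's code):

const uint32_t MAX_FRAMES_IN_FLIGHT = 2;
VkFence inFlightFences[MAX_FRAMES_IN_FLIGHT];   // created with VK_FENCE_CREATE_SIGNALED_BIT
uint32_t currentFrame = 0;

// At the start of a frame: wait only for the batch submitted two frames ago.
vkWaitForFences(device, 1, &inFlightFences[currentFrame], VK_TRUE, UINT64_MAX);
vkResetFences(device, 1, &inFlightFences[currentFrame]);
// It is now safe to overwrite this frame's staging data and images.

// When submitting the frame's work, hand the queue this frame's fence.
vkQueueSubmit(graphicsQueue, 1, &submitInfo, inFlightFences[currentFrame]);
currentFrame = (currentFrame + 1) % MAX_FRAMES_IN_FLIGHT;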
Second, never allocate memory in the middle of an operation like this. Memory gets allocated early in your application, and you leave it allocated until it's time to destroy your application. If you need a staging buffer, then keep it around and reuse it in subsequent frames. Make sure to allocate sufficient storage up-front.
Whatever your createBuffer call is doing, it seems very much like a bad idea. Vulkan is not OpenGL; Vulkan separated memory from buffers/textures that use it for a reason. Creating APIs that hide this separation basically throws all of that away.
Similarly, never unmap memory unless you're about to destroy that memory object. There's no problem in Vulkan (or OpenGL) with leaving a piece of memory mapped indefinitely. Just map the entire memory's range and leave it mapped. Indeed, you could just pass the mapped pointer directly to your image loader, depending on how the memory gets written by the image loading code (if it tries to read data back through this pointer, there could be trouble).
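A rough sketch of that setup, done once at initialization and reused every frame (the helper and variable names are assumptions based on the tutorial code, not a prescribed API):

// Created once; reused for every upload.
VkBuffer       stagingBuffer;
VkDeviceMemory stagingBufferMemory;
void*          stagingMapped = nullptr;   // stays mapped for the application's lifetime

void CreatePersistentStaging(VkDeviceSize maxImageSize)
{
    createBuffer(maxImageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
                 VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
                 stagingBuffer, stagingBufferMemory);

    // Map the whole allocation once and never unmap until shutdown.
    vkMapMemory(device, stagingBufferMemory, 0, VK_WHOLE_SIZE, 0, &stagingMapped);
}

// Per frame: memcpy (or decode directly) into stagingMapped, then record
// the copyBufferToImage into that frame's command buffer.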
Lastly, the commands doing the transfer need to be synchronized with the commands that consume the image. How this happens depends on which queues are being used to do the transfer.
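For the single-queue case, that synchronization is just a pipeline barrier recorded between the copy and the draw that samples the image; a minimal sketch:

VkImageMemoryBarrier barrier{};
barrier.sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.srcAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask       = VK_ACCESS_SHADER_READ_BIT;
barrier.oldLayout           = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
barrier.newLayout           = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;   // same queue family
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image               = textureImage;
barrier.subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };

vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,          // wait for the copy
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,   // before the shader reads
                     0, 0, nullptr, 0, nullptr, 1, &barrier);

If the transfer happens on a dedicated transfer queue, you additionally need a queue family ownership transfer (real queue family indices instead of VK_QUEUE_FAMILY_IGNORED) plus semaphores between the two submissions.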
And of course, if you want optimal performance, you may want to check to see if your implementation can read from linear images in your shader. If it can, then you may not need staging at all; you can just write the data directly to the memory in Vulkan's image format, and use it directly.
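One way to check that, assuming a physicalDevice has already been picked:

VkFormatProperties props;
vkGetPhysicalDeviceFormatProperties(physicalDevice, VK_FORMAT_R8G8B8A8_SRGB, &props);

// If linear tiling supports sampling, a host-visible, linearly tiled image can be
// written by the CPU and sampled directly, with no staging copy at all.
bool canSampleLinear =
    (props.linearTilingFeatures & VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT) != 0;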
Employing all of the above is going to add a lot of complexity to your application. But that's how it's supposed to work.
A naive way is to use the CPU to compute the update as a function of time or input data and then push the updated data to the shader, such as an MVP transformation matrix. But this is inefficient: it involves a lot of syncing, limits the refresh rate, and keeps the CPU busy in a loop.
So people recommend using several buffers, sometimes mentioning old drivers. If someone can clarify that, it would be nice. I have a naive and probably wrong guess: if they know the frame rate exactly, they can calculate the time for each frame and dispatch several frames in advance. But that confuses me because the frame rate is dynamic, especially on new screens with FreeSync-style variable refresh rates.
I have thought of a third possibility: use the clock directly in the shader. GL_EXT_shader_realtime_clock provides clockRealtimeEXT. It has no defined unit and wraps when it exceeds the maximum value, but it is said to be "globally coherent by all invocations on the GPU". During initialization you can measure its rate (for example by passing a CPU timestamp in a uniform buffer and comparing), then assume the rate stays constant, and also handle the wrapping.
Then, if you can write your shaders as a function of time, for example for a translation, that would be efficient; you only need the initial data. Keep in mind that divergent if conditions in shaders are generally best avoided.

What device/instrument/technology should I use for detecting object’s lying on a given surface?

First of: Thanks for taking the time to help me with my problem. It is much appreciated :)
I am building a natural user interface. I'd like the interface to detect several (up to 40) objects lying on it, and to detect when the objects are moved on its canvas. It is not important what the actual object on the surface is (e.g. "bottle") or what color it has – only the shape and the placement of the object are of interest (e.g. "circle").
So far I’m using a webcam connected to my computer and Processing’s blob functionality to detect the objects on the surface of the interface (see picture 1). This has some major disadvantages to what I am trying to accomplish:
I do not want the user to see the camera or any alternative device, because this distracts the user's attention. Actually, the surface should be completely dark.
Whenever I reach with my hand to rearrange the objects on the interface, the blob detection gets very busy and recognizes objects (my hand) which are not touching the canvas directly. This problem can hardly be tackled using a Kinect, because its depth functionality does not work through glass/acrylic glass – correct me if I am wrong.
It would be nice to install a few LEDs on the canvas controlled by an Arduino. Unfortunately, the light of the LEDs would disturb the blob detection.
Because of the camera’s focal length, the table needs to be unnecessarily high (60 cm / 23 inch).
Do you have any idea on an alternative device/technology to detect the objects? Would be nice if the device would work well with Processing and Arduino.
Thanks in advance! :)
Possibilities:
Use reflective tinted glass so that the surface looks dark or reflective.
Illuminate the area where you place the webcam with an array of IR LEDs.
I would suggest colour-based detection and contouring of the objects.
If you are using colour-based detection, convert the frames to the HSV and YCrCb colour spaces; these are much better than RGB for segmenting the required area.
I also recommend checking out https://github.com/atduskgreg/opencv-processing, which interfaces OpenCV with Processing, so you get a lot of OpenCV's functionality from within Processing.
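A small sketch of that colour-space conversion and thresholding, shown in OpenCV's C++ API rather than Processing (the threshold values are placeholders you would tune to your objects):

#include <opencv2/opencv.hpp>

// frame is a BGR image captured from the webcam.
cv::Mat SegmentByColour(const cv::Mat& frame)
{
    cv::Mat hsv, mask;
    cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);

    // Placeholder hue/saturation/value ranges; tune them to the objects' colour.
    cv::inRange(hsv, cv::Scalar(0, 80, 80), cv::Scalar(20, 255, 255), mask);

    // Contour the segmented blobs to get each object's shape and position.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    return mask;
}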
One possibility:
Use a webcam with infrared capability (such as a security camera with built-in IR illumination). Apparently some normal webcams can be converted to IR use by removing a filter, I have no idea how common that is.
Make the tabletop out of some material that is IR-transparent, but opaque or nearly so to visible light. (Look at the lens on most any IR remote control for an example.)
This doesn't help much with #2, unfortunately. Perhaps you can be a bit pickier about the size/shape of the blobs you recognize as being your objects?
If you only need a few distinct points of illumination for #3, you could put laser diodes under the table, out of the path of the camera - that should make a visible spot on top, if the tabletop material isn't completely opaque. If you need arbitrary positioning of the lights - perhaps a projector on the ceiling, pointing down?
Look into OpenCV. It's an open source computer vision project.
In addition to existing ideas (which are great), I'd like to suggest trying TUIO Processing.
Once you have the camera setup (with the right field of view/lens/etc. based on your physical constraints) you could probably get away with sticking TUIO markers to the bottom of your objects.
The software will detect the markers and you'll differentiate the objects by ID, but you'll also be able to get position/rotation/etc., and your hands will not be part of that.

Generating silent audio track

I'm using a simple DirectShow graph to convert some videos to WMV format, which is working fine. I'm now trying to use a filter based on the Synth Filter sample to supply a silent audio track to the videos and I'm running into some problems.
Essentially, I don't know how to stop the graph when this filter (the synth filter) is connected. I guess because it just provides samples forever until somebody tells it to stop, the usual approach of calling IMediaEvent::WaitForCompletion on the filter graph doesn't work (the graph never stops). What I want it to do of course is stop as soon as the video source filter is finished.
I've tried tracking the position of the graph with IMediaSeeking::GetPositions and then manually stopping the graph when this exceeds the duration of the source file, but the accuracy of the stop time with this approach isn't great.
Can anyone think of a better way to do this? Do I need to have another filter that monitors the output from the video source and also has a pointer to the audio source so it can stop it as soon as the video source delivers EndOfStream? Is there no way to accomplish this from purely application-side code?
I've done something not too different myself in the past. I added support for IMediaSeeking to the silence generator filter, and then you need to make sure that you set start and stop times for the conversion (even if it's just 0 and duration), so that the silence generator can generate the right amount of audio and then send EOS.
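On the application side, that means setting the stop position on the graph before running it. A rough sketch using hypothetical member pointers like the ones in the first question (error handling omitted):

LONGLONG start = 0;
LONGLONG stop  = 0;
m_mediaSeeking->GetDuration(&stop);   // duration of the video source

m_mediaSeeking->SetPositions(&start, AM_SEEKING_AbsolutePositioning,
                             &stop,  AM_SEEKING_AbsolutePositioning);

m_mediaControl->Run();

// With a stop time set, the silence generator can emit exactly that much audio
// and then deliver EndOfStream, so WaitForCompletion actually returns.
long evCode = 0;
m_mediaEvent->WaitForCompletion(INFINITE, &evCode);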

SoundMixer.computeSpectrum with microphone

Flex has the SoundMixer.computeSpectrum function that lets you compute an FFT from the currently playing sound. What I'd like to do is compute an FFT without playing the sound. Since Flash 10.1 lets us access the microphone bytes directly, it seems like we should be able to compute the FFT directly off of what the user is speaking.
Unfortunately this doesn't work as far as I know. As stated on the Adobe help pages:
The SoundMixer.computeSpectrum() method lets an application read the raw sound data for the waveform that is currently being played. If more than one SoundChannel object is currently playing, the SoundMixer.computeSpectrum() method shows the combined sound data of every SoundChannel object mixed together.
This implies two drawbacks:
It just works on the output (SoundChannel)
It just works on the mix of all outputs.
If you don't need the output channel at all, you could try turning its volume down to zero or near zero – I don't know whether that would work.
Other than that, at the moment I don't see any option but to implement the FFT myself to compute a spectrum from the microphone data.
I'm not sure if there's a way to pass that data, but if all else fails, you can always compute the FFT yourself.
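If it does come to computing it yourself, the FFT is not much code. Here is a minimal iterative radix-2 sketch, written in C++ for compactness (it ports fairly directly to AS3 and assumes the input length is a power of two):

#include <cmath>
#include <complex>
#include <vector>

// In-place radix-2 Cooley-Tukey FFT; a.size() must be a power of two.
void fft(std::vector<std::complex<double>>& a)
{
    const size_t n  = a.size();
    const double PI = std::acos(-1.0);

    // Bit-reversal permutation.
    for (size_t i = 1, j = 0; i < n; ++i) {
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }

    // Butterfly passes.
    for (size_t len = 2; len <= n; len <<= 1) {
        const double ang = -2.0 * PI / static_cast<double>(len);
        const std::complex<double> wlen(std::cos(ang), std::sin(ang));
        for (size_t i = 0; i < n; i += len) {
            std::complex<double> w(1.0, 0.0);
            for (size_t k = 0; k < len / 2; ++k) {
                const std::complex<double> u = a[i + k];
                const std::complex<double> v = a[i + k + len / 2] * w;
                a[i + k]           = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
}

// Feed it the microphone samples (imaginary parts zero) and take the magnitudes
// of the first n/2 bins to get the spectrum.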

How to play multiple sounds in AS2 or AS3 with a custom delay?

I need to build a multi-track player. The user can upload multiple tracks and mix them (play them together). My problem is allowing the user to define an exact start position for each track so that the tracks can be synchronized, something like this:
Track 1: start at [x] sec.
Track 2: start at [y] sec.
play/stop
where the user can set x and y. I've tried to implement it with AS2 (using NetStream and setInterval) and AS3 (using NetStream or Sound and a Timer). The tracks only play simultaneously if I set x and y to the same value.
Suppose you have a timeline "engine" with an internal clock of some kind. On every "tick" of the clock you check some array or Vector that holds your Track objects and see if it contains an object with a startTime of n ticks from the beginning of the timeline. Or maybe it's more efficient to keep a separate Vector of the start times that exist in the TrackObjs Vector and check that; when one matches, run through the TrackObjs Vector and start all the audio that needs to begin at that time (a sketch of this idea follows the Track class below).
Here ticks could be seconds, tenths of a second, milliseconds, whatever.
See the org.casalib.time classes at http://as3.casalib.org/docs/ for frame-based timekeeping.
class Track {
    public var startTime:int;
    public var trackName:String;
    public var fileName:String;
}
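To make the tick loop concrete, here is a language-neutral sketch (written as C++ for brevity; in AS3 the tick would come from a Timer or ENTER_FRAME handler, and playback would be Sound.play() or NetStream.play()). startPlayback is a hypothetical helper, not an existing API:

#include <string>
#include <vector>

// Mirrors the Track class above; startTime is in ticks from the timeline start.
struct Track {
    int         startTime;
    std::string trackName;
    std::string fileName;
};

void startPlayback(const std::string& fileName);   // hypothetical playback helper

// Called once per clock tick with the current tick count.
void onTick(int currentTick, const std::vector<Track>& tracks)
{
    for (const Track& t : tracks) {
        if (t.startTime == currentTick) {
            startPlayback(t.fileName);   // in AS3: sound.play(), NetStream.play(), etc.
        }
    }
}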
For the actual playing of the multiple mixed sounds there are various libs out there that might do most of the heavy lifting for you.
http://www.gaiaflashframework.com/wiki/index.php?title=Sound_Groups
This may have some useful code for you, though you may need to decouple it from the Gaia framework.
Maybe better:
Matt Przybylski's SoundManager class http://www.reintroducing.com
Guttershark SoundManager class http://codeendeavor.com/guttershark
Also these might be of interest for dynamic sound generation:
http://code.google.com/p/benstucki/
"Flaudio is a dynamic runtime audio generation and processing library for ActionScript 3"
http://code.google.com/p/popforge/
"Popforge AS3 audio library allows you to create a valid flash.media.Sound object with your own samples. This opens up new perspectives for sound design with the current Adobe Flash Player 9. You can create synthesizers, effects and sample-players of any kind. The supplied AudioBuffer class allows you to create endless audio playback. "
