frequency analysis of sound - frequency-analysis

I record bird cries with two microphones. The recordings can run up to 3 hours, and listening to the whole file in Audacity each day is time-consuming. What I want is a script that takes my original file and gives me a bunch of short audio files, each containing a bird cry. My microphones can record in MP3 or WAV. The script should keep only cries with a frequency higher than n Hz; everything below that frequency is fixed background sound that should not be saved. I don't know which language is best for this, and I have absolutely no idea how to do it.
Thank you all,
Thomas

This should be pretty easily doable in a variety of languages but Python is a decent place to start. I'll link you some relevant resources to get you started and then you can narrow your question if you run into problems.
To read your audio file in .wav format look at this documentation.
To take the data from your audio file and put it into a numpy array see this question and answer.
Here is the documentation for computing the Fourier transform of your data (to get the frequency content).
I would suggest taking a moving window and computing the Fourier transform of the data within that window and then saving the result to a file if there's significant content above your threshold frequency. The first link should have info on saving the audio file.
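As a rough illustration of the moving-window idea, here is a minimal sketch that uses only the Python standard library. It assumes the samples are already a list of floats from a mono recording. The function names (fft, high_band_power, find_segments), the window size and the min_power threshold are placeholders made up for this example, and the threshold would need tuning on real recordings. The hand-written FFT is only there to keep the sketch self-contained; with numpy you would swap in numpy.fft.rfft.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def high_band_power(window, sample_rate, cutoff_hz):
    """Normalised spectral power at or above cutoff_hz in one window."""
    n = len(window)
    spectrum = fft(window)
    # Bin k corresponds to k * sample_rate / n Hz; bin 0 (DC) is always skipped.
    first_bin = max(1, math.ceil(cutoff_hz * n / sample_rate))
    band = spectrum[first_bin:n // 2 + 1]
    return sum(abs(c) ** 2 for c in band) / (n * n)


def find_segments(samples, sample_rate, cutoff_hz, window=1024, min_power=0.01):
    """Return (start_sample, end_sample) ranges with content above cutoff_hz.

    window must be a power of two; min_power is an assumed threshold.
    """
    segments = []
    start = None
    # Non-overlapping windows; a trailing partial window is ignored.
    for pos in range(0, len(samples) - window + 1, window):
        chunk = samples[pos:pos + window]
        loud = high_band_power(chunk, sample_rate, cutoff_hz) >= min_power
        if loud and start is None:
            start = pos
        elif not loud and start is not None:
            segments.append((start, pos))
            start = None
    if start is not None:
        # The recording ended while a segment was still open.
        segments.append((start, len(samples) - len(samples) % window))
    return segments
```

Each returned pair is a range of sample indices, so samples[start:end] is the clip to write out. The windows here do not overlap; a smaller hop between windows would give finer boundaries at the cost of more transforms.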
You can get some background on using the Fourier transform for this type of application from this Q&A and if it turns out that your problem is really difficult, I would suggest looking into some of the methods for speech detection.
For a more out-there suggestion, you could try frequency shifting your recording by adjusting the sample rate to make bird sounds resemble human speech and then use a black box tool like Google's VAD to pick out the bird calls. I'm not sure how well that would work though.

The problem of cutting up a long file into sections of interest is usually referred to as (automatic) Audio Segmentation. If you are willing to have fixed-length audio clips as output (say 10 seconds), you can also treat it as an Audio Classification problem.
The latter is a very well-studied problem, and it has also been applied to birds.
The DCASE2018 challenge had one task about Bird Detection, and it attracted lots of advanced methods. Basically all the best-performing systems use a Convolutional Neural Network on log-scaled mel-spectrograms. A mel-spectrogram is 2D, so it basically becomes image classification. Many of the submissions are open source, so you can look at the code and play with them. Do note they are mostly focused on scoring well in a research competition, not on being practical tools for splitting a few files.
If you want to build your own model for this, I would recommend starting with a Convolutional Neural Network pretrained on images, then training it further on DCASE2018 data, then testing it on your own data. That should give a very accurate system, though it will take a while to set up.

Related

Autograding of NOPS exams: Implementation & extension to string questions?

We are using R/exams to create tests in Canvas and TestVision.
We have other forms and other software to perform written exams.
I know R/exams has a great NOPS feature and was wondering:
What software is used to autograde the NOPS forms?
Can that software also evaluate string questions?
At the moment it looks like the NOPS form doesn't make it easy for software to read some parts. Ideally the software would be adapted so that it could more easily read the Student Name and string questions from adapted NOPS forms (changes in blue):
NOPS format
The NOPS forms have not been designed by us but they follow the format that our university has been using. We simply mimicked their format because we initially just generated the PDF files ourselves but used the commercial scanning software of our university.
Scanning
However, over the years we have written our own scanner implementation in R in exams::nops_scan(). The basic approach is to convert PDF pages to PNG images, read these into R, convert them to black and white pixel matrices, find the scanner markings in the corners, and then extract just the boxes relative to these markings. The boxes either contain printed digits in a fixed font for which a simple decision tree yields a reliable classification - or the boxes are empty/filled vs. checked which can also be classified reasonably reliably. The result is stored in a simple text format that was again not developed by us but to be fully compatible with the commercial system that our university used.
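To give a feel for the empty/filled vs. checked step, here is a minimal Python sketch. It is not the exams::nops_scan() code (which is written in R), only an illustration under assumptions: the page is already a black-and-white matrix with 1 for dark pixels, the box position is known, and the function names and both thresholds are invented for this example. Treating "filled" as its own category is my reading of the description above.

```python
def dark_fraction(pixels, top, left, height, width, margin=1):
    """Share of dark pixels (value 1) inside a box, ignoring a border margin."""
    rows = pixels[top + margin:top + height - margin]
    inner = [row[left + margin:left + width - margin] for row in rows]
    count = sum(len(row) for row in inner)
    if count == 0:
        return 0.0
    return sum(sum(row) for row in inner) / count


def classify_box(pixels, top, left, height, width,
                 checked_min=0.15, filled_min=0.8):
    """Label a box 'empty', 'checked' or 'filled' from its dark-pixel share.

    checked_min and filled_min are hypothetical thresholds.
    """
    fraction = dark_fraction(pixels, top, left, height, width)
    if fraction >= filled_min:
        return "filled"
    if fraction >= checked_min:
        return "checked"
    return "empty"
```

The margin keeps the printed box outline from being counted as a student's mark. Real scans would also need tolerance for skew and noise, which this sketch leaves out.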
Grading
Based on the scan results the function exams::nops_eval() computes points and grades. Various evaluation strategies can be plugged in and starting from version 2.4-0 the reports generated by the function can be customized.
Extension to OCR
At the moment no OCR (optical character recognition) is used, except for the simple task of recognizing printed numbers in a fixed font. But no hand-written characters or digits are ever evaluated automatically. I had played around with this a little bit using tesseract but the results were not reliable enough for our purposes.
The string questions that are currently supported are intended for open-ended questions. Hence students get a reasonable amount of space to write something down. The teacher can then grade the answer sheet manually, again by ticking boxes only, which can be read rather reliably. The scanned images of the full sheet are included in the report for the students so that they can also see any hand-written feedback/corrections included in the answer form.
Tutorial
A hands-on guide to using the NOPS approach is available at: http://www.R-exams.org/tutorials/exams2nops/
Misc
Unfortunately, the system is not implemented in a very modular fashion. The reasons for this were two-fold: (1) We followed very closely the given format our university had been using. (2) The bulk of the implementation was written under a lot of time pressure (see the anecdote below). So while the features you propose would be nice to have, they are unlikely to fit well into the current setup. If you want to have a stab at this, I would recommend writing a modular new implementation, just using the bits and pieces from the existing code that are useful enough.
Anecdote: Scanning of about 400-500 exam sheets had failed on the university system due to a mistake of the copy shop that had printed the sheets. It was mid-July, and everybody was already on vacation, including myself. So I sat on my parents' porch for two days to write the scanner tool and evaluate the exams that the students were waiting for.

GLTF on demand and LOD for massive GLTF load

I am trying to load a very complex set of glTF models in A-Frame.
My problem is very simple to state: my goal is to load about 9 million glTF models in a single scene.
My idea was to combine different levels of detail in the glTF models depending on the camera distance, and to load only those models that are visible to the camera. Otherwise all the assets are loaded into memory and my browser eventually hangs due to memory consumption.
Is this possible in AFRAME?
With some attention to A-Frame best practices, you should be able to make a performant scene with tens of thousands or even hundreds of thousands of polygons. But it will not be possible to load millions of distinct glTF models simultaneously in A-Frame, or any WebGL renderer for that matter.
Assuming you just want to show as many models as possible, try to take advantage of certain special cases:
If you need to render many copies of the same model, you can use a technique called "instancing". Check out aframe-instancing for some example code on how to do that. Depending on the complexity of your model, you may be able to show thousands (but probably not millions) of copies at once.
If you're making something like an RPG — which needs many things in the world, but only a few are in sight at any given time — then you can be clever about dividing your world into zones, and only loading models for the current zone.
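The zone idea can be sketched independently of any renderer. The snippet below is a minimal illustration in Python rather than A-Frame code; it assumes a flat world split into square zones on the x/z plane, and the names zone_of, zones_near and plan_update as well as the zone size are made up for the example. In a real scene the two returned sets would drive whatever code attaches and removes the model entities.

```python
import math


def zone_of(x, z, zone_size):
    """Grid coordinates of the zone containing world position (x, z)."""
    return (math.floor(x / zone_size), math.floor(z / zone_size))


def zones_near(x, z, zone_size, radius=1):
    """Zones within `radius` cells of the camera's zone (3x3 block by default)."""
    cx, cz = zone_of(x, z, zone_size)
    return {
        (cx + dx, cz + dz)
        for dx in range(-radius, radius + 1)
        for dz in range(-radius, radius + 1)
    }


def plan_update(loaded, wanted):
    """Return (zones_to_load, zones_to_unload) as two sets."""
    return wanted - loaded, loaded - wanted
```

Calling plan_update each time the camera crosses a zone boundary keeps the number of resident models bounded by the neighbourhood size instead of the size of the whole world.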
Both of these are non-trivial to implement, and beyond the scope of a Stack Overflow question. My suggestion would be to try to get started on your own, and when you run into trouble, post new questions with the minimum amount of code necessary to see what you're trying to do. You may also find the A-Frame Slack group to be useful.

Convert graph into data points using Mechanical Turk?

I looked around but did not see anyone using Mechanical Turk for this. I've heard of the service, but never used it before. I need to take the following graph and digitize it so I get a list of data points for each line (noting that there are two Y-axes, and thus depends on which line we are talking about). This is pretty time consuming for me, and I saw other posts on StackOverflow about digitizing software doing a poor job at this. Would Mechanical Turk be well suited to my task?
Here is the graph for reference: http://www.yourpicturehost.com/dyno_hbspeed.jpg
Depends how many of these you have. Mechanical Turk could work quite well, but you'd have to check the accuracy carefully (e.g. by re-plotting the graphs and comparing them yourself).
If you have a lot, though - you should be able to design an image processing algorithm to pick up the data.
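Whichever route you take, the last step is the same: converting traced pixel positions into data values, with a separate mapping for each Y-axis. Here is a minimal sketch, assuming all axes are linear and that you can read off two reference ticks per axis. The helper name make_axis and the calibration numbers in the usage lines are hypothetical, not taken from the graph in the question.

```python
def make_axis(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value.

    The axis is assumed to be linear; (pixel_a, value_a) and
    (pixel_b, value_b) are two known reference ticks on it.
    """
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


# Hypothetical calibration: image rows grow downwards, so the Y pixel
# coordinate decreases as the data value increases.
x_axis = make_axis(100, 0.0, 900, 8000.0)
left_y = make_axis(600, 0.0, 100, 500.0)
right_y = make_axis(600, 0.0, 100, 250.0)

# A traced point at pixel (500, 350) on a line that uses the left axis:
point = (x_axis(500), left_y(350))
```

Points traced on a line that belongs to the second Y-axis go through right_y instead of left_y; the X mapping is shared.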

What are the options and best practices for PV3D inspired modeling

The studio I work at is currently developing the Tony Hawk XI website and I am responsible for the Flash/AS3 development. As part of the pitch, I submitted an augmented reality skateboard example, which impressed the client very much.
After a few weeks of getting stronger with Papervision3D, and getting to know the Flar Toolkit, I have successfully imported md2 and dae files that load and interact with my custom marker.
Now it has come time to develop some of my own models; I will be using 3DSMAX. I want to know what the limitations are on things like poly-count, character rigging and animation, texturing, tricks for exporting and creating the proper format file and any other bits of information that may save me some serious headaches down the road.
Currently I have a Quake2 MD2 model, Ernie, pulled inside of a FlarToolkit demo here.
This is very low-poly, and I was wondering how many polys I could expect to get away with, given that today's machines are so much faster?
Brian Hodge
blog.hodgedev.com hodgedev.com
I've heard that 2000 polys is about the threshold for good performance. In practice, though, it's been hit or miss, and a lot of things can have an impact. So far I've run into performance hits when using animated movieclip materials, animated materials with an alpha channel, and precise materials.
Having to clip objects seems to be a double-edged sword. In some cases it will increase performance by a good deal, and in others (primarily, it seems, when there are a lot of polys on the edge of the viewport) it'll drop the framerate by a good 10-15 fps. So I'd say the view you set up is something to think about as well.
For example, we have a model of the interior of a store with some shelves, products, and customers walking around. In total we have just under 600 triangles (according to the StatsView, which you should check out if you haven't yet: org.papervision3d.view.stats.StatsView). On my computer, which is new and has a quad core, it runs at a steady 30fps (which is where we want it), but on an old Dell XPS (Pentium 4) it runs between 20 and 30fps depending on what objects are being clipped, etc.
We try to reduce the poly count and texture creatively to fix as many of the performance issues as possible. Unfortunately our minimum specs are really low, so we need to do a lot to get it to run well.
Edit:
Another thing we're doing is swapping out less detailed models for more detailed ones when zoomed in. If you aren't zooming at all, then this probably won't help.
Hope that helps a bit.

Generating a picture/graphic of a graph

In working on a shortest path algorithm across a network I would like to generate a picture of the network. I'd like to represent nodes (circles), links (lines), cost to traverse the link (number in the middle of the link line), and capacity of the link (number on the link line next to the node it represents) in the picture. Is there any library/software out there that would help to automate creating this picture?
I can do this manually in Visio or with some drawing application but I'd like to generate them from code as I change/tweak the network.
Sounds like a job for GraphViz; it generates graphs from a short text description file. I've used it to produce connected node graphs and I believe it should be possible to add link labels, as you require.
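For generating the description file from code, something along these lines might work. It is a minimal sketch, not a tested pipeline: the function name to_dot and the tuple layout are assumptions for illustration, while label, taillabel and headlabel are standard DOT edge attributes (cost in the middle of the link, capacities next to each end node).

```python
def to_dot(links):
    """Build DOT text for an undirected network.

    links: iterable of (src, dst, cost, cap_at_src, cap_at_dst) tuples
    (a hypothetical layout chosen for this example).
    """
    lines = ["graph network {", "  node [shape=circle];"]
    for src, dst, cost, cap_src, cap_dst in links:
        # label sits mid-edge; taillabel/headlabel sit next to src/dst.
        lines.append(
            f'  "{src}" -- "{dst}" '
            f'[label="{cost}", taillabel="{cap_src}", headlabel="{cap_dst}"];'
        )
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot([("A", "B", 4, 10, 12), ("B", "C", 7, 5, 5)])
```

Writing dot_text to a file such as network.dot and running dot -Tpng network.dot -o network.png should then render the picture, so the image can be regenerated whenever the network changes.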
If you're using Python, NodeBox draws pretty graphs.
One of the big problems in displaying networks like this is figuring out where to put the nodes on the display screen. If arranging nodes is logically simple given your network, then an off-the-shelf product is likely to suit your needs.
If the arrangements are much more complicated, you may have to accept a certain amount of manual intervention to get this to work with off-the-shelf stuff, or bite the bullet and program the whole thing yourself.
.NET is one choice, and once you've mastered the Graphics class it's easy to use and plenty fast for something like this. However, there are probably better languages/platforms than .NET for something graphics-oriented like this.
Update: .NET is much better for 2D graphics than I knew. The key is finding a fast workaround to the pitifully slow GetPixel() and SetPixel() methods in the Bitmap class. Once you can read and write individual pixels easily and quickly, you can do whatever you want as a programmer.
Did you by chance check out the R programming language? I'm not positive but I believe that you can make images and such out of graphs. r-project.org
There are a bunch of visualizations of various algorithms here: Algorithmics Animation Workshop

Resources