How to choose an audio fingerprinting algorithm for a cooperative music database? [closed] - audio-fingerprinting

I need to create a cooperative music identification service. Every user will have the option to fingerprint a song and send it to the server along with its metadata. At the beginning the service database will be empty, and every time a fingerprint is received the metadata for that song will be updated (the server will assign metadata to a fingerprint by majority choice if different users send different information for the same fingerprint).
I need to calculate a fingerprint for the whole song; I do not need to identify a song from just a fragment.
The fingerprint does not need to be 100% accurate. I will be happy if two song files receive the same fingerprint when the same recording is encoded at different compression rates. A low level of noise independence would be a plus.
Silence at the beginning or the end of the song will not be a problem; I can remove it with a standard silence-suppression algorithm (and in this case too I do not need a very precise result).
I know there are some open-source libraries like http://echoprint.me/ and https://acoustid.org/, but these are excessive for my needs: if I understood correctly, they can identify a song from just a part of it, and this makes for a heavy database. I need an algorithm that gives me a reasonably small (a few KB) fingerprint for the whole song.
Which is the simplest and fastest algorithm I can use?
Thanks to all

I suggest you use the AcoustID project. Your description matches this project on many points; only a few of their approaches differ from what you propose.
Can the service identify short audio snippets?
No, it can't. The service has been designed for identifying full audio files. We would like to eventually support this use case as well, but it's not a priority at the moment. Note that even when this is implemented, it will still be intended for matching the original audio (e.g. for the purpose of tracklisting a long audio stream), not audio with background noise recorded on a phone.
Have a look at their mailing list for some better explanations: https://groups.google.com/forum/#!forum/acoustid
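For what it's worth, AcoustID's fingerprint format (Chromaprint) already behaves the way you describe: it hashes the track as a whole and the compressed fingerprint is only a few KB. A minimal sketch, assuming the pyacoustid package (`pip install pyacoustid`) and the Chromaprint `fpcalc` tool are installed; the file name is a placeholder:

    # Minimal sketch: fingerprint a whole file with Chromaprint via pyacoustid.
    import acoustid

    def whole_song_fingerprint(path):
        # fingerprint_file returns (duration_seconds, fingerprint_bytes).
        # By default only the first couple of minutes of audio are hashed,
        # which is usually enough to characterize a track.
        duration, fingerprint = acoustid.fingerprint_file(path)
        return duration, fingerprint

    duration, fp = whole_song_fingerprint("song.mp3")  # hypothetical file
    print(f"{duration:.0f}s, fingerprint size: {len(fp)} bytes")

One thing to keep in mind for the majority-vote design: matching Chromaprint fingerprints server-side is a similarity comparison, not an exact hash lookup, so two uploads of the same song encoded differently will produce close but not byte-identical fingerprints.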

Related

Is web scraping allowed? [closed]

I'm working on a project that requires certain statistics from another website, and I've created an HTML scraper that gets this data automatically every 15 minutes. However, I've stopped the bot for now, as their terms of use mention they do not allow it.
I really want to respect this, especially if there's a law prohibiting me from taking this data, but I've contacted them through email several times without a single answer, so now I've come to the conclusion that I'll simply grab the data, provided it is legal.
On certain forums I've read that it IS legal, but I would much rather get a more "precise" answer here on StackOverflow.
And let's say this is in fact not illegal: would they have any software to spot my bot making connections every 15 minutes?
Also, when talking about taking their data, we're talking about a single number for each "team", and I will convert this number into our own value.
I'll quote Pablo Hoffman's (Scrapinghub co-founder) answer to "What is the legality of web scraping?", which I found on another site:
First things first: I am not a lawyer and these comments are solely based on my experience working at Scrapinghub; please seek legal assistance accordingly.
Here are a few things to consider when scraping public data from websites (note that the following addresses only US law):
- As long as they don't crawl at a disruptive rate, scrapers do not breach any contract (in the form of terms of use) or commit a crime (as defined in the Computer Fraud and Abuse Act).
- A website's user agreement is not enforceable as a browsewrap agreement, because companies do not provide sufficient notice of the terms to site visitors.
- Scrapers access website data as a visitor, following paths similar to a search engine's. This can be done without registering as a user (and explicitly accepting any terms). In Nguyen v. Barnes & Noble, Inc. the courts ruled that simply placing a link to a terms of use at the bottom of a webpage is not sufficient to "give rise to constructive notice." In other words, there is nothing on a public page to imply that merely accessing the information is subject to any contractual terms. Scrapers give neither explicit nor implicit assent to any agreement, and therefore breach no contract.
- Social networks, for example, assign the value of becoming a user (based on the call-to-action on the public page) as the ability to: i) gain access to full profiles, ii) identify common friends/connections, iii) get introduced to others, and iv) contact members directly. As long as scrapers make no attempt to perform any of these actions, they do not gain "unauthorized access" to the services and thus do not violate the CFAA.
A thorough evaluation of the legal issues involved can be seen here: http://www.bna.com/legal-issues-raised-by-the-use-of-web-crawling-and-scraping-tools-for-analytics-purposes
There should be a robots.txt file in the root folder of the site.
It specifies which paths scrapers are forbidden from hitting and which are allowed (along with acceptable crawl delays).
If that file doesn't exist, the convention is that anything is allowed, and you bear no responsibility for the website owners' failure to provide that info.
Also, here you can find some explanation of the robots exclusion standard.
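Checking robots.txt before each run is cheap to automate. A minimal sketch using only the Python standard library; the site URL and user-agent string below are placeholders, not taken from the question:

    # Sketch: honor robots.txt with urllib.robotparser (stdlib only).
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # hypothetical site
    rp.read()

    agent = "MyStatsBot"                      # hypothetical user agent
    page = "https://example.com/teams/stats"  # hypothetical page
    print(rp.can_fetch(agent, page))  # False => the site forbids scraping it
    print(rp.crawl_delay(agent))      # Crawl-delay in seconds, or None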

Why use Google Analytics? [closed]

Okay, a lot of websites (about 50%) use Google Analytics. The idea is to gather some basic information about your users. But I don't understand why the service is used by so many people, considering three things:
1) The code takes time to load. Even the async version takes time, and the user sees the loading icon, which is bad because it makes it seem like your code is terrible or you can't pay for good hosting.
2) It's a well-known script and some people block it.
3) Google (obviously) gets the data too. Now, don't get me wrong, but why give them free data while sacrificing your users' privacy?
2 and 3 are not so important. 1 is. Given the above, what's the drawback of making your own analytics script and serving it to your users? What's the great thing Google Analytics does that you can't do on your own?
I would say two reasons:
A) It gives you a LOT of convenient visualizations and ways to slice the data - stuff that you would have to build independently. Again - if you just want to watch one number, it doesn't matter much, but you usually want a bigger picture and GA has put a lot of work into making most useful stuff easily available and easy to visualize.
B) Service reliability - basically, the first 10 iterations of whatever solution you choose to implement WILL have bugs (as any programmer who has worked on any meaningful projects knows).
Outsourcing your analytics to GA therefore just saves you a metric ton of time that it would take to reimplement everything yourself and get it working reliably.
As for speed issues - you can always disable GA on the few pages where speed is critical... although considering that page is usually the landing page of the app, that might not be too smart of an idea...
However - in the vast majority of cases, the async GA code is not really the bottleneck for your page. You are probably better off optimizing other aspects of your javascript on the landing page, as the "loading" icon is really something most users do not notice.
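To make point B concrete, here is roughly where a home-grown solution starts: a toy hit-logging endpoint sketched with the Python standard library (the port and file name are illustrative). Everything beyond this, such as dashboards, sessions, bot filtering, geo lookup, and uptime, is the part GA has already built and debugged:

    # Toy sketch of a self-hosted hit counter: log each beacon request.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class TrackerHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Append one line per page view: client IP, path, user agent.
            with open("hits.log", "a") as log:
                log.write(f"{self.client_address[0]}\t{self.path}\t"
                          f"{self.headers.get('User-Agent', '')}\n")
            self.send_response(204)  # a tracking beacon needs no body
            self.end_headers()

    HTTPServer(("", 8000), TrackerHandler).serve_forever()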

ASP.NET application performance [closed]

I have an ASP.NET 4.0 application with an MDF file in my App_Data folder where I store some data. There is a "User" table with 15 fields and an "Answers" table with about 30 fields. In most scenarios on my website, the user retrieves some data from the "User" table and writes some data to the "Answers" table.
I want to test the performance of my application when about 10,000 users use the system. What will happen if 10,000 users log in and use the system at the same time, and how will performance be affected? In general, what is the best practice for testing the performance of ASP.NET pages?
Any help will be appreciated.
Thanks in advance.
It reads like performance testing/engineering is not your core discipline. I would recommend hiring someone to either run this effort or assist you with it. Performance testing is a specialized development practice with specific requirement sets, tool expertise and analytical methods. It takes quite a while to become effective in the discipline even in the best case conditions.
In short, you begin with your load profile. You progress to definitions of the business process in your load profile. You then select a tool that can exercise the interfaces appropriately. You will need to set a defined initial condition for your testing efforts. You will need to set specific, objective measures to determine system performance related to your requirements. Here's a document which can provide some insight as a benchmark on the level of effort often required, http://www.tpc.org/tpcc/spec/tpcc_current.pdf
Something which disturbs me greatly is your use case of "at the same time," which is a practical impossibility for systems where the user agent is not synchronized to a clock tick. Users can be close, concurrent within a defined window, but true simultaneity is exceedingly rare.
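That said, if you just want a first rough signal before bringing in a specialist or a dedicated tool (JMeter, Visual Studio load testing, k6, and so on), a concurrency smoke test is easy to sketch in Python. The URL and user count below are placeholders, not details from your application:

    # Rough concurrency smoke test; not a substitute for real load testing.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    URL = "http://localhost/MyApp/Answers.aspx"  # hypothetical page
    USERS = 100  # ramp up gradually instead of jumping straight to 10,000

    def one_request(_):
        start = time.perf_counter()
        with urlopen(URL) as resp:
            resp.read()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=USERS) as pool:
        latencies = sorted(pool.map(one_request, range(USERS)))

    print(f"median: {latencies[len(latencies) // 2]:.3f}s, "
          f"worst: {latencies[-1]:.3f}s")

Watch how the median and worst-case latencies move as you raise USERS; the knee in that curve tells you far more than any single "10,000 users at once" run would.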

Why Don't Duplicate QR Codes Look The Same? [closed]

My understanding is that a QR code contains the data being read, and that no internet connection is required to interpret the code. If this is the case, why do I get a different QR code every time I recreate one with the same data?
I see definite differences if I use two different generators to create the same code. For instance, creating a URL link to http://www.yahoo.com creates two different QRs on these sites:
http://qrcode.kaywa.com/
http://zxing.appspot.com/generator/
Note that QR codes may use four different levels of error correction, labeled L, M, Q and H respectively. There is also a process called masking, whose intent is to make the reading process more robust by distributing the black and white pixels over the image. Several masking patterns are available, each of which can produce a valid QR code with a different appearance. Read the specification for more info on those.
That being said, given a generator with the same settings, the output should always be the same, which is what your original question was about. Comparing two different generators, however, might result in two different images due to the effects mentioned above.
Spec link, randomly picked off of Google (I'm mentioning this because ISO is selling the QR specification as a standard document):
http://raidenii.net/files/datasheets/misc/qr_code.pdf
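You can see the first effect for yourself. A small sketch using the third-party qrcode package (`pip install qrcode[pil]`, my choice, not something the spec mandates): encoding identical data at two error-correction levels yields two visibly different, equally valid codes:

    # Sketch: same data, two error-correction levels, two different images.
    import qrcode
    from qrcode.constants import ERROR_CORRECT_L, ERROR_CORRECT_H

    for name, level in (("L", ERROR_CORRECT_L), ("H", ERROR_CORRECT_H)):
        qr = qrcode.QRCode(error_correction=level)
        qr.add_data("http://www.yahoo.com")
        qr.make(fit=True)  # pick the smallest version that fits the data
        qr.make_image().save(f"yahoo_{name}.png")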
The two sites might use two different versions of the QR code standard.
This picture shows that certain areas of the code hold information about the version and format used, so two QR codes might differ in those areas. I really don't know how QR codes work, but I assume that a different version or format would also mean that the rest of the data is ordered or encoded differently.
http://en.wikipedia.org/wiki/File:QR_Code_Structure_Example.svg
They are the same... the Google and Nokia generators produce identical codes.
Kaywa's looks different to the eye but contains the same info.
Anyway, a QR code is not different on every generation.

Audio encryption/protection [closed]

Other than coding, I spend a lot of my time in a recording studio making music. I intend to sell my art both online and on CD, but I have one issue: protecting the audio file. I don't want people illegally distributing or making copies of my music, so I need to protect it somehow. One way that I've seen is to create my own player, so the tracks can only be played using that player. Using a "PCID" and a private key, the player decrypts the audio and plays it back. However, this will surely chase clients away, because they won't like the restriction of only using my player. Does anybody have any other ideas?
As creator of the music (assuming it is original music) you have copyright for the music, and legal remedies if people make copies and profit from it (or cause you loss). People who pay for music aren't going to bother pirating it, and people who pirate it aren't going to bother paying for it. Your odds of beating them via legal means are probably better than having a foolproof-yet-widespread protection model.
So, in case it wasn't obvious -- don't bother. Popularity/fame might probably bring you more value than your music.
In the end, any protection you devise can and will be broken. Instead of attempting to fight a losing battle, rather look at offering "value added" content to legitimate purchasers (CD sleeves, art elements, etc).
Additionally, you can look at using digital audio watermarks embedded in the audio files. Whilst this won't prevent unauthorised copying, it will allow you to identify the source of the original leak.
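For a feel of the watermarking idea, here is a deliberately naive least-significant-bit sketch on an uncompressed 16-bit WAV. It is a toy to illustrate the concept only: it would not survive MP3 encoding, and real forensic watermarking uses far more robust schemes. File names and the customer ID are placeholders:

    # Toy LSB watermark: hide a per-customer ID in a 16-bit PCM WAV file.
    import wave

    def embed_watermark(src, dst, mark: bytes):
        with wave.open(src, "rb") as w_in:
            params = w_in.getparams()
            frames = bytearray(w_in.readframes(w_in.getnframes()))
        # One bit of the ID goes into the least significant bit of each
        # 16-bit little-endian sample; the change is inaudible.
        bits = [(byte >> i) & 1 for byte in mark for i in range(8)]
        for i, bit in enumerate(bits):
            frames[i * 2] = (frames[i * 2] & 0xFE) | bit
        with wave.open(dst, "wb") as w_out:
            w_out.setparams(params)
            w_out.writeframes(bytes(frames))

    embed_watermark("master.wav", "customer_0042.wav", b"customer-0042")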
Well, no matter what you do, people will find a way around it, and the protection may even stop people from buying your music. What about making a deal with a music label or some other party that can protect your copyright?
I don't know if I'm too late for an answer, but I did code a sort of audio encryption scheme with a player that plays the encrypted audio format. It also offers password protection; there is another version I finished but haven't uploaded that has input and output directories and a GUI. Here is the link for more info:
http://ronaldarichardson.com/2011/04/05/saf-secure-audio-format/
You can't avoid the analog hole.
