Related
I am trying to build an Alexa Skills Kit, where a user can invoke an intent by saying something like
GetFriendLocation where is {Friend}
and for Alexa to recognize the variable friend I have to define all the possible values in LIST_OF_Friends file. But what if I do not know all the values for Friend and still would like to make a best match for ones present in some service that my app has access to.
Supposedly if you stick a small dictionary into a slot (you can put up to 50,000 samples), it becomes a "generic" slot and becomes very open to choosing anything, rather than what is given to it. In practice, I haven't had much luck with this.
It is a maxim in the field of Text To Speech that the more restrictive the vocabulary, the greater the accuracy. And, conversely, the greater the vocabulary, the lower the accuracy.
A system like VoiceXML (used mostly for telephone prompt software) has a very strict vocabulary, and generally performs well for the domains it has been tailored for.
A system like Watson TTS is completely open, but makes up for it's lack of accuracy by returning a confidence level for several different interpretations of the sounds. In short, it offloads much of the NLP work to you.
Amazon have, very deliberately, chosen a middle road for Alexa. Their intention model allows for more flexibility than VoiceXML, but is not as liberal as a dictation system. The result gives you pretty good options and pretty good quality.
Because of their decisions, they have a voice model where you have to declare, in advance, everything it can recognize. If you do so, you get consistent and good quality recognition. There are ways, as others have said, to "trick" it into supporting a "generic slot". However, by doing so, you are going outside their design and consistency and quality suffer.
As far as I know, I don't think you can dynamically add utterances for intents.
But for your specific question, there is a builtin slot call AMAZON.US_FIRST_NAME, which may be helpful.
I have an idea for a business that requires a well designed web application. I'm not a rocket surgeon, but I'm smart enough to know that you get what you pay for and am willing to pay for talent. However, I want the development process to go as smoothly as possible and would like to know how to make that happen.
So, what information do developers need (or want) initially from the owner to avoid having to make assumptions about business (or other) requirements? Do I need to create state transition diagrams or write use cases?
Essentially, how do I take the concept in my head and package it in a way that allows the developer to do what they do best? (assuming that is creating good software. haha)
Any advice is appreciated.
Shawn
You may need to reword your question, as it is too general to get a good answer, so some vague details would be helpful.
But, the better vision you have of what you want the smoother it will be.
I find UML diagrams too confining, when you aren't going to be doing the work, as you may not come up with the best design.
So, if you start with designing out what each page should look like, as you envision it, then you can write up use cases, which are short scenarios.
So, you may write up:
A user needs to be able to log in using OpenID.
This will tell the developer one function that you want, and who you expect to do that action.
But, don't put in technologies, as you may think that a SOAP service is your best bet, but upon talking about it you may find that there is a better solution.
Use cases are good points to show what you are envisioning, and give text to your page designs.
Talk to the developers. Explain what you want and why you want it. Together you make the flow charts and whatnot. Writing requirements is part of the design process, and it's a good idea to have the developers onboard as soon as possible. Start simple and small, then grow and expand while iterating.
In talking over web services before, I have found the best starting point is drawing on a sheet of paper what you think the site will look like, and add in a few arrows from things you want clickable to the pages that should result. Keep it simple, nothing too fancy, and hopefully you and the developer can come to an understanding of what you want pretty quickly.
Use cases might be best for checking off all the points later in the project about how complete your site is; I haven't really found it to be a helpful starting point, but I'm sure others disagree. (They just seem too tedius to read when actually writing code.)
Same with state transition diagrams; they are too tedious and I think most developers will assume you made mistakes in them anyway. :) Everyone else does... Unless your project hinges very tightly on the correctness of a state machine, I wouldn't really bother.
This book contains some good advice on what constitutes a good statement of requirements from a programmers point of view. It also has the useful guideline of not trying to set the form of your requirements too early, and a substantial piece on describing the problem you are trying to solve.
I like UI mockups based on actual program/site flows e.g registering a customer or placing order. Diagrams/pictures of GUIs with structured, consistent data examples are unambiguous.
I agree that UML and use cases are only really useful if everyone speaks UML and the projects are of sufficient complexity (few are).
You may want to read up on Agile/Scrum techniques. These are becoming a sort of standard and when properly managed can save weeks of development time.
I find that words don't do a good job of communicating how a system is supposed to work. Wireframes, white-board drawings/transition diagrams, and low-fidelity prototypes are great ways to communicate a concrete idea. One example of a low-fidelity prototype is a "clickable" paper prototype that allows a user to touch "buttons" on paper to go from one drawing to another. It costs very little time (cheaper), but goes a long way to communicate an idea between two parties.
Stay away from formal documentation, UML diagrams, or class (technical documentation) diagrams that don't speak to you. This is what large, risk-averse companies move toward to be more "mature". These are also byproducts of an idea that is hashed out, and it sounds like you're in the hashing out stage.
I'm new to this field - but I need to perform a WAV-to-MIDI conversion in java.
Is there a way to know what exactly are the steps involved in WAV-to-MIDI conversion?
I have a very rough idea as in you need to;
sample the wav file, filter it, use FFT for spectral analysis, feature extraction and then write the extracted features on to MIDI.
But I cannot find solid sources or papers as in how to do all that?
Can some one give me clues as in how and where to start?
Are there any Open Source APIs available for this WAV-to-MIDI conversion process?
Advance thanks
It's a more involved process than you might imagine.
This research problem is often referred to as music transcription: the act of converting a low-level representation of music (e.g., waveform) into a higher-level representation such as MIDI or even sheet music.
The sophistication of your solution will depend upon the complexity of your input data. Tons of research papers address music transcription only on monophonic piano or drums... because they are easy to transcribe. (Relatively.) Violin is harder. Voice is even harder. Violin plus voice plus piano is much harder. A symphony is nearly impossible. You get the picture.
The basic elements of music transcription involve any of the following overlapping areas:
(multi)pitch estimation
instrument recognition, timbral modeling
rhythm detection
note onset/offset detection
form/structure modeling
Search for papers on "music transcription" on Google Scholar or from the ISMIR proceedings: http://www.ismir.net. If you are more interested in one of the above subtopics, I can point you further. Good luck.
EDIT: That being said, there are existing solutions that we can all find on the web. Feel free to try them. But as you do, evaluate them with a critical eye and ear. What types of audio signals would cause transcription to fail?
EDIT 2: Ah, you are only doing this for piano. Okay, this is doable. Music transcription has advanced to the point where it can transcribe monophonic piano pretty well. A Rachmaninov concerto will still pose problems.
Our recommendations depend upon your end goal. You state "need to perform... in Java." So it sounds like you just want something to work regardless of how it gets you there. In that case, I agree 100% with others: use something that exists.
That's actually an interesting question; all of the MIR libraries I know are typically C/C++/Python/Matlab. But not Java. The EchoNest has a Java API, but I don't think it does note-level transcription. http://developer.echonest.com. (Edit: It does note-level transcription. The returned data includes pitch, timbre, beat, tatum, and more. But I find polyphony is still a problem.)
Oh, Marsyas is Java-based. Cool. I thought it was just C++. http://marsyas.info/ I recommend this. It's developed by George Tzanetakis, a professor in MIR. It does signal-level analysis and should be a good option.
Now, if this is for a fun learning experience, I think you can use the sound manipulation utilities in Java to experiment with the WAV signal and see what comes out.
EDIT: This page describes MIR software better than I can: The Tools We Use
For Matlab, you may be interested in the MIR Toolbox
Here is a nice page of common datasets: MIR Datasets
This is a very big undertaking for being new to the field, unless you mean you are familiar with signal analysis and feature detection in general and want to look more specifically into automatic transcription.
There is no API for WAV to MIDI conversion. Vamp is a framework for feature extraction plugins, but to do automatic transcription you would need to use all the functionality of the existing plugins, plus implement functionality that exists in none of them yet.
Browse through the descriptions of the plugins on the vamp download page, any descriptions you do not understand are topics you should start researching if you want to do this.
If you don't need to automate this task (ie, for a website where people can upload MP3's and get MIDI files back), then you should consider using a tool like Melodyne which is already quite good at going this. As Steve noted, this is a very difficult task to accomplish, and even the best algorithms and solutions present at the moment are not 100% reliable.
So if you are just doing studio work and need to do a few conversions, it will probably save you a bit of time (and lots of headache) to use a tool already designed for this task.
This is a field which is still highly under development, yet, there are some (experimental) algorithms available.
You can install sonic annotator and use a few vamp plugins.
For example:
./sonic-annotator file.wav -d vamp:qm-vamp-plugins:qm-transcription:transcription -w midi
./sonic-annotator file.wav -d vamp:silvet:silvet:notes -w midi
./sonic-annotator file.wav -d vamp:ua-vamp-plugins:mf0ua:mf0ua -w midi
Dolphin, sorry to be brusque, but you have completely underestimated the problem. What you want to achieve - a full piano sound transcription involving all parameters that were used while playing would need an enormous amount of research with people who have worked in the field for many years. Even a group of PhDs in signal processing would have to invest a lot of work to even come close to what you mean. Music transcription has needed decades of work to even work halfway reliable. I'd suggest you pick a different problem which you can manage better than this.
I'm interested in learning more about pattern recognition. I know that's somewhat of a broad field, so I'll list some specific types of problems I would like to learn to deal with:
Finding patterns in a seemingly random set of bytes.
Recognizing known shapes (such as circles and squares) in images.
Noticing movement patterns given a stream of positions (Vector3)
This is a new area of experimentation for me personally, and to be honest, I simply don't know where to start :-) I'm obviously not looking for the answers to be provided to me on a silver platter, but some search terms and/or online resources where I can start to acquaint myself with the concepts of the above problem domains would be awesome.
Thanks!
ps: For extra credit, if said resources provide code examples/discussion in C# would be grand :-) but doesn't need to be
Hidden Markov Models are a great place to look, as well as Artificial Neural Networks.
Edit: You could take a look at NeuronDotNet, it's open source and you could poke around the code.
Edit 2: You can also take a look at ITK, it's also open source and implements a lot of these types of algorithms.
Edit 3: Here's a pretty good intro to neural nets. It covers a lot of the basics and includes source code (albeit in C++). He implemented an unsupervised learning algorithm, I think you may be looking for a supervised backpropagation algorithm to train your network.
Edit 4: Another good intro, avoids really heavy math, but provides references to a lot of that detail at the bottom, if you want to dig into it. Includes pseudo-code, good diagrams, and a lengthy description of backpropagation.
This is kind of like saying "I'd like to learn more about electronics.. anyone tell me where to start?" Pattern Recognition is a whole field - there are hundreds, if not thousands of books out there, and any university has at least several (probably 10 or more) courses at the grad level on this. There are numerous journals dedicated to this as well, that have been publishing for decades ... conferences ..
You might start with the wikipedia.
http://en.wikipedia.org/wiki/Pattern_recognition
This is kind of an old question, but it's relevant so I figured I'd post it here :-) Stanford began offering an online Machine Learning class here - http://www.ml-class.org
OpenCV has some functions for pattern recognition in images.
You might want to look at this :http://opencv.willowgarage.com/documentation/pattern_recognition.html. (broken link: closest thing in the new doc is http://opencv.willowgarage.com/documentation/cpp/ml__machine_learning.html, although it is no longer what I'd call helpful documentation for a beginner - see other answers)
However, I also recommend starting with Matlab because openCV is not intuitive to use.
Lot of useful links on this page on computer vision related pattern recognition. Some of the links seem to be broken now but you may find it useful.
I am not an expert on this, but reading about Hidden Markov Models is a good way to start.
Beware false patterns! For any decently large data set you will find subsets that appear to have pattern, even if it is a data set of coin flips. No good process for pattern recognition should be without statistical techniques to assess confidence that the detected patterns are real. When possible, run your algorithms on random data to see what patterns they detect. These experiments will give you a baseline for the strength of a pattern that can be found in random (a.k.a "null") data. This kind of technique can help you assess the "false discovery rate" for your findings.
learning pattern-recoginition is easier in matlab..
there are several examples and there are functions to use.
it is good for the understanding concepts and experiments...
I would recommend starting with some MATLAB toolbox. MATLAB is an especially convenient place to start playing around with stuff like this due to its interactive console. A nice toolbox I personally used and really liked is PRTools (http://prtools.org); they have an implementation of pretty much every pattern recognition tool and also some other machine learning tools (Neural Networks, etc.). But the nice thing about MATLAB is that there are many other toolboxes as well you can try out (there is even a proprietary toolbox from Mathworks)
Whenever you feel comfortable enough with the different tools (and found out which classifier is perfomring best for you problem), you can start thinking about implementing the machine learning in a different application.
I have the task of developing an online examination software for a small university, I need to implement measures to prevent cheating...
What are your ideas on how to do this?
I would like to possibly disable all IE / firefox tabs, or some how log internet activity so I know if they are googling anwsers...is there any realistic way to do such things from a flex / web application?
Simply put, no there is no realistic way to accomplish this if it is an online exam (assuming they are using their own computers to take the exam).
Is opening the browser window full screen an option? You could possibly also check for the window losing focus and start a timer that stops the test after some small period of time.
#Chuck - a good idea.
If the test was created in Flash/Flex, you could force the user to make the application fullscreen in order to start the test (fullscreen mode has to be user-initiated). Then, you can listen for the Flash events dispatched when flash exits fullscreen mode and take whatever appropriate action you want (end the test, penalize the user, etc.).
Flash/Flex fullscreen event info.
blog.flexexamples.com has an example fo creating a fullscreen-capable app.
Random questions and large banks of questions help. Randomizing even the same question (say changing the numbers, and calculating the result) helps too. None of these will prevent cheating though.
In the first case, if the pool is large enough, so that no two students get the same question, all that means is that students will compile a list of questions over the course of several semesters. (It is also a ton of work for the professors to come up with so many questions, I've had to do it as a TA it is not fun.)
In the second case, all you need is one smart student to solve the general case, and all the rest just take that answer and plug in the values.
Online review systems work well with either of these strategies (no benefit in cheating.) Online tests? They won't work.
Finally, as for preventing googling... good luck. Even if your application could completely lock down the machine. The user could always run a VM or a second machine and do whatever they want.
My school has always had a download link for the Lockdown browser, but I've never taken a course that required it. You can probably force the student to use it with a user agent check, but it could probably be spoofed with some effort.
Proctored tests are the only way to prevent someone from cheating. All the other methods might make it hard enough to not be worth the effort for most, but don't discount the fact that certain types of people will work twice as hard to cheat than it would have taken them to study honestly.
Since you can't block them from using google, you've got to make sure they don't have time to google. Put the questions in images so they can't copy and paste (randomize the image names each time they are displayed).
Make the question longer (100 words or more) and you will find that people would rather answer the question than retype the whole thing in google.
Give them a very short time. like 30-45 seconds. Time to read the question, think for a moment, and click either A, B, C, D, E,
(having just graduated from CSUN I can tell you scantron tests work.)
For essay questions? do a reverse google lookup (meaning put their answer into google as soon as they click submit) and see if you get exact matches. If so, you know what to do.
Will they always take the test on test machines, or will they be able to take the test from any machine on the network? If it will be specific machines, just use the hosts file to prevent them from getting out to the web.
If it is any machine, then I would look at having the testing backend change the firewall rules for the machine the test is running on so the machine cannot get out to the interwebs.
I'd probably implement a simple winforms (or WPF) app that hosts a browser control in it -- which is locked in to your site. Then you can remove links to browsers and lock down the workstations so that all they can open is your app.
This assumes you have control over the workstations on which the students are taking the tests, of course.
As a teacher, I can tell you the single best way would be to have human review of the answers. A person can sense copy/paste or an answer that doesn't make sense given the context of the course, expected knowledge level of the students, content of the textbook, etc, etc, etc.
A computer can do things like check for statistical similarity of answers, but you really need a person for final review (or, alternatively, build a massive statistical-processing, AI stack that will cost 10x the cost of human review and won't be as good ;-))
No, browsers are designed to limit the amount of damage a website or application can do to the system. You might be able to accomplish your goals through Java, an activex control, or a custom plugin, but other than that you aren't going to be able to 'watch' what they're doing on their system, much less control it. (Think if you could! I could put a spy on this webpage, and if you have it open I get to see what other websites you have open?)
Even if you could do this, using a browser inside a VM would give them the ability to use one computer to browse during the test, and if you could fix that they could simply use a library computer with their laptop next to it, or read things from a book.
The reality is that such unmonitored tests either have to be considered "open book" or "honor" tests. You must design the test questions in such a manner that references won't help solve the problems, which also means that each student needs to get a slightly different test so there is no way for them to collude and generate a key.
You have to develop an application that runs on their computer, but even then you can't solve the VM problem easily, and cannot solve the side by side computers or book problem at all.
-Adam
Randomize questions, ask a random set of questions from a large bank... time the answers...
Unless you mean hacking your site, which is a different question.
Short of having the application run completely on the user's machine, I do not believe there is a way to make sure they are not google-ing the answers. Even then it would be difficult to check for all possible loop-holes.
I have taken classes that used web based quiz software and used to work for a small college as well. For basic cheating prevention I would say randomize the questions.
Try adding SMS messages into the mix.
I agree with Adam, that even with the limitations that I suggested, it would still be trivial to cheat. Those were just "best effort" suggestions.
Your only hopes are a strong school honor code and human proctoring of the room where the test is being given.
As many other posters have said, you can't control the student's computer, and you certainly can't keep them from using a second computer or an iPhone along side the one being used for the test -- note that an iPhone (or other cellular device) can bypass any DNS or firewall on the network, since it uses the cellular provider's network, not the college's.
Good luck; you're going to need it.
Ban them from using any wireless device or laptop and keylog the machines?
You could enforce a small time window during which the test is available. This could reduce the chance that a student who knows the answers will be free to help one who doesn't (since they both need to be taking the test at the same time).
If it's math-related, use different numbers for different students. In general, try to have different questions for different copies of the test.
If you get to design the entire course: try to have some online homeworks as well, so that you can build a profile for each student, such as a statistical analysis of how often they use certain common words and punctuations. Some students use semi-colons often; others never, for example. When they take the test you get a good idea of whether or not it's really them typing.
You could also ask a couple questions you know they don't know. For example, list 10 questions and say they must answer any 6 out of the 10. But make 3 of the questions based on materials not taught in class. If they choose 2 or 3 of these, you have good reason to be suspicious.
Finally, use an algorithm to compare for similar answers. Do a simple hash to get rid of small changes. For example, hash an answer to a list of lower-cased 3-grams (3 words in a row), alphabetize it, and then look for many collisions between different users. This may sound like an obvious technique, but as a teacher I can assure you this will catch a surprising number of cheaters.
Sadly, the real trouble is to actually enforce punishment against cheaters. At the colleges where I have taught, if a student objects to your punishment (such as flunking them on the test in question), the administration will usually give the student something back, such as a positive grade change. I guess this is because the student('s parents) have paid the university a lot of money, but it is still very frustrating as a teacher.
The full screen suggestions are quite limited in their effectiveness as the user can always use a second computer or w/ multi monitor a second screen to perform their lookups. In the end it is probably better to just assume the students are going to cheat and then not count online tests for anything important.
If the tests are helpful for the students they will then do better on the final / mid term exams that are proctored in a controlled setting. Otherwise, why have them in the first place...
Make the questions and answers jpeg images so that you cannot copy and paste blocks of text into a search engine or IDE (if it is a coding test). This combined with a tight time limit to answer each question, say three minutes, makes it much harder to cheat.
I second what Guy said. We also created a Flex based examination system which was hosted in a custom browser built in .NET. The custom browser launched fullscreen, all toolbars were hidden and shortcuts were disabled.
Here is tutorial on how to create a custom browser with C# and VB.NET.
This will solve your problem. http://www.neuber.com/usermonitor/index.html
This will allow you to view the student's browser history during and after the test as well as look in on their screen during the test. Any urls visited during test time will be logged, so you can show them the log when you put a big F on their report card. :)
No one can stop people from cheating, but everyone can receive different questions altogether.
I prefer you buy available online scripts in market as starting point for it. This will save you time, cost and testing efforts.
Below is one of the fine scripts that I worked with and it worked like charm. Using this as base I developed a online testing portal of over 1000 users using computer adaptive test.
http://codecanyon.net/item/online-skills-assessment/9379895
It is a good starting point for people looking to develop Online Exam System.
I customized the script with the help of their support.