Breaking CAPTCHAs for a Noble Purpose


CAPTCHAs that ask users to read distorted text are fine for sighted people, but a terrible barrier for those who are blind or have other disabilities. Audio alternatives are occasionally available, but they don't help people who are both deaf and blind, and they can be hard to use with a screen reader (which is already reading words to you).
There exist a couple of solutions that use humans to solve the CAPTCHA on behalf of the user, such as WebVisium and Solona, but these rely on the availability of volunteer operators (for example, Solona apparently has just one volunteer so you have to hope he is awake when you want help).
It occurs to me that the volume of CAPTCHA solutions needed by blind people is very low - I'd guess less than a few hundred per day in a populous country like the UK. This means that unlike the bad folks who want to perform an action many times in a short period, a CAPTCHA assistance service for blind people could afford to devote considerable computational resource - for example, a cloud of computers in Amazon EC2 - to identifying the presented text.
My question is this: assuming you don't care about speed very much, and you have lots of computers available, are there algorithms that let you solve the text-distortion CAPTCHAs that are common today, such as those used by reCaptcha? Or are these problems really intractable even with lots of resource and time?
A few notes:
At this point, my question is just theoretical, but clearly any such service would have to carefully control access to keep spammers out. Perhaps only registered blind people would be allowed to use it.
I am aware that an old Yahoo CAPTCHA was broken a few years ago using an algorithm that runs in seconds on a single computer. I am asking whether modern CAPTCHAs can be broken, perhaps more slowly and with more resource.
I am aware that some new CAPTCHA types are appearing, which ask users to identify kittens or orient a picture. These aren't widespread yet, so I'm just asking about text-distortion for now.

Basically solving a text distortion CAPTCHA consists of three individual steps:
Find out where the interesting parts are
Segment the text into individual letters
Recognize the letters
The only problem that's left which is pretty hard for computers is the second one. The first usually isn't very hard, unless you happen to stumble upon the CAPTCHA from hell. And the third gets solved by computers with a much better success rate than by humans.
An interesting site for learning how CAPTCHAs are broken is the one by the OCR Research Team.
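To illustrate why the segmentation step is the hard one, here is a minimal Python sketch of the most naive approach: cut the image wherever a column contains no ink. The toy bitmaps and the function name segment_columns are made up for illustration; this is an assumption about how one might start, not a description of any real solver.

```python
def segment_columns(image):
    """Split a binary image (list of equal-length strings, '#' = ink)
    into (start, end) column ranges separated by blank columns."""
    width = len(image[0])
    # A column "has ink" if any row has a '#' at that x position.
    ink = [any(row[x] == "#" for row in image) for x in range(width)]
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a new letter begins here
        elif not has_ink and start is not None:
            segments.append((start, x))    # blank column ends the letter
            start = None
    if start is not None:
        segments.append((start, width))    # letter runs to the right edge
    return segments


# Two letters with a gap between them: easy to separate.
separated = [
    "##..#.#",
    "#...###",
    "##..#.#",
]

# The same two letters pushed together: no blank column is left.
touching = [
    "###.#",
    "#.###",
    "###.#",
]

print(segment_columns(separated))
print(segment_columns(touching))
```

With a gap between the letters the cut points fall out immediately; once the letters touch or overlap, the whole word comes back as a single blob, which is exactly what the distortion in modern CAPTCHAs is designed to cause.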

CAPTCHAs were created to prevent machines from detecting the words. They are meant to be read by humans only. Making them more readable for blind/deaf people adds a risk of machines being able to understand them again, thus nullifying their effect.
Spammers did find a very effective way to break the more popular CAPTCHAs, though. They just hire cheap labourers to read them, in return for a few cents per working account. As a result, there's a small industry around breaking CAPTCHAs to create millions of accounts that can then be used to send more spam. Compared to the amount gained by the spammers, the cost is almost nothing. A similar solution could be used by blind/deaf people, who would send the CAPTCHA image to some cheap labourer in China or wherever, who would reply with the correct words so that the blind/deaf person can proceed. Unfortunately, blind people need this service only a few times while spammers need a continuous flow, so those labourers will prefer to work for spammers instead. (The pay is better.) Still, the best solution would be to send the CAPTCHA to a friend, let them read and/or decipher it and return the answer.
The ReCAPTCHA style also reads out the words. A simple speech recognition application might be able to recognise whatever is said, although speech recognition still needs more optimization. Still, you might want to work from that angle, getting the application to listen to the audio clip instead.
When it becomes possible to break CAPTCHAs, their makers will just think of better CAPTCHA-like methods. OCR techniques are still improving, so more work will be done to make CAPTCHAs harder. That is, until OCR has become as good as the human eye at recognizing words...
An algorithm could be created, although it would be slow. With 26 lowercase and 26 uppercase letters and 10 digits, it should not be too difficult to come up with one. With serif and sans-serif fonts, the number of combinations would need to be doubled, though. Still, if you try to curve all letters in a similar way to the letter in the CAPTCHA, you should be able to detect the letter that gets covered the most by the CAPTCHA letter. That would be the most likely candidate. You would still need to clear lines, dirt and other artefacts from the image, which the human eye has less trouble ignoring than a computer does. You'd need the following steps:
1. Clean up the image.
2. Detect the locations of the letters.
3. For every letter:
3a. Determine the curve of the letter by checking the left side.
3b. Do an overlay of every possible letter/digit to find the one that covers it the best. (That's the most likely letter.)
4. Once you've found the word, do a dictionary check to make sure it's a real word. (Unless the CAPTCHA doesn't use real words.)
Even though they can twist the letters in the CAPTCHAs, it should be possible to detect the twist rotation that they used simply by looking at the left side of every letter and then trying to apply the same curve to every candidate letter. (52 combinations, plus 10 digits, if digits are also used.) Basically, you'd try to put a box around every letter, then check which candidate leaves the least amount of white space. That's the most likely letter.
The main reason why this isn't often used for OCR is basically the need for speed. Step 3a/b tends to be slow, especially if you have to take font style in consideration.
Making this answer bigger but in reply to one of the comments:
There are several ways to clean up an image. You'd need some color filtering, noise reduction and an algorithm that's able to recognise the noisy lines through an image. The DEFCON slideshow that you've pointed to shows a few simple techniques to filter away some of the noise. It shows that a basic image processing tool can already make an image a lot clearer for a machine to read. A simple blur will clean up random dots and thin lines, while color filters would filter away the noisy colours. A next step would be to try to put a box around every letter in the CAPTCHA, hoping the system is able to recognise their locations. I don't know any practical algorithms for this, but there should be ways to recognise them. There's software that can create vector images from bitmaps, so there should be software that's able to calculate a box around a letter.
It is likely that this box won't have rectangular corners, thus you would have to distort all 52 letters to match the same box. Italic or bold shouldn't make much of a difference since these styles are just additional distortions. Serif or Sans-serif does make a difference, though. Serif fonts tend to have a few more spikes and ornaments. Fortunately, there are algorithms that can transform a box to any other figure with four corners.
Regular OCR applications will assume that letters are mostly straight and will just check a few hotspots to find a match. Thus, they sometimes get it wrong because of noise. To crack CAPTCHA, you would need a more sensitive match, preferably "XOR-ing" the CAPTCHA letter image with an image of one of the 52 letters, then counting the number of black and white spots to calculate the ratio. Assuming white=1 and black=0, the result of the XOR should be almost black for the best match.
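As a rough illustration of the "XOR-ing" idea, here is a small Python sketch. It assumes the letter has already been cleaned, segmented and undistorted to the same size as the templates, which is of course the hard part. The three tiny 3x5 templates and the names xor_distance and best_match are invented for this example.

```python
# Hypothetical 3x5 templates; a real attempt would need all 52 letters
# (plus digits) rendered in each font, at the size of the cropped letter.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def xor_distance(a, b):
    """Count the pixels where two same-sized bitmaps differ (their XOR)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def best_match(glyph, templates=TEMPLATES):
    """Return the template name whose XOR with the glyph has the fewest set pixels."""
    return min(templates, key=lambda name: xor_distance(glyph, templates[name]))


# A "T" with one stray pixel of noise in the middle row.
noisy_t = ["###", ".#.", ".##", ".#.", ".#."]

print(best_match(noisy_t), xor_distance(noisy_t, TEMPLATES["T"]))
```

The noisy glyph differs from the "T" template in one pixel, from "I" in three and from "L" in eleven, so "T" wins. Counting every pixel like this is more tolerant of a little noise than checking a few hotspots, but it is also why the step gets slow once every font and distortion has to be tried.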
I think several spammers have already found some useful algorithms to crack CAPTCHAs, but for them, keeping these algorithms secret keeps them in business.
Another comment, more text. :-)
Segmentation would be a problem, but it's not impossible to solve. It's just extremely complex. But when you've cleaned the image, it should be possible to calculate two lines: one line that touches the bottom of every letter and a second line that touches the top. However, good CAPTCHAs won't put letters on the same lines any more, but the not-so-good ones could be cracked by just following the lines. (Guess what? ReCAPTCHA puts letters between two lines!) With two lines, you know the first letter will start at the left, so you can try overlaying all 52 possibilities there until you've found a match. When you've found one, move to the right for the second one, and so on until you've read all letters. With two lines to guide you, you don't need a complete box.
Letters tend to use a constant ratio between width and height. With two lines, you can calculate the height of the complete letter and thus get a good estimation of the matching width.
Still, working out the correct algorithm to calculate this all is a bit too much for my poor math skills. You'd need an expert mathematician to crack this algorithm.

My answer to your question "are these problems really intractable even with lots of resource and time?" is to point out that this is the very reason that CAPTCHAs work.
My understanding is that the purpose of a CAPTCHA is to prove that you are human rather than a spam bot. reCAPTCHAs are a novel take on this theme because they use images of text that could not be resolved by OCR (optical character recognition) engines. The difference between a person and a machine in this instance is that specialized algorithms have tried to interpret this image and failed, while a "normal" person has the intrinsic ability to interpret the text in a consistently human way. That being said, we hope that in the future someone will come up with better OCR engines so that less human intervention is needed in digitizing the world's information. We hope that someone will come up with a tractable solution to this particular problem.
From your point of view of trying to make CAPTCHAs more accessible to blind people -- who still need to prove that they're people rather than spam bots -- the community needs to become aware of this issue and find a way to identify people in a less vision centric way.

The introduction of CAPTCHA has certainly made the web less accessible to the visually impaired, and I agree with you in citing this as a significant problem that deserves more attention and concern. However, while CAPTCHA can be and has been inconsistently bypassed on popular web sites, I don't think this is a viable long-term solution for those in need. Indeed, the day that the CAPTCHA variants currently present on sites like Facebook, Google, MySpace etc. can be reliably and consistently broken is the day they will become obsolete and abandoned for either stronger variants of the same or an entirely new solution (as you implied, distinguishing cats from dogs in pictures has been a popular alternative trend).
When it comes to online accessibility, what I think those with disabilities need most right now is advocacy. The more people contact software companies, open source groups, and standards bodies and speak out about this need, the more awareness will be raised and that will (hopefully) lead to more action on behalf of the development community. Ultimately, it would be great to see sites like Google or Facebook offering alternative access methods just for their visually impaired users.
Idealism aside, I think it is productive to pursue other avenues like you mentioned with the CAPTCHA volunteer network, possibly even the development of something like OpenID for those with relevant disabilities as a universal form validation pass.
As for the technical aspect of your question, I don't think the availability of additional processing power alone will allow you to reliably and consistently break CAPTCHA. There is A LOT of money in spam, and you can be sure that shady SEO companies and Spammers alike have a great number of servers at their disposal. As Johannes Rössel mentioned, if you want to learn more about how this is done and where the technical difficulty lies, research Optical Character Recognition (OCR) and look at the wide variety of number/letter skewing that occurs on high traffic sites.

This related SO question has a number of good ideas in it, including a DEFCON talk that claims using multiple OCRs and voting breaks many simple CAPTCHAs. This suggests a candidate solution method: distribute the problem over several servers, each of which runs one or more OCR tools in parallel, collect the results, and take the most popular answer. Comments welcome.
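The voting step could be sketched like this in Python. How the individual OCR tools are run is left out; the function only assumes that each engine has already returned a string (possibly empty when it gave up). The name vote and the idea of reporting the share of agreeing engines are my own assumptions, not something taken from the DEFCON talk.

```python
from collections import Counter


def vote(readings):
    """Pick the most common non-empty reading and the share of votes it got.

    readings: one string per OCR engine (empty string = no answer).
    Ties are broken in favour of the reading that was seen first.
    """
    counts = Counter(r for r in readings if r)
    if not counts:
        return None, 0.0
    text, n = counts.most_common(1)[0]
    return text, n / sum(counts.values())


# Three engines answered, one gave up.
answer, share = vote(["morning", "morning", "rnorning", ""])
print(answer, round(share, 2))
```

A low share could be used as a signal to retry with different preprocessing or to hand the image to a human volunteer instead of guessing.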

Related

Online Classroom: VSat, Leased line or something else?

One of my friends is planning to set up an online classroom sort of environment and is currently evaluating the various ISP/connection options he can have. Though he ideally needs 100% uptime for the internet connection, that can be compromised to something like 99.X% in exchange for good internet speed. Also, since he is just starting up, price is a constraint too, but quality should not be compromised.
A VSat link is one of the options we know might work, but after googling I am very confused about the benefits of a VSat link compared to a leased line. I feel a 2 Mbps leased line (maybe two) can suffice.
What should be the right connection? Any thoughts?

VSat is going to have a noticeable delay if you're doing anything interactive. It's also prone to weather (heavy storms, snow, can affect the quality of the link). Also, are you in a remote area or a developing country? If you're not, then you should just go with some business DSL line.

What is CAPTCHA for security purpose?

What does CAPTCHA do as far as security is concerned? The registration forms of many sites have this field, but how does it work?

Completely Automated Public Turing Test To Tell Computers and Humans Apart
It prevents hackers from posting forms using automatic scripts, by requiring the user to input data read from images which are difficult to read automatically. The text can also be in the form of a sound, as per #BeRecursive's comments. See this site.
It is used for logins as well as on other data entry forms. Here on Stack Overflow, if you edit answers or questions a number of times, you will be prompted before further edits are accepted.
There are two main forms. One has a single combination of characters that the user has to enter, the other, such as on SO has two.
The CAPTCHA with two words usually consists of a word known to the Web Application and a second word that it is trying to decipher. See this site (thanks #Piskvor). The first word is used for validating the user, and the answers to the second word are compared to other users' answers for that word; in this way the probable meaning of the text is determined. This is performed as a public service to organisations such as Libraries and Public Archives that are scanning large numbers of historical documents. The Optical Character Recognition (OCR) is not perfect and sometimes the meaning cannot be determined. So the word is made available in the CAPTCHA of a participating website and the meaning is determined. This process has no effect on the user of the website, as it is only the first word that is used to determine whether they are a robot.
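The two-word scheme can be sketched roughly as follows. This is only an illustration of the idea described above, not the actual service's implementation: the class name TwoWordChallenge, the votes_needed threshold and the case-insensitive comparison are all assumptions.

```python
from collections import Counter


class TwoWordChallenge:
    """One known control word plus one word that OCR could not read."""

    def __init__(self, control_answer, votes_needed=3):
        self.control_answer = control_answer.strip().lower()
        self.votes_needed = votes_needed      # assumed agreement threshold
        self.unknown_votes = Counter()        # guesses for the unread word

    def submit(self, control_guess, unknown_guess):
        """Pass or fail the user on the control word only.

        The guess for the unknown word is recorded only when the control
        word was right, so failed (likely automated) attempts do not vote.
        """
        if control_guess.strip().lower() != self.control_answer:
            return False
        self.unknown_votes[unknown_guess.strip().lower()] += 1
        return True

    def resolved_unknown(self):
        """Return the agreed reading of the unknown word, or None so far."""
        if not self.unknown_votes:
            return None
        word, n = self.unknown_votes.most_common(1)[0]
        return word if n >= self.votes_needed else None


challenge = TwoWordChallenge("overlooks", votes_needed=2)
print(challenge.submit("overlooks", "inquiry"))   # user passes
print(challenge.submit("0verl00ks", "spam"))      # user fails, no vote recorded
print(challenge.submit("Overlooks", "Inquiry"))   # second agreeing vote
print(challenge.resolved_unknown())
```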

Its purpose is as a challenge-response test to demonstrate that the person using it is a human being and not an automated program. It doesn't really "secure" a website, it just makes it increasingly difficult for an automated system to access that functionality of the site. The idea is that some functions (such as posting a comment on a forum) should be done by real humans only and not automated processes.
This complexity can range wildly. There's the common "distorted text" CAPTCHA which requires the user to enter text displayed in an image designed to be difficult for a computer to read, but those are getting increasingly easy to beat with software. For accessibility purposes there are audio CAPTCHAs which play a short clip of a word and the user enters what they hear. I've even seen ones that ask simple questions that any reasonable person should be able to answer but may stump a computer that wasn't prepared for it. Some of my favorites are a matrix of pictures that say "click on the cat" or something else innocuous, which again a computer probably won't be able to do easily but a human would.
See Wikipedia: http://en.wikipedia.org/wiki/CAPTCHA
See Captcha.net: http://www.captcha.net/

A CAPTCHA is a "Completely Automated Public Turing test to tell Computers and Humans Apart". This basically means it is a simple test that makes it easy for a programmer to tell if a user is a computer or a person. It is usually visual and it relies on the fact that object recognition (including characters) is in its infancy at the moment. Recognizing letters is trivial for a human, however.
This ensures that the only users who will be able to fill in the form are those that can easily identify the objects on the CAPTCHA, usually characters. This is generally used to prevent automated form filling by bots (and to prevent spam).

It is an attempt to stop bots from registering on a site. It works by generating an image with text on it; the idea is that it is very difficult (though apparently not impossible) to write a bot that can recognize the text within an image. This is also why the text is in weird fonts (sometimes making it impossible for a human, well me, to read!!)
here is a good link

CAPTCHA is just a riddle in the form of image or sound. "Stupid" bots can't solve the riddle and so they can't enter the correct answer to the riddle. If the correct answer is not entered, then there is no registration. Simple as that:)

Career day in kindergarten: how to demonstrate programming in 20 minutes? [closed]

Closed 10 years ago.
Original Question
I was invited to the kindergarten group of my elder daughter to talk and answer the kids' questions about my profession. There are 26 kids of age 4-6 in the group, plus 3 teachers who are fairly scared of anything related to programming and IT themselves, but bold enough to learn new tricks. I would have about 20-30 minutes, without projector or anything. They have an old computer though, which by its look may be a 486, and I am not even sure if it's functioning (Update: it isn't).
My research turned up excellent earlier threads, with lots of good tips:
How would you explain your job to a 5-year old?
Career Day: how do I make “computer programmer” sound cool to 8 year olds?
What things can I teach a group of children about programming in one day?
My situation is different from each of the above though: the latter ones are concerned with older children, while the first one is about talking to a single kid (or elder person) — a group of 20 is a whole different challenge.
How can I teach the kids and their teachers about programming in a fun way?
Plan Based on Answers
Thanks for all the amazing answers, guys :-) I don't think it makes sense to accept a single answer, but I like Jim's the most, just as the majority of SOers apparently do. However, a lot of other answers contain useful hints and ideas (some of which I will surely use on future Career days in the school...).
I put together a rough plan:
Briefly explain what programming is, like in this answer.
Tell that computers are everywhere, and collect examples with the kids (as suggested in several answers below).
Do Jim's presentation with the sandwiches.
If time allows, build it further:
explain that the strength of computers is that they remember exactly what they were once taught (and demonstrate it by preparing a second sandwich, repeating all the faults of the first attempt)
have a second round trying to fix the bugs in the process
explain the concept of loops: you can make the computer prepare n sandwiches with a single instruction
This is my plan - I am pretty sure it will turn out very differently, so I will improvise according to the situation. The presentation is scheduled in about 2 weeks time - I will update the post afterwards and tell how it actually went...
Results
Finally the day of the presentation arrived today... in brief, all went fine and it was a huge success :-)
The group turned out to be quite restless and energetic this time, so the conversation occasionally went a bit chaotic. I had to cut it short and get to the Big Sandwich Maker Show. Just as Jim described, the kids loved it.
There was one unforeseen side effect though: after the first slice of bread was finally ready, everyone wanted to eat! So for a while - during which I tried to keep up the conversation and explain more about programming - we had to install a sort of emergency service line with the kindergarten teachers to produce immense amounts of marmalade bread and feed the hungry crowd (this was half an hour after breakfast, for the record :-). Then we ran out of bread, which clearly meant the end of the presentation. The biggest burst of laughter erupted when, after cleaning up the mess, the kids noticed that the poor computer had stepped on a patch of marmalade, which ruined his sock :-)
The teachers themselves were also very positively impressed - judging from the feedback, this was the best and funniest Career day in this group so far. Thanks again to all of you for the great ideas!
Things that could be improved (next time):
When I asked "do you think computers are smart?", to my surprise most of them answered "no". I then asked who thinks computers are smart, and why. However I neglected to ask who thinks computers are dumb, and why - thus I think I missed some potentially intriguing answers.
Inviting the kids to come around the table got them actively involved... but maybe a bit too actively at times. Bread slices started to disappear from the table and some of the audience mimicked the computer as closely as dipping their own fingers into the butter and the marmalade :-) So it is better to keep some distance.
To keep the hungry crowd under control, the kids should be clearly told in advance: "you can eat all the bread, but only after the demonstration!"
But overall, I am quite happy with the outcome. And I am sure the kids got the core message: as a programmer, if you avoid creating a mess, you can make your bread (even with marmalade :-)

I've done this before.
I laid down a lot of paper towels on a table, and got out a loaf of (cheap) bread, a small tub of butter, a small jar of jelly, and a plastic butter knife.
I said to the kids, "How many of you think computers are smart?" Most of them raised their hands. I said, "Computers are really dumb. People are smart. You have to tell a computer everything. It doesn't know how to do anything. I'm going to show you what I mean. I'm going to pretend I'm as dumb as a computer, and you guys tell me how to make a sandwich."
And when the first kid said "open the bag of bread!" I ripped the bag apart and let bread fall randomly all over the table. That got a lot of giggles. I continued to take the kids literally at their words until they learned to give short, specific commands, and eventually we ended up with a butter and jelly sandwich. There was a lot of laughter but they came away understanding, at least a little, what a programmer does for a living.
(I should note, I've also done this demonstration with adults in an "intro to programming" class, and it works just as well with them.)

What about doing a kinesthetic version of Logo?
Say you have two kids side by side. Can they figure out how to switch places using only the commands Step Forward, Step Back, Turn Left 90 Degrees, and Turn Right 90 Degrees? I'm sure there are other games like going through a maze, etc.
I'd think you'd keep their attention if you can keep them moving. This will spark the interest. They'll figure out later that the job is sedentary. ;)
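If you want to check a solution to the switching-places puzzle before trying it on real kids, a tiny simulation is enough. This Python sketch assumes both kids start facing "north" on a grid of floor tiles, one tile apart; the function name walk and the command spellings are made up for illustration, and it does not check whether the kids bump into each other on the way.

```python
# Headings in clockwise order: north, east, south, west.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def walk(position, commands, heading=0):
    """Follow Logo-style commands; return the final (x, y) and heading index."""
    x, y = position
    for command in commands:
        if command == "left":
            heading = (heading - 1) % 4      # turn left 90 degrees
        elif command == "right":
            heading = (heading + 1) % 4      # turn right 90 degrees
        elif command in ("forward", "back"):
            dx, dy = HEADINGS[heading]
            sign = 1 if command == "forward" else -1
            x, y = x + sign * dx, y + sign * dy
        else:
            raise ValueError(f"unknown command: {command}")
    return (x, y), heading


# Kid A starts on the left tile, kid B on the right tile.
kid_a = walk((0, 0), ["forward", "right", "forward", "left", "back"])
kid_b = walk((1, 0), ["left", "forward", "right"])
print(kid_a, kid_b)
```

Kid A steps out of the line first and walks around, while kid B simply sidesteps into the free spot; both end up facing the same way they started.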

Don't try to show them anything on the computer. Watching someone else type is boring for adults. For 5-year-olds, it's a recipe for anarchy.
Instead, make it interactive. Some form of "Simon Says," but have them be the programmer.

I've never tried this, but it might be fun.
Physically demonstrate an algorithm by using some attribute of each kid as the input data.
For example, get them to form a line (in whatever order they go to initially), side by side. This might work better in a semi-circle so they can see each other doing the exercise, but there has to be a break in the line somewhere. Then, starting at one end of the line, get them to take turns doing "if the classmate on your left is taller than you, switch places; otherwise, stay put." The game ends when you go through the line and no one switches places. Get them to observe the results. (Hint: bubble sort!)
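For the grown-ups, the game above corresponds to this short Python sketch of bubble sort. The heights are invented sample data and line_up_by_height is just an illustrative name; "left" is taken to mean the neighbour earlier in the list.

```python
def line_up_by_height(heights):
    """Repeat passes down the line until a full pass has no swaps.

    Returns the sorted line and the number of passes walked, including
    the final pass in which nobody moved.
    """
    line = list(heights)
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(1, len(line)):
            # "If the classmate on your left is taller than you, switch places."
            if line[i - 1] > line[i]:
                line[i - 1], line[i] = line[i], line[i - 1]
                swapped = True
    return line, passes


# Heights in centimetres, in the order the kids happened to line up.
print(line_up_by_height([110, 104, 118, 101]))
```

With these four kids it takes three passes with swaps plus one quiet pass to confirm that everyone is in place, which is a nice moment to ask the class how they know the game is over.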

Make them write short programs for you to do simple things (like enter the room and take a seat) and then execute them literally to demonstrate the "bugs" -- where they were not specific enough or didn't take something into account, so that you will do things wrong. Try not to hurt yourself in the process. It should be funny and will get them a pretty good idea of what an algorithm is.

To turn the kids onto programming, you drive up to the kindergarten in your Rolls Royce and walk in with your gorgeous significant other.
If you're not Bill Gates, then you'll just have to explain that you sit in boring meetings for 4 hours a day, print cover sheets for TPS reports for 2 hours, and stare at stupid stuff written by preceding clueless programmers for the other 6 hours. (No need to mention that then you field calls from people who are maintaining your last program and who think YOU are the preceding clueless guy).
No, I'm not bitter, why do you ask?
Seriously, (I am sure I'm plagiarizing from one of those 3 threads subconsciously), have them play "give instructions to me on how to do Y", with you doing things the Genie way - all wrong unless instructions are very precise and clear. Actually mention genie as good example assuming the kids saw Aladdin.
;^)

I think you could do the following demonstration in 20 minutes. Maybe it's more suited for older children. I don't really know what kindergarteners are capable of. I'd personally avoid trying to explain programming, and instead describe a problem that we as programmers solve. For example, if there are enough children, you can demonstrate the Internet to them interactively.
Part I: How it Works
First describe to them, preferably with props, how the Internet works. Bring in a laptop connected by a cable (for visual effect) to a home router. Tell how computer programmers make all sorts of devices, including the programs on the laptop, the program in the router, and applications in other devices connected to the Internet, like cell phones.
Explain how computers aren't connected directly to each other because it's impossible to connect a cable from every computer in the world to every computer. You'd need a billion cables in your house. So instead, computers connect to routers. And routers give packets of data (for example, e-mails, pictures, or videos) to other routers until it finally gets to the other computer.
Describe the rules for a computer to talk to another:
A computer can only give a packet to its router.
A router can give a packet to the computers connected to it, or to the nearest router.
This explanation should be very short, but emphasize the rules. You should probably equate packets with e-mail or pictures.
Part II: Interactive Time
Then have 3 children volunteer to be routers. Everyone else is a computer; divide them up evenly among the routers. It'd help to have colored cards they can hold. For example, the person holding the dark blue card is the router that can talk to all the people holding light blue cards. Let's say you give out blue, red, and yellow cards.
Arrange the "routers" in a line, blue, then red, then yellow. The blue router will then have to give a packet to the red router to give it to the yellow router. Group the other kids around their routers.
Bring "packets" for each child. Mix it up with photos, letters, a print-out of tic-tac-toe to symbolize a game, or whatever. Start by having a single blue computer send to a yellow computer.
"Ashley, pick a yellow computer that you want to send your picture to. OK, to send the picture to Brian, you have to give it to your router, Kelly. Tell Kelly who should get the picture. Kelly, you are blue, so you can't give the picture to Brian. You have to give it to Timmy. Tell Timmy who should get the picture. Timmy is red, so he can't give it to Brian. He has to give it to Renee. Renee, you can give the picture to Brian since he is a yellow computer and you are the yellow router."
Then have everyone think of one person to send their "packet" to, and watch your impromptu network in action.
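If you want to predict beforehand which router will be swamped, the game is easy to simulate. This Python sketch assumes the blue, red, yellow line of routers described above and treats every computer as simply belonging to its router's color; the names route and router_load are invented for the example.

```python
from collections import Counter

# Routers stand in a line; a packet can only move to the neighbouring router.
ROUTER_LINE = ["blue", "red", "yellow"]


def route(sender_color, receiver_color, routers=ROUTER_LINE):
    """Return the routers a packet passes through, in order."""
    start = routers.index(sender_color)
    end = routers.index(receiver_color)
    step = 1 if end >= start else -1
    return [routers[i] for i in range(start, end + step, step)]


def router_load(packets):
    """Count how many packets each router has to handle."""
    load = Counter()
    for sender_color, receiver_color in packets:
        load.update(route(sender_color, receiver_color))
    return load


print(route("blue", "yellow"))   # Ashley's picture for Brian

packets = [("blue", "yellow"), ("yellow", "blue"),
           ("red", "blue"), ("red", "yellow")]
print(router_load(packets))
```

Every packet that crosses between the outer groups has to pass the router in the middle, so that child ends up with the most work.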
Part III: Relate back to computer programming
To conclude, ask the routers whether it was easy to be a router, or hard because there were a lot of people trying to give them pictures at one time. Point out where things went wrong and tie it into real problems that we solve.
"I could see that Timmy was overloaded with packets because everyone had to send their packet through him. As computer programmers, we have to solve problems like this every day. One way we could solve it is to give Timmy 4 arms. Or maybe add another router so that if Timmy has too many packets to deliver, you could give it to a different router instead." Or "Maybe we want pictures to be delivered faster, so we could ask the router to deliver the picture first before delivering any other packets."

To kind of borrow from the other ideas already posted, a game of Simon Says may be the way to go. However, you can stress how computers will do EXACTLY what you tell them to do. So, if the kids are Simon, and they say, "Simon says sit down." then you just sit down on the floor (not in a nearby chair or anything). Follow instructions to the letter and not to the spirit. (Of course, this may be tricky getting the kids to give ambiguous instructions, but I'm sure you can come up with something.)
Other than that, you could also talk about video games or other computer "things" that the kids may have used and you can say that programmers, like yourself, create those. And then maybe jump into the Simon Says to show how it works. Of course, this could result in a bunch of kids growing up thinking that you spend your entire day at work playing Simon Says with a computer...

I sometimes regard my job as playing with Lego bricks. You start with a set of bricks of different sizes, shapes and colors, and from that you build larger things. You can build castles or star wars robots using the same set of bricks.
And, it's about the same amount of fun!

One of the major perks of programming is the ability to create things. To make dreams come true. I don’t think this will appeal very much to small children who have no problem to let their imagination roam free anyway. What do computers bring to the table?
Instead, you could probably interest them in problem-solving, puzzles. The kind of thinking that is needed for programming. I probably wouldn’t use a computer at all; instead, let them solve an engaging mathematical puzzle. It doesn’t have to be hard but it should involve creative thinking.

When I try to explain programming in a short amount of time to people who aren't familiar with it, I explain it using Legos. With Legos you have a bunch of simple pieces; these are like the programming language. Then you can piece them together however you want and make anything that you can imagine, as long as you have the correct pieces.
To adults and kids alike this is likely to be a very interesting analogy, and it still demonstrates the concept of programming.
You could even build a Lego car poorly, then display a Lego car with a very nice design, and show them that programming is just like this. You can program cars or robots or whatever you can imagine, and there's not only one way to do it. There are many ways, some better than others.
I have gotten so many people to begin programming and even switch their majors with this analogy. :)

I think I'd begin by talking for 2-3 minutes about computers, and that they follow instructions about what to do.
Then I'd demonstrate with a prebuilt LEGO Mindstorms robot and program it a couple of times and run it, just to show them that it follows the program. Mindstorms programming is pretty visual and simple to grasp.
Finally I'd try to explain that there are computers running programs almost everywhere, even in traffic lights, microwave ovens and their favourite toys.

Talk about how pervasive computer programming is - it guides airlines, phones, cars, how you buy your tickets online etc.
Then teach them to write a simple program symbolically -
1. Draw a grid on the blackboard.
2. Draw cheese at one end, and a mouse at the other end.
3. Have them "program" the mouse to get the cheese!
Walk them through their failed attempts as a class, maybe have the mouse fall in traps or something in the grid. They would get a thrill out of it.
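If you want to replay the kids' "programs" on a laptop afterwards, the blackboard game can be sketched in a few lines of Python. This is only an illustration: the grid size, the trap positions, and the command words (`up`, `down`, `left`, `right`) are made up here, not part of any particular classroom kit.

```python
# Hypothetical board: the mouse starts top-left, the cheese is bottom-right.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def run_mouse(program, size=4, start=(0, 0), cheese=(3, 3),
              traps=frozenset({(1, 1), (2, 3)})):
    """Follow the instructions exactly, one at a time, and report what happened."""
    row, col = start
    for step, command in enumerate(program, start=1):
        d_row, d_col = MOVES[command]
        row, col = row + d_row, col + d_col
        if not (0 <= row < size and 0 <= col < size):
            return f"step {step}: the mouse ran off the board"
        if (row, col) in traps:
            return f"step {step}: the mouse fell into a trap"
        if (row, col) == cheese:
            return f"step {step}: the mouse got the cheese!"
    return "the program ended before the mouse found the cheese"
```

With these made-up defaults, `run_mouse(["right", "down"])` walks the mouse straight into the trap at step 2, while `["down", "down", "down", "right", "right", "right"]` reaches the cheese at step 6. The point for the class is the same as on the blackboard: the mouse never "figures out" what you meant.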

How to teach kids what programming is?
Well, the first step is likely to get some cows involved!
Download a simple programming game (like IQ Marathon) onto the laptop and hook that up to a projector. While you're doing this you can talk about how being a programmer often means working with recent technology (and thereby giving a demonstration of you doing so).
Once you've got it set up (practice so you can make it work in 5 minutes or less), you can use the game to show very visually (and with cows!) how the computer only does exactly what you tell it to, and how you (the programmer) have to figure out what instructions are necessary to make it do what you want. When you get it right, everybody is so happy about your success that there are dancing cows!
From there you can answer any questions, or perhaps just let the kids try and figure out how to program cows themselves. Wherever they want to go, really.
Cows!

Give each child a cut out shape; circles, squares, triangles, different colors etc. Explain how programming is giving instructions in specific order. Hold up a picture of a smiley face and walk the kids through how to construct it. Yellow circle, black dot, black dot, arc. Then show a more complicated picture, and have the kids come up in order based on your instructions. You can even make a mistake (like putting the yellow circle over the black dots) to show how 'Bugs' creep into a program.

Demonstrate a simple LEGO Mindstorms robot and its corresponding flow chart. You won't have to show them any code, and they can see the end result of your logic by watching the robot execute your program.

Kids like things that "do something", and flashing lights.
For my son's birthday, I made a safe (a box with an electric lock and lots of LEDs) that was connected to the PC.
The kids had some questions to answer, and each response resulted in flashing LEDs (green for right answers and red for wrong ones). If they answered enough questions right, the LEDs started a simple animation which ended with a loud "clonk". The safe was now open and they could collect their rewards.
It was fun to build and the kids loved it.

Sell them on the value of unattended automation. Have a kid walk to the front of the room and show the class what he does each night when he's brushing his teeth. Then have that same kid show you what he'd be doing during that time if he didn't have to brush his teeth.
Then tell that kid that you know how to move that brush across his teeth while he's doing that other thing that he'd rather be doing, and tell him he'll never even feel it. His teeth will just magically be clean next time his mother goes to inspect them.
Then maybe write some pseudocode on the chalk board that shows the Brush API accessing the Tooth resource in a background thread behind the Favorite activity.
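The chalkboard pseudocode could look something like the following Python sketch. Everything in it is invented for the joke: there is no real "Brush API", and the `brush`, `favorite_activity`, and `bedtime` names are purely illustrative. It only shows the idea of one task running in a background thread while another runs in front.

```python
import threading
import time


def brush(teeth, log):
    """Hypothetical 'Brush API': cleans each tooth in the background."""
    for name in teeth:
        time.sleep(0.01)          # pretend brushing takes a moment
        teeth[name] = "clean"
    log.append("brushing done")


def favorite_activity(log):
    """Whatever the kid would rather be doing (runs in the foreground)."""
    for page in range(1, 4):
        time.sleep(0.01)
        log.append(f"read comic page {page}")


def bedtime():
    teeth = {"front": "dirty", "left molar": "dirty", "right molar": "dirty"}
    log = []
    worker = threading.Thread(target=brush, args=(teeth, log))
    worker.start()                # brushing starts in the background...
    favorite_activity(log)        # ...while the fun thing happens up front
    worker.join()                 # wait for brushing to finish before inspection
    return teeth, log
```

Calling `bedtime()` returns the teeth (all clean) and a log of what happened; the order of the log entries depends on how the two threads interleave.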

How to present your code to potential buyers?

I'm going to demo my code to a slightly non-technical audience, and I need to show them what I've got in my project (about 15K lines of code). I'm trying to convince them that I've spent time on the project and that it's in a good state.
These guys are planning to invest money in this product. So first, I need to convince them that the app is worth the price they are going to pay and justify the time I've spent on it. Second, they should see that this kind of work takes time and that I know what I'm doing (basically, I need to win their trust).
What metrics can I use other than "lines of code"? (Maybe lines of comments?)
What are the best tools (preferably free) to generate a report from .NET Projects?
UPDATE :
Also a way to provide "project cost - cocomo" would be cool, like this one :
FOUND:
http://www.cms4site.ru/utility.php?utility=cocomoii will help you to calculate an estimated cost for your project.

If they're non-technical, it won't matter. It will be like trying to sell a high-end bike to people who don't know a bike from a car. 15k lines of code won't matter to them any more than 300k lines of code will.
You need to find something other than the actual code to wow them with.
Can you code up some demos and tell them how little time it will take them to build similar applications with your code? Like "If you use my code, you can build this multimedia application in 15 minutes without writing more than a few lines of code". Non-technical people generally love saving time and money.
It probably depends on how "slightly" they are in the non-technical department.

An investor only cares about money. Investors start at the exit and work backwards. Knowing this, pitch your project in terms of the return they will get on their investment.
Key points would include:
Your expertise: Do you know the market you want to sell into? Are you leveraging your expertise in some way to make the project a reality?
Risk: Using your already existing code base lowers risk in terms of both time and money. They will probably do technical due diligence to validate your claims, so be honest here.
Time to Market: Having a code base in place will reduce their time to market, which may be significant.
Vision: They need to know that there is a future for your product. This is your chance to get them excited!
Investment is about the future, not the past, so understand that you need to achieve what you are promising. The path you trod to get to where you are now may be interesting, but it is largely irrelevant to the investor. What I'm trying to say is: sell the vision, not where you are now or where you've been.
Good luck and hope you get what you need!

It's not clear to me from your question whether you're talking about people who would buy the use of your product or ownership of your product.
In either case, ask yourself these questions:
"What problem(s) does this product solve for my users, from their point of view?"
"What does this product let the users do, that they already want to do, but can't do without it?"
"What does this product let the users do, that they already want to do, but can't do as easily without it?"
Features don't matter. Menus and dialogs don't matter (unless they require explanation, in which case they matter in the negative sense).
If you want numbers that interest a potential buyer of (an instance of) the product, talk in terms of how much time or money the buyer can save by using your product.
If you want numbers that interest a potential buyer of shares in your company or product, talk in terms of the size of the market, how you've analyzed that market's needs, and the ROI of any investment.

I've had success showing potential customers our automated build cycle, in slideshow form. I took them through our "production line" as if it were a factory tour, and showed the nice colored bars of coverage reports, upward-trending graphs of lines of code over time, and pie charts breaking down lines of code per module.
Then I did the same for everything around the actual building. So there's a requirements pipeline where they are involved, and a test/validation cycle where they are again involved.
It may not mean anything to them, but it shows them you have control over your process, and control over the quality of the delivered end product.
Please note that although people may be non-technical, try to be as honest as possible. As soon as they discover one single tiny lie in your story, you're lost. And chances are that there's that one technical guy in the back who can ask that one question which makes your house of cards fall down.
Happy sales!

"good code" doesn't matter unless you are demonstrating the medium and long term advantages of it - enhanced flexibility, simplicity, which saves customer time/money while adding agility.

I think explaining the more complex aspects of the code and the work that went into it to any audience will help show how much work and effort have gone into a project.
Hours spent coding could be a good metric to give them.

Talk about the features. Explain what you have working or almost working. Go at it from what they are interested in.
Try to show them visuals that they care about if you can. I think a few minutes doodling on a board would be better than showing lines of code.

The only thing that is likely to matter to a buyer (particularly a non-technical one) is functionality. I would concentrate on selling the features. You might consider discussing how you have tested it to verify that it performs as you claim.

I wouldn't use code per se, since a non-techie wouldn't understand it. Boasting about quantity is probably meaningless (how does a non-techie know that a 1MLOC project is significant?). As for quality, you can present, e.g., maintainability metrics, test coverage, things like that. Feel free to show off your excellent toolchain too (continuous integration and all that), and your mastery of various performance-testing tools. Also, showing things like Workflow Foundation helps - customers like to see how their business processes can be turned directly into code with a diagram notation.

EDIT modified to reflect OP's clarification (in comment here) that these potential buyers are looking to re-sell the software
Re-sellers are going to be looking for three things:
1. Is anyone going to make something better, cheaper or more quickly?
2. Is this guy going to be able to use our investment effectively to produce more?
3. Can we sell what this guy has produced, and will produce?
Points 1 and 2 have been very well addressed in other answers, but it's point 3 which is the hardest to prove for us techie people. It's also extremely important - if you can go to these buyers and hand them 3 killer benefits which they can repeat with more flair and Powerpoint when they're doing their sales calls, you'll be off to a good start :)
The main thing you have to do is to take a step back from your work and look at the:
features: what does it do
advantages: why is it better
benefits: why should the customer care
Features are closest to what you care about as a developer, but are pretty much irrelevant to non-technical buyers. Advantages are an essential step in understanding your competition and the customers' alternatives.
By putting features and advantages together, you can hit the customer with a number of benefits, e.g.:
using my software will save you $0.01 per transaction, or $40,000 p.a.
my software will increase customer retention by 5%
your admins will need 15% less time to deploy changes using my software
These are the things that customers care about: what's going to be good for the company, and good for them.
To be brutally honest: end customers don't care how much effort you put into it (LoC or any other metric), they don't care how well it's written (comments, tests, any other metric), they don't care how hard a problem it is to solve, and they don't care about features.
Their only requirement is that it will save them time / effort / money. You know that how hard you've worked to solve the problem, and solve it well, is key to meeting that requirement, but to them it's secondary. You need to make it perfectly clear why buying your stuff will mean they'll get promoted.

For COCOMO - Project Cost Estimation
I found this website, it's kind of a manual process but it'll do.
http://www.cms4site.ru/utility.php?utility=cocomoii
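For a rough idea of what such a calculator does, here is a small Python sketch. Note the assumption: it uses the older and simpler Basic COCOMO model (Boehm, 1981), not the COCOMO II model the linked page advertises, and the `monthly_rate` parameter is a placeholder you would fill in yourself.

```python
# Basic COCOMO (Boehm, 1981) coefficients (a, b, c, d) per project class.
# This is NOT COCOMO II, which needs many more inputs (scale factors,
# cost drivers); it is only the classic back-of-the-envelope version.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode="organic"):
    """Return (effort in person-months, schedule in months) for `kloc` thousand lines."""
    a, b, c, d = COEFFICIENTS[mode]
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule


def estimated_cost(kloc, monthly_rate, mode="organic"):
    """Effort multiplied by a (hypothetical) fully loaded monthly rate."""
    effort, _ = basic_cocomo(kloc, mode)
    return effort * monthly_rate
```

For the 15K lines mentioned in the question, `basic_cocomo(15)` with the organic coefficients comes out at roughly 41 person-months spread over about 10 calendar months. Treat that as an order-of-magnitude figure, not a valuation.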

Developing an online exam application, how do I prevent cheaters?

I have the task of developing online examination software for a small university, and I need to implement measures to prevent cheating...
What are your ideas on how to do this?
I would like to possibly disable all IE / Firefox tabs, or somehow log internet activity so I know if they are googling answers... is there any realistic way to do such things from a Flex / web application?

Simply put, no, there is no realistic way to accomplish this if it is an online exam (assuming they are using their own computers to take the exam).

Is opening the browser window full screen an option? You could possibly also check for the window losing focus and start a timer that stops the test after some small period of time.

@Chuck - a good idea.
If the test was created in Flash/Flex, you could force the user to make the application fullscreen in order to start the test (fullscreen mode has to be user-initiated). Then, you can listen for the Flash events dispatched when flash exits fullscreen mode and take whatever appropriate action you want (end the test, penalize the user, etc.).
Flash/Flex fullscreen event info.
blog.flexexamples.com has an example of creating a fullscreen-capable app.

Random questions and large banks of questions help. Randomizing even the same question (say changing the numbers, and calculating the result) helps too. None of these will prevent cheating though.
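As a rough illustration of both ideas (drawing from a bank and varying the numbers in the same question), here is a Python sketch. The question text, the `exam_id` value, and the function names are invented for the example; the only real technique is seeding the random generator per student so the same student always sees the same variant and the grader can recompute the expected answer.

```python
import random


def make_question(student_id, exam_id="midterm-1"):
    """Same question template, different numbers for each student."""
    # Seeding with exam + student makes the variant reproducible for grading.
    rng = random.Random(f"{exam_id}:{student_id}")
    speed = rng.randint(40, 90)
    hours = rng.randint(2, 6)
    text = (f"A train travels at {speed} km/h for {hours} hours. "
            "How far does it go, in km?")
    return text, speed * hours


def pick_questions(bank, student_id, count, exam_id="midterm-1"):
    """Draw `count` distinct questions from a larger bank, per student."""
    rng = random.Random(f"{exam_id}:{student_id}:bank")
    return rng.sample(bank, count)
```

As the answer says, this raises the cost of sharing answers but does not prevent it: one student who solves the general case can still hand the formula to everyone else.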
In the first case, if the pool is large enough that no two students get the same question, all that means is that students will compile a list of questions over the course of several semesters. (It is also a ton of work for the professors to come up with so many questions. I've had to do it as a TA; it is not fun.)
In the second case, all you need is one smart student to solve the general case, and all the rest just take that answer and plug in the values.
Online review systems work well with either of these strategies (no benefit in cheating.) Online tests? They won't work.
Finally, as for preventing googling... good luck. Even if your application could completely lock down the machine, the user could always run a VM or a second machine and do whatever they want.

My school has always had a download link for the Lockdown browser, but I've never taken a course that required it. You can probably force the student to use it with a user agent check, but it could probably be spoofed with some effort.
Proctored tests are the only way to prevent someone from cheating. All the other methods might make it hard enough to not be worth the effort for most, but don't discount the fact that certain types of people will work twice as hard to cheat as it would have taken them to study honestly.

Since you can't block them from using Google, you've got to make sure they don't have time to google. Put the questions in images so they can't copy and paste (randomize the image names each time they are displayed).
Make the questions longer (100 words or more) and you will find that people would rather answer the question than retype the whole thing into Google.
Give them a very short time, like 30-45 seconds: time to read the question, think for a moment, and click A, B, C, D, or E.
(Having just graduated from CSUN, I can tell you scantron tests work.)
For essay questions, do a reverse Google lookup (meaning put their answer into Google as soon as they click submit) and see if you get exact matches. If so, you know what to do.

Will they always take the test on test machines, or will they be able to take the test from any machine on the network? If it will be specific machines, just use the hosts file to prevent them from getting out to the web.
If it is any machine, then I would look at having the testing backend change the firewall rules for the machine the test is running on so the machine cannot get out to the interwebs.

I'd probably implement a simple WinForms (or WPF) app that hosts a browser control in it -- which is locked in to your site. Then you can remove links to browsers and lock down the workstations so that all they can open is your app.
This assumes you have control over the workstations on which the students are taking the tests, of course.

As a teacher, I can tell you the single best way would be to have human review of the answers. A person can sense copy/paste or an answer that doesn't make sense given the context of the course, expected knowledge level of the students, content of the textbook, etc, etc, etc.
A computer can do things like check for statistical similarity of answers, but you really need a person for final review (or, alternatively, build a massive statistical-processing AI stack that will cost 10x the cost of human review and won't be as good ;-)).

No, browsers are designed to limit the amount of damage a website or application can do to the system. You might be able to accomplish your goals through Java, an ActiveX control, or a custom plugin, but other than that you aren't going to be able to 'watch' what they're doing on their system, much less control it. (Think if you could! I could put a spy on this webpage, and if you have it open I get to see what other websites you have open?)
Even if you could do this, using a browser inside a VM would give them the ability to use one computer to browse during the test, and if you could fix that they could simply use a library computer with their laptop next to it, or read things from a book.
The reality is that such unmonitored tests either have to be considered "open book" or "honor" tests. You must design the test questions in such a manner that references won't help solve the problems, which also means that each student needs to get a slightly different test so there is no way for them to collude and generate a key.
You have to develop an application that runs on their computer, but even then you can't solve the VM problem easily, and cannot solve the side by side computers or book problem at all.
-Adam

Randomize questions, ask a random set of questions from a large bank... time the answers...
Unless you mean hacking your site, which is a different question.

Short of having the application run completely on the user's machine, I do not believe there is a way to make sure they are not google-ing the answers. Even then it would be difficult to check for all possible loop-holes.
I have taken classes that used web based quiz software and used to work for a small college as well. For basic cheating prevention I would say randomize the questions.

Try adding SMS messages into the mix.

I agree with Adam, that even with the limitations that I suggested, it would still be trivial to cheat. Those were just "best effort" suggestions.

Your only hopes are a strong school honor code and human proctoring of the room where the test is being given.
As many other posters have said, you can't control the student's computer, and you certainly can't keep them from using a second computer or an iPhone alongside the one being used for the test -- note that an iPhone (or other cellular device) can bypass any DNS or firewall on the network, since it uses the cellular provider's network, not the college's.
Good luck; you're going to need it.

Ban them from using any wireless device or laptop and keylog the machines?

You could enforce a small time window during which the test is available. This could reduce the chance that a student who knows the answers will be free to help one who doesn't (since they both need to be taking the test at the same time).
If it's math-related, use different numbers for different students. In general, try to have different questions for different copies of the test.
If you get to design the entire course: try to have some online homeworks as well, so that you can build a profile for each student, such as a statistical analysis of how often they use certain common words and punctuations. Some students use semi-colons often; others never, for example. When they take the test you get a good idea of whether or not it's really them typing.
You could also ask a couple questions you know they don't know. For example, list 10 questions and say they must answer any 6 out of the 10. But make 3 of the questions based on materials not taught in class. If they choose 2 or 3 of these, you have good reason to be suspicious.
Finally, use an algorithm to compare for similar answers. Do a simple hash to get rid of small changes. For example, hash an answer to a list of lower-cased 3-grams (3 words in a row), alphabetize it, and then look for many collisions between different users. This may sound like an obvious technique, but as a teacher I can assure you this will catch a surprising number of cheaters.
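A minimal Python sketch of that 3-gram comparison follows. It makes a few assumptions beyond what is described above: it stores the 3-grams in a set rather than an alphabetized list (which makes counting shared 3-grams a simple intersection), it scores a pair by the fraction of the shorter answer's 3-grams that also appear in the other, and the 0.5 threshold is an arbitrary starting point you would tune on real submissions.

```python
import re
from itertools import combinations


def three_grams(answer):
    """Lower-case the answer, drop punctuation, and collect every 3-word run."""
    words = re.findall(r"[a-z0-9']+", answer.lower())
    return {" ".join(words[i:i + 3]) for i in range(len(words) - 2)}


def overlap(answer_a, answer_b):
    """Share of the shorter answer's 3-grams that also occur in the other one."""
    a, b = three_grams(answer_a), three_grams(answer_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def suspicious_pairs(answers, threshold=0.5):
    """`answers` maps user -> answer text; return pairs at or above `threshold`."""
    flagged = []
    for (user_a, text_a), (user_b, text_b) in combinations(sorted(answers.items()), 2):
        score = overlap(text_a, text_b)
        if score >= threshold:
            flagged.append((user_a, user_b, score))
    return flagged
```

Lower-casing and stripping punctuation is what absorbs the "small changes": two answers that differ only in capitalization or a trailing exclamation mark produce identical 3-gram sets. A flagged pair is a prompt for human review, not proof.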
Sadly, the real trouble is to actually enforce punishment against cheaters. At the colleges where I have taught, if a student objects to your punishment (such as flunking them on the test in question), the administration will usually give the student something back, such as a positive grade change. I guess this is because the student('s parents) have paid the university a lot of money, but it is still very frustrating as a teacher.

The full screen suggestions are quite limited in their effectiveness, as the user can always use a second computer or, with a multi-monitor setup, a second screen to perform their lookups. In the end it is probably better to just assume the students are going to cheat and then not count online tests for anything important.
If the tests are helpful for the students they will then do better on the final / mid term exams that are proctored in a controlled setting. Otherwise, why have them in the first place...

Make the questions and answers jpeg images so that you cannot copy and paste blocks of text into a search engine or IDE (if it is a coding test). This combined with a tight time limit to answer each question, say three minutes, makes it much harder to cheat.

I second what Guy said. We also created a Flex-based examination system which was hosted in a custom browser built in .NET. The custom browser launched fullscreen, all toolbars were hidden and shortcuts were disabled.
Here is a tutorial on how to create a custom browser with C# and VB.NET.

This will solve your problem. http://www.neuber.com/usermonitor/index.html
This will allow you to view the student's browser history during and after the test as well as look in on their screen during the test. Any urls visited during test time will be logged, so you can show them the log when you put a big F on their report card. :)

No one can stop people from cheating, but everyone can receive different questions altogether.
I suggest buying one of the scripts available on the market as a starting point. This will save you time, cost and testing effort.
Below is one of the fine scripts that I worked with, and it worked like a charm. Using it as a base, I developed an online testing portal for over 1000 users, using computer adaptive testing.
http://codecanyon.net/item/online-skills-assessment/9379895
It is a good starting point for people looking to develop Online Exam System.
I customized the script with the help of their support.
