Is web scraping allowed? [closed] - web-scraping

Closed 4 years ago.
I'm working on a project that requires certain statistics from another website, and I've created an HTML scraper that gets this data every 15 minutes, automatically. However, I stopped the bot now, as in their terms of use, they mention they do not allow it.
I really want to respect this, and especially if there's a law prohibiting me from taking this data, but I've been contacting them through email several times without a single answer, so now I've come to the conclusion that I'll simply grab the data, if it is legal.
On certain forums I've read that it IS legal, but I would much rather get a more "precise" answer here on StackOverflow.
And let's say that this is in fact not illegal, would they have any software to spot my bot making several connections every 15 minutes?
Also, the data in question is just a single number for each "team", which I will convert into our own number.

I'll quote Pablo Hoffman's (Scrapinghub co-founder) answer to "What is the legality of web scraping?", which I found on another site:
First things first: I am not a lawyer and these comments are solely
based on my experience working at Scrapinghub, please seek legal
assistance accordingly.
Here are a few things to consider when scraping public data from websites (note that the following addresses only US law):
As long as they don't crawl at a disruptive rate, scrapers do not breach any contract (in the form of terms of use) or commit a crime
(as defined in the Computer Fraud and Abuse Act).
A website's user agreement is not enforceable as a browsewrap agreement because companies do not provide sufficient notice of the
terms to site visitors.
Scrapers access website data as a visitor,
and by following paths similar to a search engine. This can be done
without registering as a user (and explicitly accepting any terms).
In Nguyen v. Barnes & Noble, Inc. the courts ruled that simply placing a
link to a terms of use at the bottom of webpage is not sufficient to
"give rise to constructive notice." In other words, there is nothing
on a public page that would imply that merely accessing the
information is subject to any contractual terms. Scrapers gives
neither explicit nor implicit assent to any agreement, therefore
breaches no contract.
Social networks, for example, assign the value of becoming a user (based on call-to-action on public page), as the ability to: i) Gain access to full profiles, ii) Identify common friends/connections, iii) Get introduced to others, and iv) Contact members directly. As long as scrapers makes no attempt to perform any of these actions they do not gain "unauthorized access" to their services and thus does not violate CFAA
A thorough evaluation of the legal issues involved can be seen here: http://www.bna.com/legal-issues-raised-by-the-use-of-web-crawling-and-scraping-tools-for-analytics-purposes

There should be a robots.txt file in the root folder of that site.
It specifies which paths scrapers are forbidden to crawl and which are allowed (sometimes with an acceptable crawl delay specified).
If that file doesn't exist, anything is allowed, and you bear no responsibility for the website owner's failure to provide that information.
Also, here you can find some explanation of the robots exclusion standard.
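As a rough illustration of the robots.txt check described above, Python's standard library includes urllib.robotparser. The sketch below parses a made-up robots.txt and asks whether a path may be fetched and what crawl delay is requested. The rules, the example.com URLs and the "my-stats-bot" user agent are all assumptions for the example, and the check says nothing about the legal questions discussed earlier.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one is served from the site's root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""

def build_parser(robots_text: str) -> RobotFileParser:
    """Build a parser from robots.txt text that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

USER_AGENT = "my-stats-bot"  # assumed bot name, for illustration only
parser = build_parser(ROBOTS_TXT)

# Ask whether this user agent may fetch each (made-up) URL.
allowed_stats = parser.can_fetch(USER_AGENT, "https://example.com/stats/teams")
allowed_private = parser.can_fetch(USER_AGENT, "https://example.com/private/admin")

# Seconds the site asks crawlers to wait between requests (None if unspecified).
delay = parser.crawl_delay(USER_AGENT)

print(allowed_stats, allowed_private, delay)
```

For a live site you would typically call set_url() with the robots.txt address and then read() instead of parse(), and sleep for at least the reported delay between requests.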

Related

BDD Story style [closed]

Closed 5 years ago.
We are using Behaviour Driven Development to develop a SOA system using Scrum and have come across two approaches to producing the stories.
Approach 1
Given Specific Message Type is available
And Specific State exists
When the Message is processed
Then expected resulting state exists
Approach 2
Given a Specific state exists
When Specific Message Type is processed
Then expected resulting state exists
Few if any of the examples available are applied to testing SOA systems. I would appreciate any experiences of these or any insights on the consequences of each approach.
We are aiming for declarative rather than imperative stories. The message arrival in the first has a slightly imperative feel but I'm not confident the second approach covers acceptance criteria adequately, because it doesn't seem to account for the event driven nature of the SUT.
The aim of the story is to communicate with your customer, so whatever style promotes that goal is best, and that will vary from one team to another. I might prefer 'when some business event occurs' rather than your suggestions, but I don't know your team! Beware of trying to find a 'one-size-fits-all' template; use whatever communicates best for each situation. And the heart of agile is the ability to adapt: try one style and feel free to adapt if it doesn't seem to be working.
Whenever I'm producing a library or service, I often find it useful to provide an example of the kind of scenario which a service user might want. So for instance:
Given the server has information about risk limits for Lieumoney Brothers
And we are $2m from those limits
When we process EOD orders that take us to $1m from those limits
Then our status with Lieumoney Brothers should be Amber.
The actual contents of the message can then reflect the interaction with those particular sums and that particular counterparty. You can use this for lots of different domains, and your approach will depend on the domain and whether the availability of a message is unusual, for that domain. In the above example where you're trading millions then having risk information for a particular counterparty might be valuable, and if that's the state, it's worth calling out separately. It's probably less important if you're buying baby rabbits, for example.
Given "Rotweiller Pets Limited" is trading baby rabbits $2 cheaper than anyone else
When we ask the system to order 15 baby rabbits
Then it should place an order with "Rotweiller Pets Limited".
It's hard to discuss this without specific examples. However, you can probably see how providing those scenarios would then act as documentation for how to use your APIs, even if the underlying automation for those scenarios talks directly to the API, and has nothing actually specific for pets or Lieumoney trades.
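To make the idea of automation talking directly to the API a bit more concrete, here is a minimal sketch of the risk-limit scenario in plain Python. Everything in it is assumed for illustration: RiskService is a hypothetical stand-in for the service under test, and the rule that the status turns Amber at $1m of headroom or less is invented to fit the example.

```python
from dataclasses import dataclass, field

AMBER_HEADROOM = 1_000_000  # assumed rule: Amber at or below $1m of headroom

@dataclass
class RiskService:
    """Hypothetical service under test; not a real API."""
    limits: dict = field(default_factory=dict)
    exposure: dict = field(default_factory=dict)

    def set_limit(self, counterparty, limit, current_exposure):
        self.limits[counterparty] = limit
        self.exposure[counterparty] = current_exposure

    def process_orders(self, counterparty, order_values):
        self.exposure[counterparty] += sum(order_values)

    def status(self, counterparty):
        headroom = self.limits[counterparty] - self.exposure[counterparty]
        if headroom <= 0:
            return "Red"
        if headroom <= AMBER_HEADROOM:
            return "Amber"
        return "Green"

def scenario_eod_orders_turn_status_amber():
    # Given the server has information about risk limits for Lieumoney Brothers
    # And we are $2m from those limits
    service = RiskService()
    service.set_limit("Lieumoney Brothers", limit=10_000_000,
                      current_exposure=8_000_000)
    # When we process EOD orders that leave us $1m from those limits
    service.process_orders("Lieumoney Brothers", [600_000, 400_000])
    # Then our status with Lieumoney Brothers should be Amber
    return service.status("Lieumoney Brothers")

result = scenario_eod_orders_turn_status_amber()
print(result)
```

The Given/When/Then comments mirror the story text, so the same scenario reads as documentation even though the code only exercises the API.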

Should design tasks be user stories? [closed]

Closed 5 years ago.
I'm trying to figure out when using user stories is appropriate. Always or not?
For an example, think about a team starting to work on something from scratch, say a movie ticket reservation service. It's easy to come up with user stories for the functionality, like:
"As an end-user I want to be able to browse the movies running in theater X" and so on.
But before those can be implemented, the system needs to be designed: Architecture must be designed, database must be designed, technologies chosen for the GUI and business logic.
How should these tasks appear in the backlog? Should they be user stories as well? If so, how do they comply with the INVEST mnemonic? They don't alone deliver anything for the end-user, but nevertheless they are needed before any feature can be implemented.
But before those can be implemented, the system needs to be designed: Architecture must be designed, database must be designed, technologies chosen for the GUI and business logic.
I don't really agree with this. Since a story is a feature that cuts through almost every layer of your architecture, implementing the story builds up the architecture at the same time. Look up Alistair Cockburn's definition of a Walking Skeleton.
About the question
Most stories can be written as "As a user..."; as a feature, such a story may involve UI work as well. To make that clear, you can split the story into subtasks.
Some work is hard to present as INVEST user stories, though: bugs, technical debt and so on. These can still be presented as stories of a special type (bugs, tech stories). You can't show them in the demo, but you can mention them.
(...) before those can be implemented, the system needs to be designed: Architecture must be designed, database must be designed, technologies chosen for the GUI and business logic. (...)
Not exactly. E.g., you don't need to get the entire database designed for implementing functionalities for a sprint, a specific release or whatever given time. What you may need is some common ground.
This is one of the beauties of Agile (vs. waterfall): welcoming change.
Now, answering your question: realize that the role in a user story is not necessarily that of the end customer. It could be your developers, your sysadmins, etc. As such:
AS A server administrator,
I WANT to upgrade our webserver
SO THAT it handles memory consumption better
So, you could ask your P.O. to add or prioritize in the backlog a user story (or several) for building up some ground for future development. But, again, when creating such stories remember the Agile value of Responding to change.
P.S.
It's also important to keep the Product Backlog clear and accessible, and to ensure proper interaction between the P.O. and the Development Team. This should be guided by the Scrum Master.
This way the team can give the P.O. better feedback and warnings, from a technical perspective, about how stories affect each other and why story X should be done before story Y.

Scrum stories and behind the scenes features [closed]

Closed 5 years ago.
As I understand things, the Scrum backlog is composed of a series of Stories that represent something for the end user, and these are further decomposed into Features.
If this is the case, where do all the behind-the-scenes features go that aren't really linked to a story but are still useful?
For example, say I'm making an application that catalogs the contents of a hard drive. A story wouldn't require it but having an md5 hash on each file would be a nice feature for flagging duplicates.
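To make the duplicate-flagging idea concrete, here is a minimal Python sketch that hashes every file under a directory with MD5 and groups paths that share a digest. The function names and the throwaway demo files are assumptions for illustration; a real catalog would also need to deal with unreadable files and very large trees.

```python
import hashlib
import os
import tempfile
from collections import defaultdict

def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    """Return {md5 hex digest: sorted paths} for digests shared by 2+ files."""
    by_hash = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_hash[md5_of_file(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}

def demo():
    """Create three small files (two identical) and report duplicate groups."""
    with tempfile.TemporaryDirectory() as root:
        samples = [("a.txt", b"same"), ("b.txt", b"same"), ("c.txt", b"other")]
        for name, content in samples:
            with open(os.path.join(root, name), "wb") as handle:
                handle.write(content)
        duplicates = find_duplicates(root)
        return [[os.path.basename(p) for p in paths]
                for paths in duplicates.values()]

duplicate_groups = demo()
print(duplicate_groups)
```

A matching MD5 only flags candidates: MD5 collisions can be constructed deliberately, so a byte-by-byte comparison is the safe way to confirm two files really are identical.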
The classic template to write good stories is: "As a <role>, I want to <action> so that <business value>" (or variations around this) and a story should indeed provide business value. Why? Well, if a story does not communicate the business value it generates, how could a (very likely non technical) Product Owner evaluate its importance and prioritize it accordingly? Writing good stories increases your chances to get them rated as important and thus implemented.
A great tool to find good business value is the 5 Whys (which is used for root cause analysis, i.e. finding the root cause of a problem). The cucumber documentation explains very well how to use it to find some "good" business value and has a very good sample, so, instead of paraphrasing it, I'm quoting the explanation below:
Business value and MMF
You should discuss the "In order to"
part of the feature and pop the why
stack max 5 times (ask why
recursively) until you end up with one
of the following business values:
Protect revenue
Increase revenue
Manage cost
Increase brand value
Make the product remarkable
Provide more value to your customers
If you’re about to implement a feature
that doesn’t support one of those
values, chances are you’re about to
implement a non-valuable feature.
Consider tossing it altogether or
pushing it down in your backlog. Focus
on implementing the MMFs (Minimum
Marketable Features) that will
yield the most value.
Here is an example taken from an IRC
chat session in #cucumber:
[5:08pm] Luis_Byclosure: I'm having problems applying the "5 Why" rule, to the feature
"login" (imagine an application like youtube)
[5:08pm] Luis_Byclosure: how do you explain the business value of the feature "login"?
[5:09pm] Luis_Byclosure: In order to be recognized among other people, I want to login
in the application (?)
[5:09pm] Luis_Byclosure: why do I want to be recognized among other people?
[5:11pm] aslakhellesoy: Why do people have to log in?
[5:12pm] Luis_Byclosure: I dunno... why?
[5:12pm] aslakhellesoy: I'm asking you
[5:13pm] aslakhellesoy: Why have you decided login is needed?
[5:13pm] Luis_Byclosure: identify users
[5:14pm] aslakhellesoy: Why do you have to identify users?
[5:14pm] Luis_Byclosure: maybe because people like to know who is
publishing what
[5:15pm] aslakhellesoy: Why would anyone want to know who's publishing what?
[5:17pm] Luis_Byclosure: because if people feel that that content belongs
to someone, then the content is trustworthy
[5:17pm] aslakhellesoy: Why does content have to appear trustworthy?
[5:20pm] Luis_Byclosure: Trustworthy makes people interested in the content and
consequently in the website
[5:20pm] Luis_Byclosure: Why do I want to get people interested in the website?
[5:20pm] aslakhellesoy: :-)
[5:21pm] aslakhellesoy: Are you selling something there? Or is it just for fun?
[5:21pm] Luis_Byclosure: Because more traffic means more money in ads
[5:21pm] aslakhellesoy: There you go!
[5:22pm] Luis_Byclosure: Why do I want to get more money in ads? Because I want to increase
de revenues.
[5:22pm] Luis_Byclosure: And this is the end, right?
[5:23pm] aslakhellesoy: In order to drive more people to the website and earn more admoney,
authors should have to login,
so that the content can be displayed with the author and appear
more trustworthy.
[5:23pm] aslakhellesoy: Does that make any sense?
[5:25pm] Luis_Byclosure: Yes, I think so
[5:26pm] aslakhellesoy: It's easier when you have someone clueless (like me) to ask the
stupid why questions
[5:26pm] aslakhellesoy: Now I know why you want login
[5:26pm] Luis_Byclosure: but it is difficult to find the reason for everything
[5:26pm] aslakhellesoy: And if I was the customer I am in better shape to prioritise this
feature among others
[5:29pm] Luis_Byclosure: true!
So, let me start: why would it be nice to have an MD5 hash on each file (which, expressed as you did, is an implementation detail and doesn't communicate any business value)?
There is no "scrum" backlog, only
the Product Backlog, owned by the product owner, which carries Business Values
and
the Sprint Backlog, owned by the Scrum Master/developers, which lists tasks traced back to a story.
I am updating to clarify the distinction between a Vision Document and a Product Backlog with regard to Business Value:
A Business Vision Document (strategic level) is all about Business Value, as is the Product Backlog. But the Product Backlog is equivalent to the Functional Specifications of more traditional methodologies; that is, it is something CONCRETE or OPERATIONALLY directly implementable by the team, not just a VISION from a high-level managing director.
Of course, the Product Backlog itself should be traceable to Vision Document items.
At the end of the day, agile is about doing what works for you to be productive. These kinds of questions are for you to decide based on what works.
It may just be an implementation detail of another story, or it may be a story unto itself.
Whatever makes your group most productive is what it should be.
I would label them with something like:
"Non user-stories" or "NUS"
"Programmers Only" or "PO"
"Behind the scenes" or "BTS"
Followed by a short description of the feature.
So:
BTS: catalog filesystem
PO: find file type with magic bytes
Strange! I'm making the same application! :-)
Update:
So, I read the wiki, and I think we need an extra backlog (the Sprint backlog).
Wiki says:
Sprint backlog
The sprint backlog is a document containing information about how the team is going to implement the features for the upcoming sprint. Features are broken down into tasks; as a best practice, tasks are normally estimated between four and sixteen hours of work. With this level of detail the whole team understands exactly what to do, and anyone can potentially pick a task from the list.

Scrum and requirements [closed]

Closed 5 years ago.
You can't just have user stories; somehow the functionality of the program has to be documented. Do you end up with a specifications document with Scrum? If you do, do you end up assigning time for this to the task?
An example would be a complex workflow.
Another example would be a new member who comes onto the team.
There will be plenty of good ideas added to this question. My personal experience has taught me that:
1~ The working product is a form of documentation itself: assuming the product is accepted, then asking what it should do under certain conditions is equivalent to asking what it actually does under those conditions. Log in and try it to get your answer.
2~ The tests, be they manual or automated (or a mix), are a form of documentation. Certainly unit tests may be way too far from the domain language spoken by the less technically inclined team members (e.g. 'business experts' or customers). Acceptance tests may be closer to a 'middle ground' of sorts. BDD-style tests definitely seem to have the best chance of building a ubiquitous language everyone can understand (see in this regard Gojko's Bridging the Communication Gap). Nonetheless, all of these forms of tests are documentation which can be used to determine what the product should do.
3~ Depending on where your project falls on the spectrum, your documentation (and, in general, all your ancillary artifacts) may require a higher or lower degree of ceremony. Smaller products, smaller teams, where time to market is critical may find that a very formal documentation of requirement costs way too much compared to the value it adds. Extremely large projects, spanning multiple teams and years of development, on the other hand, will find the ROI of formal documentation quite different.
4~ In the perfect world, we probably wouldn't need to document requirements other than in the form of working code (which, in the Ivory Tower, would also be self-explanatory) and tests (mostly for regression testing, and, on the fringe, to drive development of new features). Thus, the question of requirements documentation is a question about what's different between the Perfect World/Ivory Tower and the Real World/Trenches. The answer, of course, differs on a case-by-case basis, depending on the project and the team. For instance, we could say "All requirements shall be kept in this wiki, and maintained with the utmost care, etc etc..." but if the team is not familiar and comfortable with wikis this would not work.
5~ In the end, the stakeholders are those you should ask. Certainly, the topic should be approached in a collaborative manner, because everyone on the team will have to interact with the requirements throughout the project, but you must find a form of documentation that satisfies the stakeholders' needs.
All that being said, here are some places I've seen requirements documented while applying Scrum (why do I feel like this word should always be followed by an asterisk?):
PDF document
Bulletin Boards
Wiki
Wiki + Automated Acceptance Tests (read: FitNesse)
Unit tests
Manual Test Plans
User Stories, Use Cases diagrams (read: Enterprise Architect models)
Whiteboards around the office
Emails
Post-it notes
And, to be honest, I cannot say that any one system has a consistently higher correlation with a successful project than the others. I guess, indeed, we don't have a silver bullet.
HTH, thanks for the thought-provoking question!
Adding "documentation" as a task on each user story could certainly go a long way towards your goal.
Scrum says you should document what you need, when you need it; it doesn't say you shouldn't have documents.
So if a document is required either by the finished product (eg. help documentation) or to produce the finished product (eg. requirements documentation) then there should be a documentation task/user story in your product backlog.
Appropriate priority should then be placed for that task.
For documentation the key point is:
Document only what you need, only when you need it.
You can't just have user stories;
somehow the functionality of the
program has to be documented. Do you
end up with a specifications document
with Scrum? If you do, do you end up
assigning time for this to the
task?
Why can't you just have user stories? What purpose do these specification documents serve? What value does the investment in producing these documents return? Does the benefit outweigh the cost? If not, then isn't the time spent creating, and more importantly maintaining, these documents waste?
I know I'm asking more questions than providing answers, but I think part of what Scrum and other Agile approaches like lean do is force you to re-examine your current practices and see if they still make sense.
In the case of specifications, who will be updating and maintaining these documents once the feature has launched? In most companies I've been at, the documentation has been sparse, out of date, or rarely referenced.
Instead, why not use executable tests or BDD so that the documentation becomes part of the code and is executable? For example, see Ben Mabey's talk on Cucumber.
If, for some reason, a specific type of document is required for legal compliance purposes, you can always add it to the team's definition of "done". However, I've found that in most cases stories and tests are more than sufficient forms of documentation.
Maybe my understanding of the question is completely wrong, but what I understood was that the OP was uncomfortable with the mismatch between user stories and requirements. With reason, I'd say.
In my opinion, user stories tell how a chunk of functionality shall be demonstrated to the product owner. The language of the story can be something that can be understood by the product owner but mainly by the developers. You might have stories that describe things that are not even directly required by the owner, but are breakdowns of things that are.
Requirements, on the other hand, are a detailed specification, in the domain user's language, of what the system needs to do in order to be valid. In many cases a requirements document is not optional (fixed-price projects, for example).
What I do is a mix of both. I have a requirements document, and in most of my Scrum stories I have a note that links the story to one or more items in the requirements. It is as simple as "See FR-042 and FR-45" (FR for functional requirement, for example).
I think you are asking for a few different things here. If you are adding a new team member, then the documentation for the system should be geared toward their role on the team as part of the on-boarding process.
If you are talking about documenting the system functionality; in our organization our training teams document the functionality as part of the release. They are engaged (as a stakeholder) during the Sprint Review (demo) and then provided a training environment with the new functionality to prepare the training materials prior to release.
If you are talking about providing documentation for traceability, your backlog can serve as that with the proper process & controls added.
Each one of these different items takes planning and deliberate process development to effectively function and meet the needs of the team. We have included each one of these items in our retrospectives as an issue was identified and then developed our processes over time.
In addition to what James Kolpack said, the user story map should persist after the project is finished, as it too is a form of documentation. I believe we plan to somehow convert it to a document that lives in our wiki when all is said and done.
The idea is that this document will be useful for people who need to maintain the system or add enhancements to it in the future because they will have an understanding of the user's perspective.
I mostly agree with Todd, but there were times when part of my team's task was to produce documentation: the documentation was itself the user story our PO asked to be delivered.
In these cases we followed these guidelines:
try as much as we can to extract documentation from actual working code (typically a document generation program that reads the internal data structures or configuration files used both for building the actual program and for building the documents).
define in the documentation user story the goal of the documentation:
who the reader is supposed to be
what they should be able to achieve by reading that document.
In my experience that makes documents easier to write and enables some kind of test (you ask someone, typically the PO, to read the doc and say whether it's OK considering the goal).
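As a small illustration of extracting documentation from the same data the program uses, here is a hedged Python sketch. The SETTINGS structure, its field names and the output layout are all hypothetical; the point is only that the document is rendered from the data the program itself would read, with the intended reader and goal stated up front.

```python
# Hypothetical settings structure shared by the program and its documentation.
SETTINGS = [
    {"name": "timeout_seconds", "default": 30,
     "help": "How long to wait for a reply."},
    {"name": "retries", "default": 3,
     "help": "How many times to retry a failed call."},
]

def render_settings_doc(settings, reader, goal):
    """Render a plain-text reference from the settings the program also uses."""
    lines = [f"Reader: {reader}", f"Goal: {goal}", "", "Settings:"]
    for item in settings:
        lines.append(
            f"- {item['name']} (default: {item['default']}): {item['help']}"
        )
    return "\n".join(lines)

doc = render_settings_doc(
    SETTINGS,
    reader="system administrator",
    goal="configure the service",
)
print(doc)
```

Because the document is generated, a changed default shows up in the docs on the next build instead of relying on someone to remember to update it.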
You write documentation to validate your system. User stories serve the same purpose if they are written in a format that reflects user interaction with the system. I recommend using BDD and writing stories in Gherkin syntax. Eventually your scenarios become your acceptance criteria, which help validate whether the system is working correctly.
We have a docs team that produce the "instruction manual" for our product. The manual is structured around the main features of the product, and the tasks that the user can perform in those features.
Each sprint, Scrum teams work on user stories that add functionality to the product features.
After sprint planning, the docs team meet with the Scrum team and see which user stories will be developed this sprint. The docs team then start enhancing the instruction manual by writing the initial docs. During the sprint, the docs team follow the progress of the user stories, and can use the product as it's deployed to test environments. At the end of the sprint, the docs team finalise the updated instruction manual and add final screenshots etc.
The instruction manual ships as part of the release each sprint.

How to prove to our users that they are not being cheated? [closed]

Closed 10 years ago.
I have an information theory question about how to prove (or at least give statistical evidence) that an auction website is not shilling its users.
We recently launched a pay-per-bid auction website. It is a new type of auction where the users pay to bid on timed auctions. Each bid raises the price and increases the time of the auction. The last bidder when the time runs out gets to buy the item.
The problem is that users are suspicious that we may be cheating them. I have no such intentions as the trust of my users is of paramount importance to me. However, the model could be implemented by other unscrupulous sites and it would be straightforward to cheat bidders. I need to put measures in place that will show our users that we are legitimate.
I am committed to running an honest operation. The challenge is how to prove this to the world? Any approach will need to be balanced with preserving the privacy of users.
Some ideas I have are:
show IP address of each user
solicit testimonials from winners who have received their merchandise. Have them mail in photos of them with their merchandise and a recent cover copy of their local paper.
show some broad information about each user, such as home state and country
I am looking for any suggestions.
Update
Some great suggestions. So far:
Provide behavioral information about each user:
when they joined
which auctions they took part in
stats for auction - bids placed, cost
do not publish personally identifiable information. No IP address, since people who did not win could exact retribution on the winner.
public forum for discussion and address questions
solicit testimonials from users to show that people do win and do receive products.
how can we show in the testimonial that it is not "invented" by us? I am thinking of perhaps asking winners to include a photo with a recent local newspaper. This would be hard to fake on a large scale, and would show the distribution of winners through time and locality.
Do you believe it would be OK to show the home State and Country of user, or would that be too much personal information?
Provide as much information as possible to users, such as who won, how much was paid, how many purchases/sales a user has made, etc;
Provide a feedback mechanism on individual auctions and users;
Have a public forum for discussion on results, support issues and complaints by users;
Don't require users to use other pay services you provide to get results, such as your own snipe system;
State your policies clearly on your Website. This should include, at a minimum, a privacy policy, discussion of how the site works, an FAQ and steps you've taken to prevent any appearance of impropriety or conflict of interest (eg employees aren't allowed to participate); and
Have a complaints and dispute resolution mechanism.
This isn't a technical problem. It's a social problem. The only way users are going to feel confident in the results is with transparency and professionalism.
Isn't that something like swoopo.com?
It first and foremost must be designed well and look professional enough that people will trust it. People are remarkably good at detecting a poorly designed website and will not respond well to it.
This may be a hard market to get into since there are such well established alternatives, but the best way to gain users is by word of mouth from existing users. This takes time, but is most effective.
Don't go violating people's privacy and publishing their information just because they use your site. People won't like it and won't come back.
Provide a feedback system for users (a la ebay) where people can see other real people that are pleased with the service.
Also a public message board for comments and complaints would help comfort people as well. Good Luck!
Beware of providing too much information, though. Depending on your site, your users may not like it when too much private information is revealed to others when they bid on something. For example, if I'm a customer and I just purchased something expensive, I do not want my user name or email shown to other people who'll start spamming me to buy a cheaper version of what I just paid for. Others may take offense at being outbid and grief the person who outbid them, for example by running a DoS attack on their IP.
Yes you should protect your own site's reputation, but if you do not take actions to protect your users, you may end up losing some of them.
I think the best way to improve your reputation is through usage (may be hard), or through some reputable review sites.
Giving out IP addresses of users might be risky, and ultimately it's something that a fake site may fake as well.
I guess one way of gaining trust is to use a trustworthy authority to approve you. IOW, delegation :) let someone else solve the problem for you. e.g. Users will tend to trust you more if you're backed by someone like PayPal. That would cost you, though.
[philosophical]
The main problem is that in order to gain trust you need to provide what sociologists call "honest signals". And honest signals are usually costly. That's a problem in business because it means you have to sacrifice your earnings in order to get more customers on board, and then balance that equation. IOW, customers and shareholders have different incentives. But as a "startup" trying to gain the trust of a user base it would make sense to signal your honesty by costly gestures. You might make less money initially, but eventually, once you're big enough, that signalling would no longer be necessary.
So what kind of honest (costly) signal can you send? Well, maybe instead of soliciting testimonials from winners you should pay them a symbolic fee. Make it worthwhile for users to help you prove the site's authenticity by disclosing information about themselves or the transaction, and in turn make it up to them with discounts, rebates, whatever.
Anyway, I'm pretty sure you won't gain trust by simply handing out people's information to everyone without asking them. Let people do that for you, and compensate them, thereby signalling your intentions in a costly (honest) manner.
[/philosophical]
Have real time chat on the bidding pages, like IRC. People can only bid by typing "#bid $200" or something in the chat window. That way users can interrogate anyone they think might be a bot or whatever. They can also discuss the product for sale and warn others if it's a fake listing or whatever. You need to show people they can trust the site. People trust people.
I remember sitting through a talk on the use of cryptographic methods to prove that various facets of auctions were conducted properly. Googling "cryptography" and "auctions" together should provide some starting material if you're interested in this approach.
http://www.youtube.com/watch?v=IzVCrSrZIX8
http://www.cs.virginia.edu/crab/Auctions.ppt
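One simple building block in this direction (not necessarily what the linked material covers) is a hash-chained, append-only bid log: each entry commits to the hash of the previous one, so a published history cannot be silently edited afterwards. The sketch below is a minimal Python illustration under that assumption; the field names and bidder IDs are made up, and it does not by itself prove that bidders are real people.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, bid):
    """SHA-256 over the previous hash plus the bid, serialised deterministically."""
    payload = json.dumps({"prev": prev_hash, "bid": bid}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_bid(log, bidder, amount_cents, timestamp):
    """Append a bid whose hash commits to everything logged before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    bid = {"bidder": bidder, "amount_cents": amount_cents, "timestamp": timestamp}
    log.append({"bid": bid, "prev": prev_hash, "hash": entry_hash(prev_hash, bid)})

def verify(log):
    """Recompute the chain; any edited, dropped or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        expected = entry_hash(prev_hash, entry["bid"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_bid(log, "user-17", 101, 1000)
append_bid(log, "user-42", 102, 1015)
append_bid(log, "user-17", 103, 1021)
intact = verify(log)

# Simulate the operator rewriting history after the fact.
tampered = json.loads(json.dumps(log))  # deep copy via JSON round trip
tampered[1]["bid"]["bidder"] = "house-bot"
tamper_detected = not verify(tampered)

print(intact, tamper_detected)
```

For this to mean anything to users, the latest hash would have to be published somewhere the site cannot quietly change, so that anyone can later check the full log against it.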
Jeff Atwood talked about this on www.codinghorror.com last month.
http://www.codinghorror.com/blog/archives/001261.html
I had never heard of the concept before. He does explain it fairly well.
Cathy
You can't without lying.
The only way to win is by sheer luck. You just call your lottery tickets "bids".
Shame on you.
EDIT:
Some opinions on penny auctions
Profitable Until Deemed Illegal
Penny Auctions: They're Gambling
Open source?
This is a matter of trust and so is a social, not technical issue.
Even if you open-sourced the code and/or had an information theoretical proof, how many of your customers would understand it?
In situations like this, many companies rely on the word of a trusted third party who has inspected the company's operations. The third party stakes its reputation on its public statement that the company is doing business correctly.

Resources