Comparison between various ETL tools

I often work on projects that require loading file data into database tables, and I have almost always used ODI (Oracle Data Integrator) for this.
I want to know what other ETL tools are available, how they differ from ODI, and what restrictions each one has (for example file size limits, column size restrictions, or processing time).
I wish somebody could help.
If somebody can share personal experience on these tools, that would be welcome too. Thanks!

I'm working on the same type of projects that you're in.
Right now I'm working with IBM DataStage. It seems like a good and powerful tool, but it lacks good documentation and a strong community.
There's also Pentaho. I have no experience with it, but it seems pretty popular and it's also open source.

Related

Open Source Identity vs. Real Life Identity

I maintain two identities: one for open source development, which doesn't really contain any personal information, and, obviously, my real one.
This may be community wiki - but my question is programming related in that when you put software out there, you publish it with some name as the author, and that choice may have real life consequences.
I am considering merging my identities. What are the pros and cons of this? Is it a good idea, or do privacy concerns outweigh the convenience of maintaining a single identity?
(By the way, this second identity was created out of my World of Warcraft addon development, and I have just continued using it for my open source projects)
Edit: I am considering this, because I am thinking of changing jobs, and I want to refer to my open source work without it looking unprofessional due to the author naming.

Well, as a part-time open-source hacker, I've recently discovered that ohloh can help you "professionalize" your identity by allowing you to claim all the commits you've made in projects known to this engine (and they're numerous).
As a consequence, instead of merging your identities, I would suggest you give them some weight by marketing the contributions you've made.
Besides, I've never accepted the idea that committing to a game plugin as an open-source activity is somehow unprofessional. It is code, and code used by non-developers, which is worth noting.

In many professions, publishing work under a pseudonym is a long tradition that continues today: artists, writers, etc.
Is it really unprofessional for a software developer to do the same?

If you are good, why not get a little famous? Who knows, the person hiring you may use or participate in an open source project, and you'll be valued more from the start.

Which DVCS is most conducive to experimenting?

I was wondering which DVCS is most conducive to experimentation i.e. branching, etc. I want something where anyone can quickly launch smaller projects and refactor code quickly. I want to create an environment where experimenting is cheap and can be discarded/merged easily.

Git is known for very cheap branching. Branching was made trivial so that, like you said, you can create a branch for any little thing. I don't have experience with the other DVCSes, but I imagine they're pretty similar given their similar nature. I just know that cheap branching is one of the reasons Git was created, or something like that. Sorry if I misunderstood your question.
Here's a section of a popular article/site giving details about git over other version control systems.
In response to your comment: on Windows, I imagine? I've been fine using msysgit; get msysGit-fullinstall-1.6.4-preview20090729. For a detailed walkthrough with screenshots that helped out some friends, I recommend the Git for Windows Developers series.

You could also try Mercurial. It's fast, it's distributed, and it's easier to use. If you like working with a GUI, try TortoiseHg.
Here is an analysis done by Google before they integrated Mercurial into Google Code.

Your requirements match Darcs or Git.

If you're a GUI user, why don't you take a look at Plastic SCM? http://codicesoftware.blogspot.com/2010/03/distributed-development-for-windows.html. It's one of the few commercial DVCSs out there and it's focused on ease of use but it has all the features you're looking for:
Excellent branching and merging support (full merge tracking, rename support and all that)
Distributed (and easy to use)
Subtractive merge support (you can do it from the GUI)
Besides:
Very good visualization
Excellent Windows GUI (check it)
Excellent VStudio integration

Fitnesse vs any other subsystem testing tool [closed]

We are currently using FitNesse for subsystem testing.
We are having a lot of issues with the tool. To mention a few:
Development time for writing a fixture is more than for writing the actual code
Issues around checking in the DLLs so that QA can test them
Issues in running FitNesse for a project that uses NHibernate
Limited help online
We are planning to use some other tool to do the testing.
The few options we know of are:
SOAP UI
StoryTeller
I am not sure whether we will have similar problems with these tools.
It would be great to hear from someone who has experience using these tools and could guide us.
In our project we have adopted TDD, so we have NUnit for unit testing.
It would be great if anyone is aware of tools or ideas that could extend NUnit for subsystem testing as well.

Component testing tools are all about calling functions. Your tests cause functions to be called in "fixtures" that then call into the SUT. Any tool based on this premise will encounter the problems you reference above.
However, most of those problems are manageable. For example, you should not be writing lots of fixtures. If you are, something is wrong. Secondly, your fixtures ought to be little more than wiring code to call the APIs in your application. If your fixtures are doing significant work, then something is wrong.
In most FitNesse environments the number of fixtures is rather small. For example, there are over two hundred acceptance tests for FitNesse itself, but the number of fixtures is on the order of a dozen, and they are all relatively simple.
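To illustrate what "little more than wiring code" can look like, here is a minimal sketch, written in Python rather than .NET. Calculator, AddFixture, and run_table are hypothetical names made up for this example; they are not part of FitNesse or any fixture library. The fixture only converts table cells to arguments, calls the application API, and hands back the result.

```python
class Calculator:
    """Hypothetical stand-in for the system under test (SUT)."""

    def add(self, a, b):
        return a + b


class AddFixture:
    """Thin fixture: parse cells, call the SUT, format the result. No logic of its own."""

    def __init__(self, sut):
        self.sut = sut

    def run_row(self, row):
        # Table cells arrive as text, so convert them before calling the API.
        return str(self.sut.add(int(row["a"]), int(row["b"])))


def run_table(fixture, rows):
    """Compare each row's expected cell with what the fixture returned."""
    return [(row, fixture.run_row(row) == row["expected"]) for row in rows]


if __name__ == "__main__":
    table = [
        {"a": "2", "b": "3", "expected": "5"},
        {"a": "1", "b": "1", "expected": "3"},  # deliberately wrong expectation
    ]
    for row, passed in run_table(AddFixture(Calculator()), table):
        print(row, "pass" if passed else "fail")
```

If a fixture grows much beyond this shape, that is usually a sign that logic belonging in the application has leaked into the test layer.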
Get help on the fitnesse@yahoogroups.com list. The folks there are usually very responsive to questions.

If you can communicate with your software using text, then I have had success on past projects rolling my own framework using expect.
The framework I cooked up stored tests as XML files, using a simple xUnit style markup. The xml files were then transformed into executable tests using a stylesheet. I ended up transforming the tests into Tcl/Expect, but you could transform them into anything. In fact, if you wanted, you could transform them into multiple languages, depending on your needs.
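The general idea of turning XML test descriptions into Expect scripts can be sketched as follows. This is not the original framework: it uses Python instead of an XSLT stylesheet, and the element names (testsuite, test, send, expect), the ./app command, and the function names are all assumptions invented for the illustration. The generated Tcl has not been run against a real program here.

```python
import xml.etree.ElementTree as ET

# Hypothetical xUnit-style markup; the real framework's schema is not shown in the post.
SAMPLE = """\
<testsuite name="login">
  <test name="valid_user">
    <send>alice</send>
    <expect>Password:</expect>
    <send>secret</send>
    <expect>Welcome</expect>
  </test>
</testsuite>
"""


def tcl_escape(text):
    """Escape characters that Tcl would otherwise substitute inside double quotes."""
    for ch in ("\\", '"', "$", "[", "]"):  # backslash first to avoid double escaping
        text = text.replace(ch, "\\" + ch)
    return text


def to_expect(xml_text, command="./app"):
    """Render a test suite as the text of a Tcl/Expect script."""
    suite = ET.fromstring(xml_text)
    lines = ["#!/usr/bin/expect -f", f"# suite: {suite.get('name')}"]
    for test in suite.findall("test"):
        name = test.get("name")
        lines.append(f"# test: {name}")
        lines.append(f"spawn {command}")
        for step in test:
            text = tcl_escape(step.text or "")
            if step.tag == "send":
                lines.append(f'send -- "{text}\\r"')
            elif step.tag == "expect":
                lines.append("expect {")
                lines.append(f'    -exact "{text}" {{}}')
                lines.append(f'    timeout {{ puts "FAIL: {name}"; exit 1 }}')
                lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_expect(SAMPLE))
```

Because the tests live in XML and the target language only appears in the generator, swapping in a different output language means writing another generator, not rewriting the tests.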
Several people have kindly reminded me (in the same way you remind your poor doddering grandfather about the drool on his chin) that we are in the 21st century when they inquire why I would choose Tcl over some more modern language. As it turns out, for the purposes of this kind of testing, I haven't yet found a better choice. The Tcl language still kicks butt in this area. Trust me, I didn't wake up one day and say to myself "self, what I need is a test framework implemented in a scripting language everyone will hate!"
Believe it or not, I really was looking for a tool, any tool, that had the following characteristics:
Cross platform. This was non-negotiable. We do a lot of cross platform development and we already use WAY too many tools that don't support cross platform development.
Simple syntax. Say what you want about Tcl, but the syntax is very regular. I knew that some native code would probably creep even into the XML files (and originally it was Tcl only, no XML) and I wanted the syntax to be comprehensible to a non-programmer. This simplicity is a core strength of Tcl. As it turns out, it also made transforming the XML easier too.
Free. My favorite price ;-)
Writing tests as simple xml files allowed non-programmers to write customer acceptance level tests - no programming required.
Easily extended.
I did not set out to home grow this to the extent I have. Initially, I looked at established test frameworks like DejaGnu and android. Mostly they had way too many features. They were so feature laden that I didn't think they would be easy for a project to start using without a lot of up front training. Looking at DejaGnu got me interested in Tcl in general, and after a brief look at tcltest, I almost gave up. Both DejaGnu and tcltest assume you are an advanced Tcl scripter, which I didn't think anyone at my company ever would be. In addition, I wanted the test framework (if possible) to support an xUnit type of test framework and neither of these tools did.
Eventually I found TclTkUnit, a Tcl based testing framework that is designed along xUnit lines. It was only a short leap of logic to realize I could run TclTkUnit in Expect instead of tclsh and get everything I needed.
As it ended up getting used more, I added another stylesheet to render the xml files nicely in a web browser. The test framework generated its own documentation.
On another project we needed a very basic sim / stim environment to emulate a person throwing switches and pushing buttons on a piece of hardware we didn't have. It only took a few hours to hack the test framework to function as a simulator. Creating the framework took some work, but we felt that it did pay benefits in the long run. I really believe that these types of unforeseen consequences of creating your own tools are why people in the agile community, and XP in particular, have always been such strong advocates.

We have adopted a FitNesse-based but practically code-free approach using GenericFixture (google for Anubhava to find his wordpress site) for FitNesse.
What this allows us to do is to create "executable test narratives" using a language that is friendly to the business side (as opposed to the technical side). This language, which is very easily defined in GenericFixture, practically without coding, is called a DSL (domain specific language). So we can write our test narratives using e.g. medical terms or even in a language other than English. Basically, we transform our use cases into executable narratives.
We are starting to use it in a large project (15 people for 2 years) and it seems (so far) to have a good future.
It easily allows Test Driven Development or test creation after development (the traditional approach).
It is wiki-based (FitNesse) and its versioning and refactoring functionality has so far proven sufficient.
I can give more info if anyone is interested.
best regards,
Aristotelis.

We use unit-testing frameworks like NUnit to drive our subsystem tests as well; the tests don't care how they are run. NUnit doesn't have FitNesse's document-based approach, though.

Simulating a TWAIN Device

Our company is using some software that ONLY accepts input from an "Imaging Device" i.e. a TWAIN device (e.g. scanner).
The problem is that we are receiving our files digitally, so using an actual scanner would require us to print, scan, and shred documents that we already have on the computer, but not in the software.
I was curious if anybody has any idea of how we might be able to work around this problem in the meantime. My first thought was to find some way to trick the program into thinking we're using a scanner, via some new 'imaging device' that would just read in the file, and spit it out to the software, but I don't even know where to begin with that.
We put in a feature request, seeing as how this problem should obviously be addressed in the software itself, but the company is notorious for lagging pretty hard when it comes to updates.

The system used by scanners is called TWAIN, so you'd be looking for some sort of virtual TWAIN driver.
A quick Google search produces several hits. I don't have any experience with the software myself, so I can't advise any further.
Two such providers I found via Experts Exchange:
http://www.twaintools.de
http://www.scanpoint-usa.com

OK, months late... but in case you are interested, I have a TWAIN driver framework/toolkit that might let you build this fairly easily, depending on just what your scanning app expects, and how hard it is to read images from your digital documents. It's a Microsoft Visual C++ project. No charge but you'd need our permission to redistribute a driver based on it: GenDS
The TWAIN Working Group also has a sample/skeleton driver. I think it's straight C, and it used to have some rather bad bugs (which is why I wrote mine ;-), but it might have got better.
Look for the "sample data source and application" on their download page.
And of course I have a 'commercial' version of GenDS that I use to write TWAIN drivers on contract.

a simple .net website source control system?

I work in Visual Studio, mostly on sites I maintain by myself. Occasionally I start on new features for a site and, bam, a bug pops up on the live site. Now I am in the middle of changes and can't post a fix for the bug until everything I started to change is complete.
So I am looking for a nice and simple way to work with this type of situation. Any suggestions?

Are you asking for a recommendation of a source control system? SourceGear Vault is free for single users.

I am a big fan of Subversion. There are also plugins for VS that work with a Subversion repository.
http://subversion.tigris.org/
http://ankhsvn.open.collab.net/

I am in a similar situation and I use Perforce. It is free for up to two users and integrates well with Visual Studio.

Subversion is well supported and has tools for most any environment. It's also mostly straightforward to use, so you should be able to get up and running quickly.
If you need to work on a lot of separate features and bugs at the same time, you might try Mercurial instead. The tooling support is a lot less mature but I find the distributed design to do a better job of merging and facilitating work on separate issues concurrently.
But really, if you aren't using anything currently and aren't sure what your needs are, just choose one that has support in the IDE/tools you use. It will probably be Subversion.
