Elastic Map Reduce External Jars - jar

Handling external jars is easy enough when using Hadoop directly: the -libjars option does it for you. The question is how to do the same with EMR. There must be an easy way. I thought the -cachefile option of the CLI would do it, but I couldn't get it working. Any ideas?
Thanks for the help.

The best luck I have had with external jar dependencies is to copy them (via bootstrap action) to /home/hadoop/lib throughout the cluster. That path is on the classpath of every host. This technique is the only one that seems to work regardless of where the code lives that accesses external jars (tool, job, or task).

One option is to have the first step in your jobflow set up the JARs wherever they need to be. Or, if they are dependencies, you can package them in with your application JAR (which is probably in S3).

FYI, for newer versions of EMR, /home/hadoop/lib is no longer used; /usr/lib/hadoop-mapreduce should be used instead.
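As a rough illustration of the bootstrap-action approach, here is a minimal Python sketch of a script that copies jars onto each node. Everything in it is an assumption rather than a documented EMR recipe: the helper name build_copy_commands is made up, the bucket and jar names are placeholders, and it presumes the hadoop command is available on the node when the bootstrap action runs. The target directory is the one named in the answers above.

```python
import subprocess

# Assumption: the classpath directory named in the answers above. Newer EMR
# releases reportedly use /usr/lib/hadoop-mapreduce instead.
LIB_DIR = "/home/hadoop/lib"


def build_copy_commands(jar_uris, lib_dir=LIB_DIR):
    """Return one `hadoop fs -copyToLocal` argument list per jar URI."""
    return [["hadoop", "fs", "-copyToLocal", uri, lib_dir + "/"] for uri in jar_uris]


def copy_jars(jar_uris, lib_dir=LIB_DIR):
    """Run the copy commands; raises CalledProcessError if any copy fails."""
    for argv in build_copy_commands(jar_uris, lib_dir):
        subprocess.run(argv, check=True)
```

Calling copy_jars(["s3://example-bucket/libs/a.jar"]) from the bootstrap script would then place a.jar in the lib directory on every node the action runs on.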

Related

How can I find unused libraries in a Symfony project with PhpStorm?

I am trying to find and delete all the unused libraries in a project. For example, I have a lib/ folder containing many subfolders, each of which is a well-known library. I want to know how I can identify which libraries are not used.
I asked the same question here, but the only response I received suggested checking each file one by one ...
Can you help me?
I don't think that is possible, as some libraries may be lazy loaded depending on some internal state of your application.
So even if you could somehow find all strongly typed references by inspecting the code, you have no way of finding out whether a library is loaded via magic methods, custom class loaders, dynamically generated include or require statements, eval-ed code and so on.
Without tests covering 95%+ of your non-library code, it is very risky to remove anything from your lib folder. Your code may appear to run fine, but still fail in some edge cases.
There's an open source project that can help you to do that:
https://github.com/composer-unused/composer-unused
Installation
composer require composer-unused/composer-unused-plugin
Usage
composer unused
And if you want to use it inside PhpStorm, you can look at the JetBrains documentation on running Composer scripts: https://www.jetbrains.com/help/phpstorm/using-the-composer-dependency-manager.html#create-and-run-composer-scripts

Can MR-Jars overwrite classes from other jars?

I have a jar that works on Java 8.
I would like to create a new jar, that is going to be Multi-Release JAR but empty, just with 'patched' classes in META-INF/versions.
I would like to have a separate jar, so people can include it on Java9, otherwise, they use the default one. Why? Because so many tools are not yet prepared for Java9 MR-Jars.
Would this be possible? Would Java9 MR-Jar override classes from others jars?
Why?
The idea behind Multi-Release jars is that they provide simple patching. In my humble opinion, the way MR jars work is not satisfying.
There are two reasons why I can't make 2 separate jars:
Try to make a cross-compiled source base that works with Java8 and Java9. You would end up with folders like java, java8 and java9... and then have the build produce two jars, two poms... Yeah, good luck.
Imagine that I even build a library for Java9. What about transitive dependencies? That would mean that all other libraries that use mine would need to have a jre8 version that depends on my jre8 version. Just because there is a Java9 version!
Here is the story:
My A is a Java library built on Java8 but packaged as a Multi-Release Jar, which means it contains additional classes for when the jar is run on Java9. The additional classes are built separately on JDK9 and I copied them in manually (yeah, I know, but it works for now).
Unfortunately, some tools and servers (Jetty) are not aware of MR Jars, and this breaks them.
For that reason, I have an A-jre8 version of my library that comes without any extra classes, so servers can use it.
However, if a user is using library B that depends on my A, he will still get the MR Jar version of A and this will fail again. I want to be able to prevent this somehow. And I can't say to B: hey, could you make B-jre8?
Possible solution
JAR is just about packaging!
Allow the separate jar to patch existing jar.
In my case, I would just include A.jar9 and Java would consider A.jar and A.jar9 together as a package. No need for META-INF/versions. Very clean. And, best of all, it would help in situations like above! If run on Java8, the jar9 jar would make no difference; if run on Java9 the jar9 jar would patch the jar with the same name. Simple as that. No transitive dependency hell.
Rename classes in META-INF/versions.
Come on, Oracle, have you ever heard of classpath scanning? Could you at least rename the classes in versions to e.g. *.class9 so they are not caught by existing classpath scanners?
As it is today (Java v9.0.4) - no.
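If the immediate problem is a classpath scanner tripping over the versioned entries, one workaround on the tooling side is to skip everything under META-INF/versions when listing classes. The following is only a sketch of that filtering idea in Python; list_classes is a hypothetical helper and not part of Jetty or any tool mentioned here.

```python
import zipfile

# Directory where a Multi-Release jar keeps its version-specific classes.
VERSIONS_PREFIX = "META-INF/versions/"


def list_classes(jar_path, include_versioned=False):
    """Return the sorted .class entries of a jar, skipping versioned ones by default."""
    with zipfile.ZipFile(jar_path) as jar:
        names = [n for n in jar.namelist() if n.endswith(".class")]
    if not include_versioned:
        names = [n for n in names if not n.startswith(VERSIONS_PREFIX)]
    return sorted(names)
```

Passing include_versioned=True shows what a scanner that is unaware of MR jars would see: the same class name once at the root and once more under META-INF/versions.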

Can we use GIT for Oracle Service Bus projects and BPEL projects?

We are planning to introduce Git for our OSB and BPEL projects. Is that a sound choice? We are not sure how it will handle comparing the jars.
Yes you can. We are using git for OSB projects and it works really great.
After creating the workspace, you should add it to your git repository. That way you will be able to track all files individually. You can of course store your jars in that repository too, but because they cannot be compared meaningfully, I would treat that as a backup.
I also believe that the same approach (adding workspace to git repo) can be used for BPEL projects, but I haven't tried it.
Yes, you can.
Both OSB and BPEL projects are simply .xml files. If you open the project, you will see .xml, .wsdl, .xsd, etc. All these files can be tracked by a version control system (Git, Mercurial, SVN).
Source control is for source code and not for jars. In the case of OSB, the source code would be mostly XML files (proxies, pipelines, bix) or XQuery & XSL.
Git would work perfectly well - you would need to define and follow some practices around versioning, branching, tagging etc.
Exactly the same principles would work for SOA composites (that include BPEL).
I have an example right here: https://github.com/jvsingh/SOATestingWithCitrus
Yes, you may use Git for both OSB and BPEL projects. We are also using Git in our current project, but as far as comparing the jars is concerned, you cannot compare jars in Git.
So we check in the source code as well, to track changes and compare against previous versions.

Received a main jar file with other jar files that need to be in the classpath. What's the best way to include this main jar in my maven project?

I received a Java API from a client, and the main code is in main.jar. But the instructions he gave me require me to add these other jars (a.jar, b.jar, etc.) to the classpath whenever I want to use main.jar. These supporting jars are things like Xerces, jakarta-oro, and a few other publicly available libraries. The problem is that I don't know what versions they are, so I'm not sure whether there would be issues if I just update the pom.xml file in my app to depend on main.jar and also declare dependencies on the latest versions of these other jars.
What's the best strategy for using main.jar in my maven application? I'm planning on adding main.jar to our internal maven repository, but what should I do about the Xerces, jakarta-oro, and other jars when I don't know what versions they are?
Thanks
If you are lucky, the file /META-INF/MANIFEST.MF inside a.jar, b.jar etc. contains an "Implementation-Version" entry or some other useful information that tells you what version they are. If not, you can download the latest release(s) from the project web site and check whether they have the same file size as your bundled dependencies.
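The manifest check can be scripted. Below is a minimal sketch in Python that reads the main section of a jar's manifest and returns its attributes as a dict; read_main_manifest is a made-up helper name, and the parsing handles only the basics of the manifest format (one "Name: value" pair per line, continuation lines starting with a single space, main section ending at the first blank line).

```python
import zipfile


def read_main_manifest(jar_path):
    """Return the main-section attributes of META-INF/MANIFEST.MF as a dict.

    Raises KeyError if the jar has no manifest.
    """
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")

    attrs = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and key is not None:
            attrs[key] += line[1:]  # continuation of the previous value
        elif ":" in line:
            key, _, value = line.partition(":")
            attrs[key] = value.strip()
    return attrs
```

For example, read_main_manifest("a.jar").get("Implementation-Version") returns the version string if the library's build recorded one, and None otherwise.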
You may also come to the idea to bundle the dependencies with the main.jar in one big jar, but this may become funny, when you have the dependencies twice in your classpath at some point in the future...
What about just asking the client what version numbers these dependencies have?
If you don't have any information about these third-party libraries, just add them to src/resources/META-INF/lib and commit to SVN. That's the best way, if we're talking about black box approach.

Include another MSI file in my setup project

I'm trying to make a setup program for an ASP.NET web site. I need to make sure the target machine has sqlxml installed.
I must verify the target machine has the software installed, and if not, launch a .msi file either before or after the main installation.
I'm a complete newbie with setup projects, so maybe this is obvious, but after several hours browsing the web I haven't found a satisfactory solution. I've been reading about WiX, etc. but I'm looking (if possible) for a simple solution.
Thank you both!
I understand an installer can't run another one. I was thinking of functionality similar to Prerequisites (in project properties). There I can check a component and it will be installed automatically if it is missing. I don't need to do anything else. But the most important thing for me is that the installation won't run if it's not needed.
I also tried the .msm solution, but I couldn't find any. Maybe I can make one myself? I haven't tried it yet though.
Unfortunately, you can't run one installer from another, since only one can be running at a time. You need to chain them together and run one after the other. Google "msi chaining". This is often the reason why products like Visual Studio use an external setup.exe which then runs the installers one after the other.
Looks like you need to 'chain' the installs http://objectmix.com/xml-soap/84668-installing-sqlxml-net-app.html
You can get the redist here http://www.microsoft.com/downloads/details.aspx?FamilyID=51D4A154-8E23-47D2-A033-764259CFB53B&displaylang=en
Can you add this as a prerequisite for your install?
What are you using to build the install?
Edit:
I had a look to see how you can check whether SQLXML is installed and came across this:
http://www.tech-archive.net/Archive/SQL-Server/microsoft.public.sqlserver.xml/2005-04/msg00110.html
The system I am on just now has the following key: HKEY_CLASSES_ROOT \ SQLXMLX (note the X at the end), so you might need to do a bit more investigation into what the actual key is.
I'm not familiar with Visual Studio install authoring, but if you can add an entry to the AppSearch and RegLocator tables, you should be able to check for the existence of the registry key when the install starts. See here
http://msdn.microsoft.com/en-us/library/aa371564(VS.85).aspx
The RegLocator table gives you the option to set a property with a value from the registry if found. You can then use this in the condition on a custom action.
A lot to put together, but I hope it moves you in the right direction.
Brent's answer is correct. I would just add that, sometimes, you can find a "merge module" for the bits you depend on. That's a .msm file. You can certainly include 1 or more of those in your .msi file. I have no idea whether a merge module is available for SQLXML. Sorry.

Resources