sbt assembly: deduplicate module-info.class

I get the following error when assembling my uber jar:
java.lang.RuntimeException: deduplicate: different file contents found in the following:
[error] /Users/jake.stone/.ivy2/cache/org.bouncycastle/bcprov-jdk15on/jars/bcprov-jdk15on-1.61.jar:module-info.class
[error] /Users/jake.stone/.ivy2/cache/javax.xml.bind/jaxb-api/jars/jaxb-api-2.3.1.jar:module-info.class
I am not up to date with Java technology, but I assume I cannot simply discard one of these classes.
Can someone tell me which mergeStrategy I can use to safely build the uber jar?

The answer depends on your environment and what you want to achieve.
JDK 8
I had the same problem with a project using JDK 8. JDK 8 does not use the module-info.class file, so it is safe to discard it.
Add the following to your build.sbt:
assemblyMergeStrategy in assembly := {
  case x if x.endsWith("module-info.class") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
This simply discards the files.
JDK 11
If you use JDK 11 with an end-user project (not a library), discarding should also be safe: the uber-jar contains all classes and needs no external dependencies. I only verified this with a short test, not thoroughly enough to say it is always safe.
If you use JDK 11 to create a library, it is better not to create an uber-jar at all. Dropping module-info.class there will most likely produce a jar that does not work; in that case simply depend on the individual libraries.

In many libraries module-info.class no longer sits only at the jar root; multi-release jars also carry copies under META-INF/versions/, so a single top-level match is not enough. Here is the updated solution:
assembly / assemblyMergeStrategy := {
  case PathList("module-info.class") => MergeStrategy.last
  case path if path.endsWith("/module-info.class") => MergeStrategy.last
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}
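If you do not need the files at all, a variant that discards every copy of module-info.class (including the copies under META-INF/versions/ in multi-release jars) should work just as well; a minimal sketch, not taken from the original answer:
assembly / assemblyMergeStrategy := {
  // split the path on '/' and match the last segment, wherever the file sits
  case PathList(ps @ _*) if ps.last == "module-info.class" => MergeStrategy.discard
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}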

Related

Execute sbt task before packaging of fat-jar

I wrote a small sbt plugin that edits some resource files in the project's target directory (it works similarly to Maven profiles). Now that I have written and tested my simple custom sbt task (let's call it interpolateParameters), I want it to be executed between resource copying and jar creation when running sbt assembly. However, I can't find any documentation about which tasks are executed "under the hood" of the assembly task provided by the sbt-assembly plugin, and I doubt whether this is even possible.
Therefore, I have two questions: is it possible to somehow execute my task between sbt assembly's compile + copyResources and "create jar" steps? And if not, is there a way to achieve what I want without creating my own fork of the sbt-assembly plugin?
I solved this by making assembly depend on my task interpolateParameters, and interpolateParameters depend on products. Here is the relevant part of my resulting build.sbt file:
lazy val someModuleForFatJar = (project in file("some/path"))
  .dependsOn(
    someOtherModule % "test->test;compile->compile"
  )
  .settings(
    name := "some module name",
    sharedSettings,
    libraryDependencies ++= warehouseDependencies,
    mainClass in assembly := Some("com.xxxx.yyyy.Zzzz"),
    assemblyJarName in assembly := s"some_module-${version.value}.jar",
    assembly := assembly.dependsOn(interpolateParameters).value,
    interpolateParameters := interpolateParameters.dependsOn(products).value,
    (test in assembly) := {}
  )
Hope it can help someone.
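For reference, the underlying pattern is simply that any task can be re-wired to run after another one with dependsOn. A minimal self-contained sketch (the task name prepareResources is made up for illustration; sbt 1.x slash syntax):
lazy val prepareResources = taskKey[Unit]("Edits resource files before packaging")

prepareResources := {
  // make sure compiled products (and copied resources) exist first
  val _ = (Compile / products).value
  streams.value.log.info("interpolating parameters...")
}

// re-wire assembly so it always runs prepareResources first
assembly := assembly.dependsOn(prepareResources).value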

How to publish an artifact with pom-packaging in SBT?

I have a multi-project build in SBT where some projects should aggregate dependencies and contain no code. So then clients could depend on these projects as a single dependency instead of directly depending on all of their aggregated dependencies. With Maven, this is a common pattern, e.g. when using Spring Boot.
In SBT, I figured I can suppress the generation of the empty artifacts by adding this setting to these projects:
packagedArtifacts := Classpaths.packaged(Seq(makePom)).value
However, the makePom task writes <packaging>jar</packaging> in the generated POM. But now that there is no JAR anymore, this should read <packaging>pom</packaging> instead.
How can I do this?
This question is a bit old, but I just came across the same issue and found a solution. The original answer does point to the right page where this info can be found, but here is an example. It uses the pomPostProcess setting to transform the generated POM right before it is written to disk. Essentially, we loop over all the XML nodes, looking for the element we care about and then rewrite it.
import scala.xml.{Node => XmlNode, NodeSeq => XmlNodeSeq, _}
import scala.xml.transform._

pomPostProcess := { node: XmlNode =>
  val rule = new RewriteRule {
    override def transform(n: XmlNode): XmlNodeSeq = n match {
      case e: Elem if e != null && e.label == "packaging" =>
        <packaging>pom</packaging>
      case _ => n
    }
  }
  new RuleTransformer(rule).transform(node).head
}
Maybe you could modify the resulting POM as described here: Modifying the generated POM
You can disable publishing the default artifacts of JAR, sources, and docs, then opt in explicitly to publishing the POM. sbt produces and publishes a POM only, with <packaging>pom</packaging>.
// This project has no sources; I want <packaging>pom</packaging> with dependencies
lazy val bundle = project
  .dependsOn(moduleA, moduleB)
  .settings(
    publishArtifact := false, // disable jar, sources, docs
    publishArtifact in makePom := true
  )

lazy val moduleA = project
lazy val moduleB = project
lazy val moduleC = project
Run sbt bundle/publishM2 to verify the POM in ~/.m2/repository.
I dare say this is almost intuitive, a rare moment of pleasant surprise with sbt 😅
I confirmed this with current sbt 1.3.9 and with 1.0.1, the oldest launcher I happen to have installed on my machine.
The Artifacts page in the reference docs may be helpful, perhaps this trick should be added there.

OutOfMemoryError creating fat jar with sbt assembly

We are trying to make a fat jar file containing one small Scala source file and a ton of dependencies (a simple MapReduce example using Spark and Cassandra):
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import com.datastax.spark.connector._
import org.apache.spark.SparkConf

object VMProcessProject {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .set("spark.cassandra.connection.host", "127.0.0.1")
      .set("spark.executor.extraClassPath", "C:\\Users\\SNCUser\\dataquest\\ScalaProjects\\lib\\spark-cassandra-connector-assembly-1.3.0-M2-SNAPSHOT.jar")
    println("got config")
    val sc = new SparkContext("spark://US-L15-0027:7077", "test", conf)
    println("Got spark context")
    val rdd = sc.cassandraTable("test_ks", "test_col")
    println("Got RDDs")
    println(rdd.count())
    val newRDD = rdd.map(x => 1)
    val count1 = newRDD.reduce((x, y) => x + y)
  }
}
We do not have a build.sbt file, instead putting jars into a lib folder and source files in the src/main/scala directory and running with sbt run. Our assembly.sbt file looks as follows:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
When we run sbt assembly we get the following error message:
...
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: java heap space
at java.util.concurrent...
We're not sure how to change the JVM settings to increase the memory, since we are using sbt assembly to make the jar. Also, if there is something egregiously wrong with how we are writing the code or building our project, that would help us out a lot too; there have been so many headaches trying to set up a basic Spark program!
sbt is essentially a Java process, so you can tune the sbt runtime heap size to deal with the OutOfMemory issue.
For 0.13.x, the default memory options sbt uses are:
-Xms1024m -Xmx1024m -XX:ReservedCodeCacheSize=128m -XX:MaxPermSize=256m
You can enlarge the heap size by doing something like:
sbt -J-Xms2048m -J-Xmx2048m assembly
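If you don't want to type the flags on every invocation, the sbt runner script also reads a .sbtopts file from the project root (one option per line; -J-prefixed options are passed to the JVM), for example:
-J-Xms2048m
-J-Xmx2048m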
I was including Spark as an unmanaged dependency (putting the jar file in the lib folder), which used a lot of memory because it is a huge jar.
Instead, I made a build.sbt file which included Spark as a provided dependency.
Secondly, I created the environment variable JAVA_OPTS with the value -Xms256m -Xmx4g, which sets the minimum heap size to 256 megabytes while allowing the heap to grow to a maximum of 4 gigabytes. These two changes combined allowed me to create a jar file with sbt assembly.
More info on provided dependencies:
https://github.com/sbt/sbt-assembly
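For illustration, marking a dependency as provided in build.sbt looks like this (the Spark coordinates and version here are placeholders, not taken from the question):
// excluded from the fat jar, but available at compile time
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"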
I met this issue before. In my environment, setting JAVA_OPTS didn't work; the command below did:
set SBT_OPTS="-Xmx4G"
sbt assembly
There is no out-of-memory issue anymore.
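On Linux or macOS the equivalent, assuming a POSIX shell, would be:
export SBT_OPTS="-Xmx4G"
sbt assembly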
This works for me:
sbt -mem 2000 "set test in assembly := {}" assembly

Adding /etc/<application> to the classpath in sbt-native-packager for debian:package-bin

So I'm using packageArchetype.java_server and have set up my mappings so the files from "src/main/resources" go into the "/etc/<application>" folder of the Debian package. I'm using "sbt debian:package-bin" to create the package.
The trouble is that when I use "sbt run" it picks up src/main/resources from the classpath. What's the right way to get sbt-native-packager to put /etc/<application> on the classpath for my configuration and logging files?
plugins.sbt:
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "0.7.0-M2")
build.sbt
...
packageArchetype.java_server
packageDescription := "Some Description"
packageSummary := "My App Daemon"
maintainer := "Me<me#example.org>"
mappings in Universal ++= Seq(
file("src/main/resources/application.conf") -> "conf/application.conf",
file("src/main/resources/logback.xml") -> "conf/logback.xml"
)
....
I took a slightly different approach. Since sbt-native-packager keeps those two files (application.conf and logback.xml) in my package distribution jar file, I really just wanted a way to overwrite (or merge) these files from /etc. I kept the two mappings above and just added the following:
src/main/templates/etc-default:
-Dmyapplication.config=/etc/${{app_name}}/application.conf
-Dlogback.configurationFile=/etc/${{app_name}}/logback.xml
Then within my code (using the Typesafe Config library):
lazy val baseConfig = ConfigFactory.load() // defaults from src/main/resources

// for use with the Debian packaging script (see etc-default)
val systemConfig = Option(System.getProperty("myapplication.config")) match {
  case Some(cfile) => ConfigFactory.parseFile(new File(cfile)).withFallback(baseConfig)
  case None => baseConfig
}
And of course -Dlogback.configurationFile is a system property used by Logback.
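To try the same override by hand outside the package, you can pass the properties directly (the app name myapp below is a placeholder):
java -Dmyapplication.config=/etc/myapp/application.conf \
     -Dlogback.configurationFile=/etc/myapp/logback.xml \
     -jar myapp.jar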

How do you develop an SBT project, itself?

Background: I've got a Play 2.0 project, and I am trying to add something to do aspectj weaving using aspects in a jar on some of my classes (Java). (sbt-aspectj doesn't seem to do it, or I can't see how). So I need to add a custom task, and have it depend on compile. I've sort of figured out the dependency part. However, because I don't know exactly what I'm doing, yet, I want to develop this using the IDE (I'm using Scala-IDE). Since sbt projects (and therefore Play projects) are recursively defined, I assumed I could:
Add the eclipse plugin to the myplay/project/project/plugins.sbt
Add the sbt main jar (and aspectj jar) to myplay/project/project/build.sbt:
libraryDependencies ++= Seq(
  "org.scala-sbt" % "main" % "0.12.2",
  "aspectj" % "aspectj-tools" % "1.0.6"
)
Drop into the myplay/project
Run sbt, run the eclipse task, then import the project into eclipse as a separate project.
I can do this, though the build.scala (and other scala files) aren't initially considered source, and I have to fiddle with the build path a bit. However, even though I've got the sbt main defined for the project, both the Eclipse IDE and the compile task give errors:
> compile
[error] .../myplay/project/Build.scala:2: not found: object keys
[error] import keys.Keys._
[error] ^
[error] .../myplay/project/SbtAspectJ.scala:2: object Configurations is not a member of package sbt
[error] import sbt.Configurations.Compile
[error] ^
[error] .../myplay/project/SbtAspectJ.scala:3: object Keys is not a member of package sbt
[error] import sbt.Keys._
[error] ^
[error] three errors found
The Eclipse project shows neither main nor aspectj-tools in its referenced libraries. However, if I give it a bogus version (e.g. 0.12.4), reload fails, so it appears to be using the dependency.
So,...
First: Is this the proper way to do this?
Second: If so, why aren't the libs getting added.
(Third: please don't let this be something dumb that I missed.)
If you are getting the object Keys is not a member of package sbt error, then you should check that you are running sbt from the base directory, and not the /project directory.
sbt-aspectj
sbt-aspectj doesn't seem to do it, or I can't see how.
I think this is the real issue. There's a plugin already that does the work, so try making it work instead of fiddling with the build. Using plugins from build.scala is a bit tricky.
Luckily there are sample projects on github:
import sbt._
import sbt.Keys._
import com.typesafe.sbt.SbtAspectj.{ Aspectj, aspectjSettings, compiledClasses }
import com.typesafe.sbt.SbtAspectj.AspectjKeys.{ binaries, compileOnly, inputs, lintProperties }

object SampleBuild extends Build {
  ....

  // precompiled aspects
  lazy val tracer = Project(
    "tracer",
    file("tracer"),
    settings = buildSettings ++ aspectjSettings ++ Seq(
      // stop after compiling the aspects (no weaving)
      compileOnly in Aspectj := true,
      // ignore warnings (we don't have the sample classes)
      lintProperties in Aspectj += "invalidAbsoluteTypeName = ignore",
      // replace regular products with compiled aspects
      products in Compile <<= products in Aspectj
    )
  )
}
How do you develop an SBT project, itself?
If you're still interested in hacking on the build, the first place to go is the Getting Started guide. Specifically, your question should be answered in the .scala Build Definition page.
I think you want your build to utilize "aspectj" % "aspectj-tools" % "1.0.6". If so, it should be included in myplay/project/plugins.sbt, and your code should go into myplay/project/common.scala or something similar. If you want to use an IDE, you may have better luck building it as an sbt plugin; that way your code goes into src/main/scala. Check out sbt/sbt-aspectj or sbt/sbt-assembly for examples of sbt plugin structure.
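For reference, a minimal sbt plugin skeleton looks roughly like this (AutoPlugin style, available since sbt 0.13.5; all names here are illustrative and not taken from sbt-aspectj). The plugin project itself needs sbtPlugin := true in its build.sbt:
import sbt._
import sbt.Keys._

object MyAspectjPlugin extends AutoPlugin {
  object autoImport {
    val weave = taskKey[Unit]("Weaves precompiled aspects into the compiled classes")
  }
  import autoImport._

  override def projectSettings: Seq[Setting[_]] = Seq(
    weave := {
      // make sure compilation has run before weaving
      val _ = (compile in Compile).value
      val classes = (classDirectory in Compile).value
      streams.value.log.info(s"would weave classes under $classes")
    }
  )
}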
