I have an sbt project with Spark dependencies. These dependencies are provided at runtime, so I import them under the provided scope.
val hadoop = Seq("org.apache.hadoop" % "hadoop-client" % "3.3.1" % Provided)
val spark = Seq(
  "org.apache.spark" %% "spark-core" % SparkVersion % Provided,
  "org.apache.spark" %% "spark-sql" % SparkVersion % Provided,
  "org.apache.spark" %% "spark-mllib" % SparkVersion % Provided
)
lazy val coreDto = (project in file("xxxx"))
  .enablePlugins(BuildInfoPlugin)
  .enablePlugins(PackPlugin)
  .settings(
    name := "xxxx",
    moduleName := "xxxx",
    version := "1.0",
    libraryDependencies ++= (hadoop ++ spark)
  )
All is well so far. Now I have a new scenario: I have to publish the jar to our Maven repository, and I am able to publish it successfully. The issue is the provided scope: the compile-time dependencies are not set appropriately in the published POM.
Question: how do I configure the build so that the provided scope is ignored during publishing, but still honored when I package? (One possible approach is sketched after the publish settings below.)
I am using the following to publish, in case it's helpful:
lazy val publishSettings = Seq(
  publishMavenStyle := true,
  publishTo := {
    val url = "https://xxxxl/maven/v1/"
    if (version.value.trim.toUpperCase.endsWith("SNAPSHOT"))
      Some("snapshots".at(url))
    else
      Some("releases".at(url))
  },
  aetherDeploy / logLevel := Level.Info,
  aetherOldVersionMethod := true,
  credentials += Credentials(Path.userHome / ".sbt" / ".credentials")
)
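One approach worth trying (a sketch on my part, not verified against this exact build): keep the dependencies in Provided scope so packaging is unaffected, and post-process the generated POM so the published artifact declares them at compile scope. sbt's pomPostProcess hook lets you rewrite the POM XML before it is published:

import scala.xml.{Elem, Node => XmlNode}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Rewrite <scope>provided</scope> to <scope>compile</scope> in the
// published POM; local packaging (pack/assembly) is unaffected.
pomPostProcess := { (node: XmlNode) =>
  new RuleTransformer(new RewriteRule {
    override def transform(n: XmlNode): Seq[XmlNode] = n match {
      case e: Elem if e.label == "scope" && e.text == "provided" =>
        <scope>compile</scope>
      case other => other
    }
  }).transform(node).head
}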
I'm writing a simple Scala project for DL4J. I need to switch between CUDA (for training) and the native backend (for production). I seem to have a problem using the native backend in an assembled jar. I get the error below:
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.datavec.api.util.ndarray.RecordConverter.toMinibatchArray(RecordConverter.java:197)
at org.deeplearning4j.datasets.datavec.RecordReaderMultiDataSetIterator.next(RecordReaderMultiDataSetIterator.java:159)
at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:364)
at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:439)
at simplecuda.main$.delayedEndpoint$simplecuda$main$1(main.scala:37)
at simplecuda.main$delayedInit$body.apply(main.scala:27)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at simplecuda.main$.main(main.scala:27)
at simplecuda.main.main(main.scala)
Caused by: java.lang.RuntimeException: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: http://nd4j.org/getstarted.html
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5449)
at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:213)
... 15 more
Caused by: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: http://nd4j.org/getstarted.html
at org.nd4j.linalg.factory.Nd4jBackend.load(Nd4jBackend.java:213)
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5446)
... 16 more
My build file is:
name := "simplecuda"
version := "1.0"
scalaVersion := "2.11.8"
// looks like you need to remove ~/.ivy2/cache and ~/.javacpp/cache whenever you switch between platforms
classpathTypes += "maven-plugin"
libraryDependencies ++= Seq(
  "org.scalactic" %% "scalactic" % "3.0.5",
  "org.scalatest" %% "scalatest" % "3.0.5" % "test",
  // "org.nd4j" % "nd4j-cuda-9.2-platform" % "1.0.0-beta2",
  // "org.deeplearning4j" % "deeplearning4j-cuda-9.2" % "1.0.0-beta2"
  "org.nd4j" % "nd4j-native-platform" % "1.0.0-beta2",
  "org.deeplearning4j" % "deeplearning4j-core" % "1.0.0-beta2"
)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
When I visit http://nd4j.org/getstarted.html to learn about the NoAvailableBackendException, I see that build.sbt should have the line below:
classpathTypes += "maven-plugin"
I've included this in the above build.sbt, without any luck. After looking at the Gradle instructions I tried adding the "org.bytedeco.javacpp-presets" % "openblas" % "0.2.20-1.3" classifier "linux-x86_64" dependency, and this also did not help.
I've tried removing ~/.javacpp/cache and ~/.ivy2/cache multiple times without any luck. The repo with this example is at https://github.com/tomlue/dl4j_scala_troubleshoot
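One thing worth checking (an assumption on my part, not confirmed in the original post): the assemblyMergeStrategy above discards all of META-INF, which also throws away the META-INF/services registration files that nd4j's ServiceLoader-based backend discovery relies on. A sketch of a merge strategy that keeps them:

// Keep ServiceLoader registrations (META-INF/services/*) that nd4j
// uses to discover its backend; discard the rest of META-INF as before.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}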
I want to publish a Scala library with sbt using sbt-pgp 0.8. I've registered groupId org.bitbucket.sergey_kozlov at Sonatype.
My build.sbt:
organization := "org.bitbucket.sergey_kozlov"
name := "playingcards"
version := "0.1-SNAPSHOT"
publishMavenStyle := true
publishTo := {
  val nexus = "https://oss.sonatype.org/"
  if (isSnapshot.value)
    Some("snapshots" at nexus + "content/repositories/snapshots")
  else
    Some("releases" at nexus + "service/local/staging/deploy/maven2")
}
publishArtifact in Test := false
pomIncludeRepository := { _ => false }
pomExtra :=
  <url>https://bitbucket.org/sergey_kozlov/playingcards</url>
  <licenses>
    <license>
      <name>The MIT License</name>
      <url>http://www.opensource.org/licenses/mit-license.php</url>
      <distribution>repo</distribution>
    </license>
  </licenses>
  <scm>
    <url>https://bitbucket.org/sergey_kozlov/playingcards.git</url>
    <connection>scm:git:ssh://git@bitbucket.org/sergey_kozlov/playingcards.git</connection>
  </scm>
  <developers>
    <developer>
      <id>skozlov</id>
      <name>Sergey Kozlov</name>
      <email>mail.sergey.kozlov@gmail.com</email>
      <roles>
        <role>architect</role>
        <role>developer</role>
      </roles>
    </developer>
  </developers>
libraryDependencies += "junit" % "junit" % "4.11"
libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.0" % "test"
There's also ~/.sbt/0.13/plugins/gpg.sbt:
addSbtPlugin("com.typesafe.sbt" % "sbt-pgp" % "0.8")
No other files under the project/ directory contribute to the build definition.
When I enter publishSigned in the sbt console, I get the following error:
[error] (*:publishSigned) java.io.IOException: Access to URL https://oss.sonatype.org/content/repositories/snapshots/playingcards/playingcards_2.10/0.1-SNAPSHOT/playingcards_2.10-0.1-SNAPSHOT-sources.jar was refused by the server: Forbidden
Note that the URL does not contain organization.
How can I publish my artifact correctly?
As you pointed out, the URL is missing the organization property, which is why you get this error. Run show organization in the sbt console to verify that the organization property is set correctly. If that doesn't help, try defining your project explicitly in sbt and setting the organization property there:
lazy val core = (project in file(".")).settings(
organization := "org.bitbucket.sergey_kozlov"
//other properties here
)
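For reference, assuming the standard Maven repository layout (where groupId dots become path segments), once the organization is picked up the refused URL should instead look like:

https://oss.sonatype.org/content/repositories/snapshots/org/bitbucket/sergey_kozlov/playingcards_2.10/0.1-SNAPSHOT/playingcards_2.10-0.1-SNAPSHOT-sources.jar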
This is a pretty noob question.
I'm trying to learn about SparkSQL. I've been following the example described here:
http://spark.apache.org/docs/1.0.0/sql-programming-guide.html
Everything works fine in the Spark-shell, but when I try to use sbt to build a batch version, I get the following error message:
object sql is not a member of package org.apache.spark
Unfortunately, I'm rather new to sbt, so I don't know how to correct this problem. I suspect that I need to include additional dependencies, but I can't figure out how.
Here is the code I'm trying to compile:
/* TestApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

case class Record(k: Int, v: String)

object TestApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext._
    val data = sc.parallelize(1 to 100000)
    val records = data.map(i => new Record(i, "value = " + i))
    val table = createSchemaRDD(records, Record)
    println(">>> " + table.count)
  }
}
The error is flagged on the line where I try to create a SQLContext.
Here is the content of the sbt file:
name := "Test Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
Thanks for the help.
As is often the case, the act of asking the question helped me figure out the answer. The answer is to add the following line to the sbt file:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.0.0"
I also realized there is an additional problem in the little program above. There are too many arguments in the call to createSchemaRDD. That line should read as follows:
val table = createSchemaRDD(records)
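Putting both fixes together, the relevant part of the program (a sketch using the names from the listing above) becomes:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._ // brings the implicit createSchemaRDD into scope

val records = data.map(i => Record(i, "value = " + i))
val table = createSchemaRDD(records) // one argument, not two
println(">>> " + table.count)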
Thanks! I ran into a similar problem while building a Scala app in Maven. Based on what you did with SBT, I added the corresponding Maven dependencies as follows and now I am able to compile and generate the jar file.
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>1.2.1</version>
</dependency>
I ran into a similar issue. In my case, I had copy-pasted the sbt setup below from somewhere online, with scalaVersion := "2.10.4", but my environment actually has Scala 2.11.8.
I updated the version and executed sbt package again, and the issue was fixed (the aligned setup is sketched after the snippet).
name := "Test Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
Has anyone published an sbt-native-packager-produced artifact (a tgz in my case) to a Nexus repo using sbt-aether-deploy? (I need this for timestamped snapshots, specifically the "correct" version tag in Nexus's artifact-resolution REST resource.)
I can do one or the other, but I can't figure out how to add the packagedArtifacts in Universal to the artifacts that sbt-aether-deploy deploys, so that both are published.
I suspect the path to pursue is either to addArtifact() the packagedArtifacts in Universal, or to create another AetherArtifact and then override/replace the deploy task to use that AetherArtifact?
Any help much appreciated.
I am the author of the sbt-aether-deploy plugin, and I just came across this post. You can attach the package produced by sbt-native-packager to the main artifact like this:
import aether.AetherKeys._

crossPaths := false // needed if you want to remove the Scala version from the artifact name

enablePlugins(JavaAppPackaging)

aetherArtifact := {
  val artifact = aetherArtifact.value
  artifact.attach((packageBin in Universal).value, "dist", "zip")
}
This will also publish the main artifact.
If you want to disable publishing of the main artifact, you will need to rewrite the artifact coordinates, because Maven requires a main artifact.
I have added a way to replace the main artifact for this purpose, but I can now see that that approach is somewhat flawed: it still assumes the artifact is published as a jar file. The main artifact type is locked down to jar, since the POM packaging is set to jar by default by sbt.
If this is an app, that limitation is probably OK, since Maven will never resolve it into an artifact.
The "proper" way in Maven terms is to add a classifier to the artifact and change the packaging in the POM file to "pom". We will see if I get around to changing that particular part.
OK, amazingly enough I think I got it. If there's a better way to do it, I'd love to hear it. I'm not loving that blind Option.get there...
val tgzCoordinates = SettingKey[MavenCoordinates]("the maven coordinates for the tgz")

lazy val myPackagerSettings = packageArchetype.java_application ++ deploymentSettings ++ Seq(
  publish <<= publish.dependsOn(publish in Universal),
  publishLocal <<= publishLocal.dependsOn(publishLocal in Universal)
)

lazy val defaultSettings = buildSettings ++ Publish.settings ++ Seq(
  scalacOptions in Compile ++= Seq("-encoding", "UTF-8", "-target:jvm-1.7", "-deprecation", "-feature", "-unchecked", "-Xlog-reflective-calls"),
  testOptions in Test += Tests.Argument("-oDF")
)

lazy val myAetherSettings = aetherSettings ++ aetherPublishBothSettings

lazy val toastyphoenixProject = Project(
  id = "toastyphoenix",
  base = file("."),
  settings = defaultSettings ++ myPackagerSettings ++ myAetherSettings ++ Seq(
    name in Universal := name.value + "_" + scalaBinaryVersion.value,
    packagedArtifacts in Universal ~= { _.filterNot { case (artifact, file) => artifact.`type`.contains("zip") } },
    libraryDependencies ++= Dependencies.phoenix,
    tgzCoordinates := MavenCoordinates(organization.value + ":" + (name in Universal).value + ":tgz:" + version.value).get,
    aetherArtifact <<= (tgzCoordinates, packageZipTarball in Universal, makePom in Compile, packagedArtifacts in Universal) map {
      (coords: MavenCoordinates, mainArtifact: File, pom: File, artifacts: Map[Artifact, File]) =>
        createArtifact(artifacts, pom, coords, mainArtifact)
    }
  )
)
I took Peter's solution and reworked it slightly, avoiding the naked Option.get by creating the MavenCoordinates directly:
import aether.MavenCoordinates
import aether.Aether.createArtifact
name := "mrb-test"
organization := "me.mbarton"
version := "1.0"
crossPaths := false
packageArchetype.java_application
publish <<= (publish) dependsOn (publish in Universal)
publishLocal <<= (publishLocal) dependsOn (publishLocal in Universal)
aetherPublishBothSettings
aetherArtifact <<= (organization, name in Universal, version, packageBin in Universal, makePom in Compile, packagedArtifacts in Universal) map {
  (organization, name, version, binary, pom, artifacts) =>
    val nameWithoutVersion = name.replace(s"-$version", "")
    createArtifact(artifacts, pom, MavenCoordinates(organization, nameWithoutVersion, version, None, "zip"), binary)
}
The nameWithoutVersion replace works around sbt-native-packager including the version in the artifact name:
Before: me/mbarton/mrb-test-1.0/1.0/mrb-test-1.0.zip
After: me/mbarton/mrb-test/1.0/mrb-test-1.0.zip
crossPaths := false avoids the Scala version suffix on the artifact name.