How to pass an external jar through the command line while running MapReduce? - hadoop-streaming

I use nltk within a Python MapReduce program and use the command below to execute it.
I have found that I am not able to pass nltk correctly along with the command. Could anyone tell me the correct syntax? Thanks.

Let me attempt an answer; please get back to me if it doesn't work for you.
You could try the following. Since you are already using the -file option to ship Mapper.py, passing just -mapper Mapper.py should be enough. If you need the classes inside nltk.jar on the classpath, try -libjars instead of -archives.
hadoop jar /usr/lib/gphd/hadoop-mapreduce-2.0.2_alpha_gphd_2_0_1_0/hadoop-streaming-2.0.2-alpha-gphd-2.0.1.0.jar \
-libjars senti-data/nltk.jar \
-file senti-data/traintweets.csv \
-file senti-data/stopwords.txt \
-file /home/cduser/senti-data/Mapper.py \
-mapper Mapper.py \
-input senti-data/inputtweets.txt \
-output output
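A note on the Python side: -libjars puts a jar on the Java classpath, which the Python interpreter running Mapper.py does not see. If the goal is for the mapper to import the package as Python code, one common approach (a different technique from the command above, shown only as a sketch) is to ship a zip archive next to the script and add it to sys.path before importing. The archive name nltk.zip and the helper names below are assumptions for illustration, and whether nltk imports cleanly from a zip depends on the version and on which data files it needs.

```python
import os
import sys


def add_archive_to_path(archive_name, search_dir="."):
    """Put a zip archive from the task's working directory on sys.path."""
    archive_path = os.path.abspath(os.path.join(search_dir, archive_name))
    if not os.path.isfile(archive_path):
        raise FileNotFoundError(archive_path)
    if archive_path not in sys.path:
        sys.path.insert(0, archive_path)
    return archive_path


def map_lines(lines, tokenize):
    """Turn input lines into tab-separated token<TAB>1 records."""
    for line in lines:
        for token in tokenize(line.strip()):
            yield "%s\t1" % token


def main():
    # "nltk.zip" is a hypothetical archive name, shipped with an extra -file option.
    add_archive_to_path("nltk.zip")
    # Assumption: the zipped package is importable without unpacking.
    from nltk.tokenize import wordpunct_tokenize
    for record in map_lines(sys.stdin, wordpunct_tokenize):
        print(record)
```

In Mapper.py you would call main() at the bottom of the file; the import is deferred until after the archive is on sys.path so that it can be resolved from the zip.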

Related

How to fix asdf error when using buildapp on a quicklisp project

I've been making my first quicklisp project lately and I wanted to share it. I've put it on github, but not everyone has emacs + slime + quicklisp installed so I wanted to make an executable I could put with the code.
To do this I'm using buildapp and following the steps laid out in this stackoverflow answer.
$ sbcl --no-userinit --no-sysinit --non-interactive \
--load ~/quicklisp/setup.lisp \
--eval '(ql:quickload "ltk-colorpicker")' \
--eval '(ql:write-asdf-manifest-file "quicklisp-manifest.txt")'
$ buildapp --output out \
--manifest-file quicklisp-manifest.txt \
--load-system ltk-colorpicker \
--entry colorpicker
After running those commands I get the following error:
Fatal INPUT-ERROR-IN-LOAD:
READ error during LOAD:
The symbol "*SYSTEM-DEFINITION-SEARCH-FUNCTIONS*" is not external in the ASDF/FIND-SYSTEM package.
Line: 16, Column: 90, File-Position: 15267
Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/nathan/quicklisp/local-projects/ltk-colorpicker/dumper-2SKVI5f7.lisp" {1001B70F83}>
The main problem here is that I don't have a clue how to begin to fix it. I've seen this GitHub issue, but that had to do with problems with Homebrew and it never even mentions buildapp. It's all very confusing, and I hope I can get some help.
Thanks in advance for any answers.
I can reproduce the error. As suggested in the comments, you can build an up-to-date version of buildapp as follows:
$ sbcl
* (ql:quickload :buildapp)
...
* (buildapp:build-buildapp
(merge-pathnames "bin/buildapp" (user-homedir-pathname)))
This builds $HOME/bin/buildapp. When I use the new binary, the error no longer occurs.
You can also avoid generating an executable (that can end up being outdated) by systematically calling the buildapp::main function from Common Lisp; you will then always have the version that corresponds to the current release of quicklisp:
* (buildapp::main
'("BUILDAPP" ;; argv[0] must exist but the value is not important
"--manifest-file" "/tmp/quicklisp-manifest.txt"
"--load-system" "drakma" "--output" "/tmp/test"))
Some extra info from my point:
The solution was to use the newest version of buildapp, as @coredump mentioned. I updated by going to the GitHub page, downloading the zip, and running the following commands in the directory where buildapp was extracted.
$ make
$ cp buildapp /usr/bin
(This of course only works on linux.)
This is not an elegant solution, but buildapp hasn't been updated in 4 years, so I think it's a safe enough bet. I also made a mistake with the command. The --entry part is wrong. It should have been `--entry ltk-colorpicker::main`, where main is a function that takes one argument, since that's required by the spec.
Main is just this: (defun main (i) (declare (ignore i)) (colorpicker))

Is it possible to use javapackager on ZuluFX for Mac

I was able to use ZuluFX 8 with javapackager on Windows. However, on a Mac I get this error:
Bundler Mac Application Image skipped because of a configuration problem: Cannot determine which JRE/JDK exists in the specified runtime directory.
Advice to fix: Point the runtime directory to one of the JDK/JRE root, the Contents/Home directory of that root, or the Contents/Home/jre directory of the JDK.
It's pretty easy to just move the package into Contents/Home but I doubt that will work as it seems there is no JRE bundled with the Mac version of ZuluFX 8. Is this something that can be worked around?
"It's pretty easy to just move the package into Contents/Home but I doubt that will work as it seems there is no JRE bundled with the Mac version of ZuluFX 8."
From what I'm seeing, I'm not sure that's correct. The ZuluFx 8 archive for Mac contains a jre directory. I extracted the archive to ~/zuluFX and from there created the Contents/Home directory as required by MacOS and added a symbolic link to said jre directory there. I then set $JAVA_HOME accordingly:
$ pwd
/Users/cody/zuluFX
$ mkdir -p Contents/Home
$ cd Contents/Home
$ ln -s ../../jre .
$ export JAVA_HOME=~/zuluFX
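For anyone scripting this setup, the directory layout from the steps above can be sketched in Python as follows. The function name is hypothetical, and the sketch only recreates the Contents/Home/jre symbolic link described in this answer; it does not check what javapackager itself accepts as a runtime directory.

```python
import os


def link_jre_into_contents_home(root):
    """Create <root>/Contents/Home/jre as a relative symlink to <root>/jre."""
    home = os.path.join(root, "Contents", "Home")
    os.makedirs(home, exist_ok=True)
    link = os.path.join(home, "jre")
    # The relative target is resolved from Contents/Home, so ../../jre is <root>/jre.
    if not os.path.lexists(link):
        os.symlink(os.path.join("..", "..", "jre"), link)
    return link
```

Calling link_jre_into_contents_home(os.path.expanduser("~/zuluFX")) would correspond to the shell commands above, with JAVA_HOME still set separately.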
Then I utilized a simple javapackager example on github to test its usage (I have no other JREs/JDKs installed on this box). The example app simply dumps Java properties and environment variables in a TextArea.
I had to modify the 3build script in the example to comment out its attempt to re-set $JAVA_HOME, but otherwise, it builds successfully, with the following javapackager command:
javapackager \
-deploy -Bruntime=${JAVA_HOME} \
-native image \
-srcdir . \
-srcfiles MacJavaPropertiesApp.jar \
-outdir release \
-outfile ${APP_DIR_NAME} \
-appclass MacJavaPropertiesApp \
-name "MacJavaProperties" \
-title "MacJavaProperties" \
-nosign \
-v
When I launch the resulting app, it reports the usage of the azul/zulu jre as expected.

How to generate some models for java with OpenApi Generator?

I successfully generated a REST client in Java from a Swagger/OpenAPI v2.0 spec using OpenAPI Generator CLI 3.3.2-SNAPSHOT.
But I already have a REST client, so I just want to generate some models from the spec.
It succeeds when I run:
java -Dmodels -DmodelDocs=false \
-jar modules/openapi-generator-cli/target/openapi-generator-cli.jar generate \
-i swagger.json \
-g java \
-o /temp/my_models
But when I want to generate just specific models with
java -Dmodels=Body,Header -DmodelDocs=false \
-jar modules/openapi-generator-cli/target/openapi-generator-cli.jar generate \
-i swagger.json \
-g java \
-o /temp/my_selected_models
I'm getting this ERROR:
[main] INFO o.o.c.languages.AbstractJavaCodegen - Environment
variable JAVA_POST_PROCESS_FILE not defined so the Java code may not
be properly formatted. To define it, try 'export
JAVA_POST_PROCESS_FILE="/usr/local/bin/clang-format -i"' (Linux/Mac)
What is this JAVA_POST_PROCESS_FILE and how can I specify a valid format to generate the models?
Why does code generation succeed with all models but fail with a subset?
That message is just informational. It aims to inform you that there's a way to auto-format the auto-generated Java code by specifying an environment variable with the auto code formatter (clang-format in this case):
export JAVA_POST_PROCESS_FILE="/usr/local/bin/clang-format -i"
In other words, it does not affect the code generation process if the environment variable is not specified.
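If you drive the generator from a script, the same invocation can be assembled as an argument list, which also sidesteps shell line-continuation mistakes. This is only a sketch: the function names are hypothetical, and the flags are exactly the ones that appear in the commands in this question, not a complete description of the generator's options.

```python
import os


def build_generator_command(jar, spec, out_dir, models=None):
    """Build the argv list for the generator invocation shown in the question."""
    # A bare -Dmodels is what the question used to generate all models;
    # a comma-separated value selects specific ones.
    models_flag = "-Dmodels" if not models else "-Dmodels=" + ",".join(models)
    return [
        "java", models_flag, "-DmodelDocs=false",
        "-jar", jar, "generate",
        "-i", spec,
        "-g", "java",
        "-o", out_dir,
    ]


def build_environment(post_process=None, base_env=None):
    """Copy the environment, optionally setting JAVA_POST_PROCESS_FILE."""
    env = dict(os.environ if base_env is None else base_env)
    if post_process:
        env["JAVA_POST_PROCESS_FILE"] = post_process
    return env
```

The two results can then be passed to subprocess.run(command, env=env, check=True). Leaving post_process unset reproduces the situation in the question, where only the informational message is printed.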

Running PySpark Kafka streaming example fails with an error

When I tried to run the example code for spark-streaming, kafka_wordcount.py,
under the folder /usr/local/spark/examples/src/main/python/streaming,
I followed the instructions given in the code itself, which say to execute it as:
$ bin/spark-submit --jars \
external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar \
examples/src/main/python/streaming/kafka_wordcount.py \
localhost:2181 test
Here test is the topic name. But I cannot find the jar at the path:
external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar
So instead I created a folder "streaming/jar/" and put all the jars from
http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-kafka-assembly_2.10%22 into it. Then when I run
spark-submit --jars ~/stream-example/jars/spark-streaming-kafka-assembly_*.jar kafka_wordcount.py localhost:2181 topic
it shows
Error: No main class set in JAR; please specify one with --class
Run with --help for usage help or --verbose for debug output
What is wrong with that? Where are the jars?
Many thanks!
This question was asked long ago, so I assume you have figured it out by now.
But, as I just had the same problem, I will post the solution that worked for me.
The deployment section of this guide (http://spark.apache.org/docs/latest/streaming-kafka-integration.html) says you can pass the library with the --packages argument, as below:
bin/spark-submit \
--packages org.apache.spark:spark-streaming-kafka_2.10:1.6.2 \
examples/src/main/python/streaming/kafka_wordcount.py \
localhost:2181 test
You can also download the jar itself here: http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-kafka-assembly_2.10%22
Note: I didn't run the command above; I tested with this other example, but it should work the same way:
bin/spark-submit \
--packages org.apache.spark:spark-streaming-kafka_2.10:1.6.2 \
examples/src/main/python/streaming/direct_kafka_wordcount.py \
localhost:9092 test
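The value given to --packages is a Maven coordinate whose Scala and Spark versions should generally match your installation. A small sketch of assembling it follows; the helper names are hypothetical, and the artifact naming pattern is simply the one used in the commands above, so other Spark releases may use a different artifact name.

```python
def kafka_package(scala_version, spark_version):
    """Maven coordinate in the pattern used by the commands above."""
    return "org.apache.spark:spark-streaming-kafka_%s:%s" % (
        scala_version, spark_version)


def spark_submit_command(script, script_args, scala_version, spark_version):
    """Argument list equivalent to the spark-submit invocations above."""
    return [
        "bin/spark-submit",
        "--packages", kafka_package(scala_version, spark_version),
        script,
    ] + list(script_args)
```

For example, spark_submit_command("examples/src/main/python/streaming/kafka_wordcount.py", ["localhost:2181", "test"], "2.10", "1.6.2") reproduces the first command in this answer as a list suitable for subprocess.run.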

How to set QT_QPA_PLATFORM_PLUGIN_PATH properly (concept)?

I have Qt Creator and Qt 5.5 installed.
QT_QPA_PLATFORM_PLUGIN_PATH = C:\Qt\5.5\msvc2013\plugins
If I disable the environment var, I do get an error when I launch an application from QtC. So the variable seems to be required.
My problem is:
When I run other Qt-based applications (e.g. Teamspeak), they fail; I always have to disable (delete) QT_QPA_PLATFORM_PLUGIN_PATH first.
When I use Kits in QtC and switch between Qt versions (e.g. 5.4, 5.6), the variable is not in sync with the selected version.
How is this supposed to work?
The best solution I have found so far is to set it on the QtC Project page for that specific build
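The same idea, scoping the variable to one process rather than setting it system-wide, can be sketched with a small launcher script. The helper name is hypothetical; the point is only that the variable goes into a copy of the environment handed to the child process, so other Qt applications started normally never see it.

```python
import os


def env_with_plugin_path(plugin_dir, base_env=None):
    """Return a copy of the environment with QT_QPA_PLATFORM_PLUGIN_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_QPA_PLATFORM_PLUGIN_PATH"] = plugin_dir
    return env
```

A launcher could then start the application with subprocess.Popen([path_to_app], env=env_with_plugin_path(plugin_dir)), where path_to_app and plugin_dir are placeholders for your own paths, and keep one such launcher per Qt version.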
Here is the solution that worked for me:
In the Windows 10 search box, enter sysdm.cpl.
Go to Advanced -> Environment Variables -> System Variables and add to:
PATH
C:\Users\~\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\pyqt5_tools\plugins\platforms\ (your path to qminimal.dll, qoffscreen.dll, qwebgl.dll)
I took the DLLs from the official site: https://www.riverbankcomputing.com/software/pyqt/download5
