Submit topology to Storm cluster through streamparse (JAR)

I am trying to use streamparse to develop and submit topologies to a Storm cluster.
Since streamparse ships with a default wordcount topology to help users test the cluster, most of the tutorials I could find online are about submitting this default wordcount example to a Storm cluster.
My question is: how do I submit my own topologies? For example, I have a topology named 'mytopology'. Per streamparse's documentation, I tried
sparse submit --environment prod --name mytopology
and my config file is
{
    "serializer": "json",
    "topology_specs": "topologies/",
    "virtualenv_specs": "virtualenvs/",
    "envs": {
        "prod": {
            "user": "userx",
            "ssh_password": "mypasswd",
            "nimbus": "10.XXX.XX.210",
            "workers": ["10.XXX.XX.206"],
            "log": {
                "path": "/home/userx/stormapp/splog",
                "max_bytes": 1000000,
                "backup_count": 10,
                "level": "info"
            },
            "virtualenv_root": "/home/userx/stormapp/venv"
        }
    }
}
However, the log showed
JAR created: _build/wordcount-0.0.1-SNAPSHOT.jar
so a wordcount JAR was created and submitted to Nimbus.
Isn't
--name mytopology
supposed to find mytopology.py, build something like mytopology.jar, and submit that?
Then I checked the project.clj file; its top line is
defproject wordcount "0.0.1-SNAPSHOT"
Now I am confused. Should I also configure this file? When I run
sparse submit --environment prod --name mytopology
does it do something related to this file? Please help...

I suppose that you first created your wordcount project using the following command: sparse quickstart wordcount
In this case, "wordcount" will be the name of the topology that is submitted to Storm when you run sparse submit.
Now if you want to submit another topology, say mytopology, you have to create another quickstart project called mytopology and edit its config.json file to suit your technical environment. You cannot just copy and rename the "wordcount" project's folder, as I guess you've done, because "wordcount" appears in your project.clj file.
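In other words, the flow for a new topology looks roughly like this (a sketch; the project name and environment come from your question, the rest is the standard quickstart layout):
sparse quickstart mytopology
cd mytopology
# edit config.json for your environment (nimbus host, workers, virtualenv paths, ...)
sparse submit --environment prod --name mytopology
The quickstart step should generate project.clj and a topology definition under topologies/ using the name mytopology, so the built JAR should be named accordingly rather than wordcount-0.0.1-SNAPSHOT.jar.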

Related

How to avoid Jupyter cell-ids from changing all the time and thereby spamming the VCS diffs?

As discussed in q/66678305, newer Jupyter versions store, in addition to the source code and output of each cell, an ID used e.g. for linking to a cell.
However, these IDs aren't stable; they often change even when the cell's source code wasn't touched. As a result, if you have the .ipynb file under version control with e.g. git, the commits end up with lots of rather funny-sounding "changed lines" that don't correspond to any actual change made in the commit. Like,
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "respected-breach",
+   "id": "incident-winning",
    "metadata": {},
    "outputs": [],
Is there a way to prevent this?
Answer for Git on Linux. Probably also works on macOS, but not Windows.
It is good practice not to put the .ipynb files as saved by Jupyter under version control, but instead a filtered version that does not contain all the volatile information. For this purpose, various git hooks are available; the one I'm using is based on https://github.com/toobaz/ipynb_output_filter/blob/master/ipynb_output_filter.py.
Strangely enough, it turns out this script cannot simply be modified to remove the "id" field from cells. Namely, if you try to remove that field in the filtering loop, like with
for field in ("prompt_number", "execution_number", "id"):
    if field in cell:
        del cell[field]
then the write function from jupyter_nbformat will just put an id back in. It is possible to merely change the id to something constant, but then Jupyter will complain about nonunique ids.
As a hack to circumvent this, I now use this filter with a simple grep to delete the ID:
#!/bin/bash
grep -v '^ *"id": "[a-z\-]*",$'
Store that in e.g. ~/bin/ipynb_output_filter.sh, make it executable (chmod +x ~/bin/ipynb_output_filter.sh) and ensure you have the following ~/.gitattributes file:
*.ipynb filter=dropoutput_ipynb
and in your git config (either global ~/.gitconfig or project)
[core]
    attributesfile = ~/.gitattributes
[filter "dropoutput_ipynb"]
    clean = ~/bin/ipynb_output_filter.sh
    smudge = cat
If you want to use a standard Python filter in addition to that, you can invoke it before the grep in ~/bin/ipynb_output_filter.sh, like
#!/bin/bash
~/bin/ipynb_output_filter.py | grep -v '^ *"id": "[a-z\-]*",$'
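To check that the filter is actually being applied, something like this can help (a small sketch; git check-attr and git diff are standard git commands, the notebook path is just an example):
git check-attr filter notebooks/example.ipynb
# expected output: notebooks/example.ipynb: filter: dropoutput_ipynb
git diff -- notebooks/example.ipynb   # lines dropped by the clean filter should no longer show up as changes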

Can we create cypress test suites by bundling different *_spec.js files together?

Current situation:
We are using Cypress for test automation. We have a folder named 'integration' which contains several 'spec' files. These spec files can contain one or more tests related to each other.
Problem:
I want to organize the Cypress test automation on Bamboo properly. What I want to do is have test suites, e.g.:
Playground_suite contains: 1) slide_tests_spec.js 2) teeter_totters_tests_spec.js ...
Road_suite contains: 1) car_tests_spec.js 2) truck_tests_spec.js ...
Then I have the option of running Playground_suite, which will only run the spec files defined in this suite.
Is this possible in Cypress? If yes, how? Please help.
We faced this same type of issue. What we came up with to solve it was the following:
00_suite.example.js:
import Test01 from './e2e_test01.example.js';
import Test02 from './e2e_test02.example.js';
import Test03 from './e2e_test03.example.js';
describe('Cypress_PreTest_Configuration', function() {
  console.log(Cypress.config());
});
// This is an example suite running tests in a specified order -
// All tests contained in each of these files will be run before the next
// file is processed.
describe('Example_E2E_Test_Suite', function() {
  Test01();
  Test02();
  Test03();
});
describe('Example_Reverse_Ordered_E2E_Test_Suite', function() {
  Test03();
  Test02();
  Test01();
});
The key in the actual test files is that they contain the "export default function() {}" option prior to the describe suite definition(s):
e2e_test01.example.js:
export default function() {
  describe('Example_Tests_01', function() {
    it('TC01 - Example Tiger Tests', function() {
      doNothingOne();
      console.log(this.test.parent.parent.title);
      cy.visit(this.commonURIs.loginURI);
    })
  })
}
When attempting to run the e2e_test*.example.js files within the Cypress UI, you will find that the UI reports that no tests were found. You have to execute the tests through the suite definition files. We approached this limitation by using the 'suite' approach only for E2E tests, while we use the standard spec files for regression and minimum acceptance testing.
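For what it's worth, such a suite file can also be run headlessly by pointing the Cypress CLI at it (a sketch; the path below is just an assumed location for the suite file):
npx cypress run --spec "cypress/integration/00_suite.example.js"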
I hope this example is helpful for you; perhaps someone else has another solution.

How to list subfolders in Artifactory

I'm trying to write a script which cleans up old builds in my generic file repository in Artifactory. I guess the first step would be to look in the repository and check which builds are in there.
Each build shows up as a subfolder of /foo, so for example I have folders /foo/123, /foo/124, /foo/125/, etc.
There doesn't seem to be an ls or dir command. So I tried the search command:
jfrog rt search my-repo/foo/*
But this recursively lists all files, which is not what I'm looking for. I just need the list of direct subfolders. I also tried
jfrog rt search my-repo/foo/* --recursive=false
but this doesn't return any results, because the search command only returns files, not folders.
How do I list the subfolders of a given folder in an Artifactory repository?
Just one more way to do it with curl and jq
curl -s http://myatifactory.domain:4567/artifactory/api/storage/myRepo/myFolder | jq -r '.children[] |select(.folder==true) |.uri'
Explanation: curl is used to get the folder info, and that is piped to jq, which then displays the uri key of every entry in the children array whose folder key has the value true.
For easier understanding, the JSON that curl gets back looks something like this (example from the Artifactory docs):
{
    "uri": "http://localhost:8081/artifactory/api/storage/libs-release-local/org/acme",
    "repo": "libs-release-local",
    "path": "/org/acme",
    "created": ISO8601 (yyyy-MM-dd'T'HH:mm:ss.SSSZ),
    "createdBy": "userY",
    "lastModified": ISO8601 (yyyy-MM-dd'T'HH:mm:ss.SSSZ),
    "modifiedBy": "userX",
    "lastUpdated": ISO8601 (yyyy-MM-dd'T'HH:mm:ss.SSSZ),
    "children": [
        {
            "uri" : "/child1",
            "folder" : "true"
        },{
            "uri" : "/child2",
            "folder" : "false"
        }
    ]
}
and for it the output of the command would be /child1.
Of course, this assumes that the Artifactory repo myRepo allows anonymous read.
You should have a look at AQL (Artifactory Query Language) here: https://www.jfrog.com/confluence/display/RTF/Artifactory+Query+Language
As an example, the following AQL will retrieve all folders located in "my-repo" under the "foo" folder and display the result ordered by folder name:
items.find(
    {
        "type":"folder",
        "repo":{"$eq":"my-repo"},
        "path":{"$eq":"foo"}
    }
)
.include("name")
.sort({"$desc":["name"]})
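If you want to run such a query directly rather than through the CLI, AQL queries can be POSTed as plain text to Artifactory's api/search/aql endpoint; a rough sketch, with host and credentials as placeholders:
# query.aql contains the items.find(...) query shown above
curl -u user:password \
     -X POST \
     -H "Content-Type: text/plain" \
     --data-binary @query.aql \
     "http://myartifactory.example.com/artifactory/api/search/aql"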
For cleanup, you can also have a look at the following example, which lists the 10 biggest artifacts created more than a month ago that have never been downloaded:
items.find(
    {
        "type":"file",
        "repo":{"$eq":"my-repo"},
        "created":{"$before":"1mo"},
        "stat.downloads":{"$eq":null}
    }
)
.include("size","name")
.sort({"$desc":["size"]})
.limit(10)
Based on jroquelaure's answer, I ended up with the following. The key thing that was still missing was that you have to convert the items.find call into JSON when putting it in a file spec. There is an example of that in the file spec documentation, which I missed at first.
I put this JSON in a test.aql file:
{
    "files":
    [
        {
            "aql":
            {
                "items.find":
                {
                    "type":"folder",
                    "repo":{"$eq":"my-repo"},
                    "path":{"$eq":"foo"}
                }
            }
        }
    ]
}
Then I call jfrog rt search --spec=test.aql.
The JFrog CLI now includes the --include-dirs option for search.
The command:
jf rt search --recursive=false --include-dirs path/
will essentially act like an ls.
By default, it searches for files; if you want to list directories, add one more option: --include-dirs.
Refer to the link for additional parameters: jfrog search
Here is the command:
jf rt search --recursive=false --include-dirs=true path/
Response:
[
    {
        "path": "artifactory-name/path",
        "type": "folder",
        "created": "",
        "modified": ""
    }
]
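Applied to the layout from the question, something along these lines should list just the direct subfolders of /foo (a sketch reusing the repo and path names from the question):
jf rt search --recursive=false --include-dirs=true my-repo/foo/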
A cleaner approach is to tell Artifactory about builds, and let it discard old ones.
There are 3 parts to this. My examples are for the jfrog command line utility:
1) When uploading files with the jfrog rt upload command, use the --build-name someBuildName and --build-number someBuildNumber arguments. This links the uploaded files to a certain build.
2) After uploading files, publish the build with jfrog rt build-publish someBuildName someBuildNumber.
3) To clean up all but the 3 latest builds, use jfrog rt build-discard --max-builds=3 someBuildName.
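Put together, the workflow looks roughly like this (a sketch; the build name, build number, upload pattern, and target path are placeholder values):
# 1) upload artifacts and associate them with a build
jfrog rt upload "build-output/*.tar.gz" my-repo/foo/ --build-name=someBuildName --build-number=42
# 2) publish the collected build information to Artifactory
jfrog rt build-publish someBuildName 42
# 3) keep only the 3 most recent builds, discarding older ones
jfrog rt build-discard --max-builds=3 someBuildName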

Debugging bitbake pkg_postinst_${PN}: append to a config file installed by another recipe

I'm writing an openembedded/bitbake recipe for openembedded-classic. My recipe RDEPENDS on keyutils, and everything seems to work, except one thing:
I want to append a single line to the /etc/request-key.conf file installed by the keyutils package. So I added the following to my recipe:
pkg_postinst_${PN} () {
    echo 'create ... more stuff ..' >> ${sysconfdir}/request-key.conf
}
However, the intended added line is missing in my resulting image.
My recipe inherits update-rc.d if that makes any difference.
My main question is: how do I debug this? Currently I am constructing an entire rootfs image and then poking around in it to see if the changes show up. Surely there is a better way?
UPDATE:
Changed recipe to:
pkg_postinst_${PN} () {
    echo 'create ... more stuff ...' >> ${D}${sysconfdir}/request-key.conf
}
But still no luck.
As far as I know, the postinst runs at rootfs creation time, and only runs at first boot if it failed at rootfs time.
So there is an easy way to execute something only at first boot: just check for $D, like this:
pkg_postinst_stuff() {
    #!/bin/sh -e
    if [ x"$D" = "x" ]; then
        : # do something at first boot here (':' is a no-op that keeps the empty branch valid)
    else
        exit 1
    fi
}
postinst scripts are run at rootfs time, so ${sysconfdir} is /etc on your host. Use $D${sysconfdir} to write to the file inside the rootfs being generated.
OE-Classic is pretty ancient so you really should update to oe-core.
That said, do postinsts run at first boot? I'm not sure. Also look in the recipe's work directory under the temp directory and read the log and run files to see if there are any clues there.
One more thing: if foo RDEPENDS on bar, that just means "when foo is installed, bar is also installed". I'm not sure it makes assertions about what is installed during the install phase, when your postinst is running.
If using $D doesn't fix the problem, try editing your postinst to copy the existing file you're trying to edit somewhere else, so you can verify that it exists in the first place. It's possible that you're appending to a file that doesn't exist yet and that the package that installs the file later replaces it.
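A rough sketch of that debugging idea (the request-key.conf path comes from the question; the extra file names are made up purely for inspection):
pkg_postinst_${PN} () {
    # debugging aid: check whether the target file already exists at postinst time
    if [ -e "$D${sysconfdir}/request-key.conf" ]; then
        # keep a copy of the pre-append state so it can be inspected in the image
        cp "$D${sysconfdir}/request-key.conf" "$D${sysconfdir}/request-key.conf.before-append"
        echo 'create ... more stuff ..' >> "$D${sysconfdir}/request-key.conf"
    else
        # leave a marker so a missing file is visible in the image
        touch "$D${sysconfdir}/request-key.conf.was-missing"
    fi
}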

Does sbt have something like gradle's processResources task with ReplaceTokens support?

We are moving to Scala/SBT from a Java/Gradle stack. Our Gradle builds were leveraging a task called processResources and an Ant filter named ReplaceTokens to dynamically replace tokens in a checked-in .properties file without actually changing the .properties file (just changing the output). The Gradle task looks like:
processResources {
    def whoami = System.getProperty( 'user.name' );
    def hostname = InetAddress.getLocalHost().getHostName()
    def buildTimestamp = new Date().format('yyyy-MM-dd HH:mm:ss z')
    filter ReplaceTokens, tokens: [
        "buildsig.version"    : project.version,
        "buildsig.classifier" : project.classifier,
        "buildsig.timestamp"  : buildTimestamp,
        "buildsig.user"       : whoami,
        "buildsig.system"     : hostname,
        "buildsig.tag"        : buildTag
    ]
}
This task locates all the template files in the src/main/resources directory, performs the requisite substitutions, and outputs the results at build/resources/main. In other words, it transforms src/main/resources/buildsig.properties from...
buildsig.version=#buildsig.version#
buildsig.classifier=#buildsig.classifier#
buildsig.timestamp=#buildsig.timestamp#
buildsig.user=#buildsig.user#
buildsig.system=#buildsig.system#
buildsig.tag=#buildsig.tag#
...to build/resources/main/buildsig.properties...
buildsig.version=1.6.5
buildsig.classifier=RELEASE
buildsig.timestamp=2013-05-06 09:46:52 PDT
buildsig.user=jenkins
buildsig.system=bobk-mbp.local
buildsig.tag=dev
Which, ultimately, finds its way into the WAR file at WEB-INF/classes/buildsig.properties. This works like a champ to record build specific information in a Properties file which gets loaded from the classpath at runtime.
What do I do in SBT to get something like this done? I'm new to Scala / SBT so please forgive me if this seems a stupid question. At the end of the day what I need is a means of pulling some information from the environment on which I build and placing that information into a properties file that is classpath loadable at runtime. Any insights you can give to help me get this done are greatly appreciated.
The sbt-buildinfo plugin is a good option. The README shows an example of how to define custom mappings and mappings that should run on each compile. In addition to the straightforward addition of normal settings like version shown there, you want a section like this:
buildInfoKeys ++= Seq[BuildInfoKey](
  "hostname" -> java.net.InetAddress.getLocalHost().getHostName(),
  "whoami" -> System.getProperty("user.name"),
  BuildInfoKey.action("buildTimestamp") {
    java.text.DateFormat.getDateTimeInstance.format(new java.util.Date())
  }
)
Would the following be what you're looking for?
sbt-editsource: An SBT plugin for editing files
sbt-editsource is a text substitution plugin for SBT 0.11.x and greater. In a way, it's a poor man's sed(1) for SBT. It provides the ability to apply line-by-line substitutions to a source text file, producing an edited output file. It supports two kinds of edits: variable substitution, where ${var} is replaced by a value, and sed-like regular expression substitution.
This is from Community Plugins.
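For illustration only (this is not the plugin's own syntax): the kind of transformation being discussed is essentially what a small sed pass over the question's buildsig.properties template would do, with the output path here chosen arbitrarily:
# substitute the #token# placeholders with values taken from the build environment
mkdir -p target
sed -e "s|#buildsig.user#|$(whoami)|" \
    -e "s|#buildsig.system#|$(hostname)|" \
    -e "s|#buildsig.timestamp#|$(date '+%Y-%m-%d %H:%M:%S %Z')|" \
    src/main/resources/buildsig.properties > target/buildsig.properties
Plugins like sbt-editsource (or sbt-buildinfo above) let you hook this kind of substitution into the build itself instead of scripting it externally.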
