This example shows an extract of a much bigger output file:
xmlstarlet fo jira-output.xml | egrep 'hours|username|worklog|work_date' | egrep -v 'external|time' | head -20
Gives me approximately this:
#<worklogs date_from="2014-06-01 00:00:00" date_to="2014-06-30 23:59:59" number_of_worklogs="222" format="xml" diffOnly="false" errorsOnly="false" validOnly="false" addBillingInfo="false" addIssueSummary="false" addIssueDescription="false" duration_ms="106" headerOnly="false" userName="" addIssueDetails="false" addParentIssue="false" addUserDetails="false" addWorklogDetails="false" billingKey="" issueKey="" projectKey="">
# <worklog>
# <worklog_id>15650</worklog_id>
# <hours>0.11666667</hours>
# <work_date>2014-06-07</work_date>
# <username>cadalso</username>
# </worklog>
# <worklog>
# <worklog_id>15653</worklog_id>
# <hours>0.2</hours>
# <work_date>2014-06-07</work_date>
# <username>cadalso</username>
# </worklog>
# <worklog>
# <worklog_id>15941</worklog_id>
# <hours>4.0</hours>
# <work_date>2014-06-17</work_date>
# <username>mrjcleaver</username>
# </worklog>
# <worklog>
#</worklogs>
This executes nicely, totalling
xmlstarlet sel -T -t -v "sum(worklogs/worklog/hours)" --nl jira-output.xml
This total is different, but only because the XML file has many more rows in it:
4.31666667
But the following
xmlstarlet sel -T -t -m /worklogs/worklog/worklog_id -v "concat('|',/worklogs/worklog/staff_id,' | ', /worklogs/worklog/worklog_id,' | ',/worklogs/worklog/work_date,' | ',/worklogs/worklog/hours,' |')" --nl jira-output.xml
Shows:
#| cadalso | 15650 | 2014-06-07 | 0.11666667 |
#| cadalso | 15650 | 2014-06-07 | 0.11666667 |
#| cadalso | 15650 | 2014-06-07 | 0.11666667 |
#... one for each row, but with the wrong values
Whereas what I want would be:
#| cadalso | 15650 | 2014-06-07 | 0.11666667 |
#| cadalso | 15653 | 2014-06-07 | 0.2 |
#| mrjcleaver | 15941 | 2014-06-17 | 4.0 |
What am I doing wrong?
Thanks, M.
Big thanks to npostavs, the answer was:
xmlstarlet sel -T -t -m /worklogs/worklog -v "concat('|',staff_id,' | ', worklog_id,' | ',work_date,' | ',hours,' |')" --nl jira-output.xml
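Why this works: a path that starts with / is evaluated from the document root, so inside concat() it always collapses to the first matching node in the whole file. Paths without the leading / are evaluated relative to the node currently matched by -m. The same idea as a minimal Python sketch, assuming the standard xml.etree.ElementTree module; the rows() helper and the inline SAMPLE are made up for illustration, and username is used because staff_id does not appear in the extract above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the three worklogs shown in the extract.
SAMPLE = """<worklogs>
  <worklog><worklog_id>15650</worklog_id><hours>0.11666667</hours>
    <work_date>2014-06-07</work_date><username>cadalso</username></worklog>
  <worklog><worklog_id>15653</worklog_id><hours>0.2</hours>
    <work_date>2014-06-07</work_date><username>cadalso</username></worklog>
  <worklog><worklog_id>15941</worklog_id><hours>4.0</hours>
    <work_date>2014-06-17</work_date><username>mrjcleaver</username></worklog>
</worklogs>"""

FIELDS = ("username", "worklog_id", "work_date", "hours")


def rows(xml_text):
    """Return one pipe-delimited row per <worklog> element."""
    root = ET.fromstring(xml_text)
    out = []
    # Iterate over each worklog, then look up fields RELATIVE to it,
    # which is what the corrected xmlstarlet command does with -m.
    for worklog in root.findall("worklog"):
        values = [worklog.findtext(tag, default="") for tag in FIELDS]
        out.append("| " + " | ".join(values) + " |")
    return out


if __name__ == "__main__":
    for row in rows(SAMPLE):
        print(row)
```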
I have a dataset in which I paste values in a dplyr chain and collapse with the pipe character (e.g. " | "). If any of the values in the dataset are blank, I just get recurring pipe characters in the pasted list.
Some of the values look like this, for example:
badstring = "| | | | | | GHOULSBY,SCROGGINS | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CAT,JOHNSON | | | | | | | | | | | | BURGLAR,PALA | | | | | | | | |"
I want to match all the pipes that occur more than once and delete them, so that just the names appear like so:
correctstring = "| GHOULSBY,SCROGGINS | CAT,JOHNSON | BURGLAR,PALA |"
I tried the following, but to no avail:
mutate(names = gsub('[\\|]{2,}', '', name_list))
The difficulty in this question is in formulating a regex which can selectively remove every pipe, except the ones we want to remain as actual separators between terms. We can match on the following pattern:
\|\s+(?=\|)
and then replace each match with the empty string. This pattern removes any pipe (and the whitespace following it) as long as what follows is another pipe. No removal occurs when a pipe is followed by an actual term, or when it is followed by the end of the string.
badstring = "| | | | | | GHOULSBY,SCROGGINS | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CAT,JOHNSON | | | | | | | | | | | | BURGLAR,PALA | | | | | | | | |"
result <- gsub("\\|\\s+(?=\\|)", "", badstring, perl=TRUE)
result
[1] "| GHOULSBY,SCROGGINS | CAT,JOHNSON | BURGLAR,PALA |"
Edit:
If you expect to have inputs like | | | which are devoid of any terms, and you would expect empty string as the output, then my solution would fail. I don't see an obvious way to modify the above regex, but you can handle this case with one more call to sub:
result <- sub("^\\|$", "", result)
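We might also try to fold this into the original pattern with an alternation:
result <- gsub("\\|\\s+(?=\\|)|(?:^\\|$)", "", badstring, perl=TRUE)
but this only helps when the input is exactly a single pipe. gsub matches against the original string, so for | | | the final pipe is not at ^ and survives. The two-step version with the extra sub call is the safer choice.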
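To check the behaviour outside R, here is a minimal Python sketch of the two-step cleanup (the lookahead replacement followed by the lone-pipe check). It assumes Python's re module treats this lookahead the same way PCRE does, which holds for a pattern this simple; the function name collapse_pipes and the shortened sample string are made up for illustration.

```python
import re


def collapse_pipes(text):
    """Remove repeated empty pipe separators, keeping one pipe between terms."""
    # Step 1: drop every pipe (plus its trailing whitespace) that is
    # immediately followed by another pipe.
    text = re.sub(r"\|\s+(?=\|)", "", text)
    # Step 2: an input with no terms at all collapses to a single pipe;
    # turn that leftover into the empty string.
    return re.sub(r"^\|$", "", text)


if __name__ == "__main__":
    bad = "| | | GHOULSBY,SCROGGINS | | | | CAT,JOHNSON | | BURGLAR,PALA | | |"
    print(collapse_pipes(bad))
    print(repr(collapse_pipes("| | |")))
```

Keeping the second step separate avoids relying on ^ inside a global replacement, where it can only match at the very start of the original input.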
I'm trying to test my RNG (written in VHDL, with the numbers stored in a binary file) with the RDieHarder package. But when I call the dieharder() function twice from R, this leads to
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', inputfile = 'rand.bin')
# file_input_raw(): Error. This cannot happen.
[user@host ~]$
and drops me to my shell.
Setup
Die Harder Version:
$ dieharder -h
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
...
R Version:
$ R --version
R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
RDieHarder Version:
> packageVersion('RDieHarder')
[1] ‘0.1.3’
MWE
To reproduce this, we first generate an appropriate binary file with dieharder and run the die hard tests on it.
$ dieharder -o -O0 -f rand.bin -t 5000000 && dieharder -g file_input_raw -d diehard_runs -f rand.bin
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input_raw| rand.bin| 4.49e+07 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
# The file file_input_raw was rewound 4 times
diehard_runs| 0| 100000| 100|0.77169947| PASSED
diehard_runs| 0| 100000| 100|0.68299332| PASSED
All tests passed, fine. So we switch to R, do the same thing again, and assume we want some more tests, like the STS Runs test.
> library('RDieHarder')
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', verbose = TRUE, inputfile = 'rand.bin')
Dieharder called with gen=201 test=15 seed=2852951401
# 10000000 rands were used in this test
# The file file_input_raw was rewound 2 times
#==================================================================
# Diehard Runs Test
# This is the RUNS test. It counts runs up, and runs down,
# in a sequence of uniform [0,1) variables, obtained by float-
# ing the 32-bit integers in the specified file. This example
# shows how runs are counted: .123,.357,.789,.425,.224,.416,.95
# contains an up-run of length 3, a down-run of length 2 and an
# up-run of (at least) 2, depending on the next values. The
# covariance matrices for the runs-up and runs-down are well
# known, leading to chisquare tests for quadratic forms in the
# weak inverses of the covariance matrices. Runs are counted
# for sequences of length 10,000. This is done ten times. Then
# repeated.
#
# In Dieharder sequences of length tsamples = 100000 are used by
# default, and 100 p-values thus generated are used in a final
# KS test.
#==================================================================
# Run Details
# Random number generator tested: file_input_raw
# File rand.bin contains 5000000 rands of type.
# Samples per test pvalue = 100000 (test default is 100000)
# P-values in final KS test = 100 (test default is 100)
#==================================================================
# Histogram of p-values
##################################################################
# Counting histogram bins, binscale = 0.100000
# 20| | | | | | | | | | |
# | | | | | | | | | | |
# 18| | | | | | | | | | |
# | | | | | | | | | | |
# 16| | | | | | | | | | |
# | | | | | | | | | | |
# 14|****| | | | | | | |****| |
# |****| | | | | | | |****| |
# 12|****| |****| |****| |****| |****| |
# |****| |****| |****| |****| |****| |
# 10|****| |****| |****|****|****| |****| |
# |****| |****| |****|****|****| |****| |
# 8|****| |****|****|****|****|****|****|****| |
# |****| |****|****|****|****|****|****|****| |
# 6|****| |****|****|****|****|****|****|****|****|
# |****| |****|****|****|****|****|****|****|****|
# 4|****|****|****|****|****|****|****|****|****|****|
# |****|****|****|****|****|****|****|****|****|****|
# 2|****|****|****|****|****|****|****|****|****|****|
# |****|****|****|****|****|****|****|****|****|****|
# |--------------------------------------------------
# | 0.1| 0.2| 0.3| 0.4| 0.5| 0.6| 0.7| 0.8| 0.9| 1.0|
#==================================================================
# 10000000 rands were used in this test
# The file file_input_raw was rewound 2 times
#==================================================================
# Histogram of p-values
##################################################################
# Counting histogram bins, binscale = 0.100000
# 20| | | | | | | | | | |
# | | | | | | | | | | |
# 18| | | | | | | | | | |
# | | | | | | | | | | |
# 16| | | | | |****| | |****| |
# | | | | | |****| | |****| |
# 14| | | | | |****| | |****| |
# | | | | | |****| | |****| |
# 12| |****| | | |****| | |****| |
# | |****| | | |****| | |****| |
# 10|****|****| | |****|****| | |****| |
# |****|****| | |****|****| | |****| |
# 8|****|****|****| |****|****|****| |****|****|
# |****|****|****| |****|****|****| |****|****|
# 6|****|****|****|****|****|****|****|****|****|****|
# |****|****|****|****|****|****|****|****|****|****|
# 4|****|****|****|****|****|****|****|****|****|****|
# |****|****|****|****|****|****|****|****|****|****|
# 2|****|****|****|****|****|****|****|****|****|****|
# |****|****|****|****|****|****|****|****|****|****|
# |--------------------------------------------------
# | 0.1| 0.2| 0.3| 0.4| 0.5| 0.6| 0.7| 0.8| 0.9| 1.0|
#==================================================================
# 10000000 rands were used in this test
# The file file_input_raw was rewound 2 times
> summary(dh)
Diehard Runs Test
data: Created by RNG `file_input_raw' with seed=0, sample of size 100
p-value = 0.7717
Summary for test data
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.03151 0.23500 0.49970 0.49850 0.75030 0.94720
Stem and leaf plot for test data
The decimal point is 1 digit(s) to the left of the |
0 | 33667788889999
1 | 3399
2 | 0011333355
3 | 0022222266
4 | 222222446688
5 | 229999
6 | 0000224477778899
7 | 225555
8 | 0022446666888888
9 | 003355
NULL
One-sample Kolmogorov-Smirnov test
data: object$data
D = 0.06947, p-value = 0.6936
alternative hypothesis: two-sided
Wilcoxon signed rank test with continuity correction
data: object$data
V = 2507, p-value = 0.952
alternative hypothesis: true location is not equal to 0.5
Warning:
In ks.test(object$data, "punif", 0, 1, exact = TRUE) :
ties should not be present for the Kolmogorov-Smirnov test
> # calling dieharder() again yields an error
> dh <- dieharder(rng = 'file_input_raw', test = 'sts_runs', verbose = TRUE, inputfile = 'rand.bin')
Dieharder called with gen=201 test=101 seed=3570434269
# file_input_raw(): Error. This cannot happen.
[user@host ~]$
So if "This cannot happen", why does it happen?
I've removed my previous installation of dieharder (the dieharder package from the AUR) and built it from this repository, which is from the maintainer of RDieHarder. Now everything works fine.
EDIT
Switching from the package to the GitHub repository doesn't really solve the problem. It seems the error still occurs whenever the binary file needs to be rewound.
MWE
If we generate two new files
[user@host ~]$ dieharder -o -O0 -f rand_a.bin -t 10000000 && dieharder -g file_input_raw -d diehard_runs -f rand_a.bin
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input_raw| rand_a.bin| 4.61e+07 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
# The file file_input_raw was rewound 1 times
diehard_runs| 0| 100000| 100|0.75253825| PASSED
diehard_runs| 0| 100000| 100|0.62883583| PASSED
[user@host ~]$ dieharder -o -O0 -f rand_b.bin -t 5000000 && dieharder -g file_input_raw -d diehard_runs -f rand_b.bin
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input_raw| rand_b.bin| 4.35e+07 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
# The file file_input_raw was rewound 2 times
diehard_runs| 0| 100000| 100|0.01105242| PASSED
diehard_runs| 0| 100000| 100|0.48222156| PASSED
and run the same test with RDieHarder, we get the following (I don't know why the second run doesn't need to rewind the file):
> library('RDieHarder')
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', verbose = FALSE, inputfile = 'rand_a.bin')
# The file file_input_raw was rewound 1 times
# 10000000 rands were used in this test
# The file file_input_raw was rewound 1 times
# 10000000 rands were used in this test
# The file file_input_raw was rewound 1 times
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', verbose = FALSE, inputfile = 'rand_a.bin')
# 10000000 rands were used in this test
# The file file_input_raw was rewound 0 times
# 10000000 rands were used in this test
# The file file_input_raw was rewound 0 times
and with the shorter one
> library('RDieHarder')
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', verbose = FALSE, inputfile = 'rand_b.bin')
# The file file_input_raw was rewound 2 times
# 10000000 rands were used in this test
# The file file_input_raw was rewound 2 times
# 10000000 rands were used in this test
# The file file_input_raw was rewound 2 times
> dh <- dieharder(rng = 'file_input_raw', test = 'diehard_runs', verbose = FALSE, inputfile = 'rand_b.bin')
# file_input_raw(): Error. This cannot happen.
[user@host ~]$
dropping us back to the shell.
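Since the crash seems tied to rewinding, it can help to check up front whether a given file is large enough for a test. A minimal Python sketch, assuming the raw file stores 32-bit integers (as the test description above says) and judging only from the two runs shown, where a file holding exactly as many rands as the test used was still reported as rewound once. The helper names are made up for illustration and are not part of dieharder or RDieHarder.

```python
import os

# Assumption: file_input_raw reads 32-bit (4-byte) integers.
BYTES_PER_RAND = 4


def rands_in_file(path):
    """Number of whole 32-bit values stored in the raw file."""
    return os.path.getsize(path) // BYTES_PER_RAND


def may_need_rewind(path, rands_used):
    """Guess whether a test consuming `rands_used` values will rewind.

    Uses <= because, in the runs above, the 10,000,000-rand file was
    reported as rewound once by a test that used 10,000,000 rands.
    """
    return rands_in_file(path) <= rands_used


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        path, used = sys.argv[1], int(sys.argv[2])
        print(rands_in_file(path), may_need_rewind(path, used))
```

For example, running it with rand_b.bin and 10000000 (the count printed in "# 10000000 rands were used in this test") should flag that file as one that will be rewound.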
I have a project built with make, but I want to shift to scons.
However, I could not link object files in scons, so I want to know how to change working directory in scons.
What I exactly want is
make -C $(OBJECTDIRECTORY) -f $(SOURCEDIRECTORY)./Makefile InternalDependency
This is one line from my Makefile, and works well.
However, when scons builds my project, it does
x86_64-pc-linux-ld -o build/kernel32/kernel32.elf -melf_i386 -T scripts/elf_i386.x -nostdlib -e main -Ttext 0x10200 build/kernel32/asmUtils.o build/kernel32/cpu.o build/kernel32/main.o build/kernel32/memory.o build/kernel32/pageManager.o build/kernel32/utils.o
and gets an error:
x86_64-pc-linux-ld: cannot find main.o
Even when I run the same command manually in the shell, I get the same error.
However, if I move to build/kernel32 and manually run
x86_64-pc-linux-ld -o kernel32.elf -melf_i386 -T ../../elf_i386.x -nostdlib -e main -Ttext 0x10200 main.o cpu.o memory.o pageManager.o utils.o asmUtils.o
it works.
My assumption is that ld cannot link the object files when it is run from an upper directory.
So, is there any way to get the equivalent of Make's "-C" option?
Or is there any other workaround in scons?
Here are my SConscript and SConstruct files.
In the project root directory,
#SConstruct
build_dir = 'build'
# Build
SConscript(['src/SConscript'], variant_dir = build_dir, duplicate = 0)
# Clean
Clean('.', build_dir)
In src directory
#SConscript for src
SConscript(['bootloader/SConscript',
'kernel32/SConscript'])
In kernel32 directory
#SConscript for kernel32
import os, sys
# Build entry
env_entry = Environment(tools=['default', 'nasm'])
target_entry = 'entry.bin'
object_entry = 'entry.s'
output_entry = env_entry.Object(target_entry, object_entry)
# Compile CPP
env_gpp_options = {
    'CXX' : 'x86_64-pc-linux-g++',
    'CXXFLAGS' : '-std=c++11 -g -m32 -ffreestanding -fno-exceptions -fno-rtti',
    'LINK' : 'x86_64-pc-linux-ld',
    'LINKFLAGS' : '-melf_i386 -T scripts/elf_i386.x -nostdlib -e main -Ttext 0x10200',
}
env_gpp = Environment(**env_gpp_options)
env_gpp.Append(ENV = {'PATH' : os.environ['PATH']})
object_cpp_list = Glob('*.cpp')
for object_cpp in object_cpp_list:
    env_gpp.Object(object_cpp)
# Compile ASM
env_nasm = Environment(tools=['default', 'nasm'])
env_nasm.Append(ASFLAGS='-f elf32')
object_nasm_list = Glob('*.asm')
for object_nasm in object_nasm_list:
    env_nasm.Object(object_nasm)
# Find all object file
object_target_list = Glob('*.o')
object_target_list.append('entry.bin')
# Linking
env_link_target = 'kernel32.elf'
env_gpp.Program(env_link_target, object_target_list)
Please let me know. Thank you.
The log for "--tree=prune" is
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
x86_64-pc-linux-ld -o build/kernel32/kernel32.elf -melf_i386 -T scripts/elf_i386.x -nostdlib -e main -Ttext 0x10200 build/kernel32/asmUtils.o build/kernel32/cpu.o build/kernel32/main.o build/kernel32/memory.o build/kernel32/pageManager.o build/kernel32/utils.o build/kernel32/entry.bin
+-.
+-SConstruct
+-build
| +-src/SConscript
| +-build/bootloader
| | +-src/bootloader/BootLoader.asm
| | +-src/bootloader/SConscript
| | +-build/bootloader/bootloader.bin
| | +-src/bootloader/BootLoader.asm
| | +-/usr/bin/nasm
| +-build/kernel32
| +-src/kernel32/SConscript
| +-src/kernel32/asmUtils.asm
| +-build/kernel32/asmUtils.o
| | +-src/kernel32/asmUtils.asm
| | +-/usr/bin/nasm
| +-src/kernel32/cpu.cpp
| +-build/kernel32/cpu.o
| | +-src/kernel32/cpu.cpp
| | +-src/kernel32/cpu.hpp
| | +-src/kernel32/types.hpp
| | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-g++
| +-build/kernel32/entry.bin
| | +-src/kernel32/entry.s
| | +-/usr/bin/nasm
| +-src/kernel32/entry.s
| +-build/kernel32/kernel32.elf
| | +-[build/kernel32/asmUtils.o]
| | +-[build/kernel32/cpu.o]
| | +-build/kernel32/main.o
| | | +-src/kernel32/main.cpp
| | | +-src/kernel32/cpu.hpp
| | | +-src/kernel32/memory.hpp
| | | +-src/kernel32/types.hpp
| | | +-src/kernel32/utils.hpp
| | | +-src/kernel32/pageManager.hpp
| | | +-src/kernel32/page.hpp
| | | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-g++
| | +-build/kernel32/memory.o
| | | +-src/kernel32/memory.cpp
| | | +-src/kernel32/memory.hpp
| | | +-src/kernel32/pageManager.hpp
| | | +-src/kernel32/page.hpp
| | | +-src/kernel32/types.hpp
| | | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-g++
| | +-build/kernel32/pageManager.o
| | | +-src/kernel32/pageManager.cpp
| | | +-src/kernel32/pageManager.hpp
| | | +-src/kernel32/page.hpp
| | | +-src/kernel32/types.hpp
| | | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-g++
| | +-build/kernel32/utils.o
| | | +-src/kernel32/utils.cpp
| | | +-src/kernel32/utils.hpp
| | | +-src/kernel32/types.hpp
| | | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-g++
| | +-[build/kernel32/entry.bin]
| | +-/home/xaliver/BuildTools/cross/bin/x86_64-pc-linux-ld
| +-src/kernel32/main.cpp
| +-[build/kernel32/main.o]
| +-src/kernel32/memory.cpp
| +-[build/kernel32/memory.o]
| +-src/kernel32/pageManager.cpp
| +-[build/kernel32/pageManager.o]
| +-src/kernel32/utils.cpp
| +-[build/kernel32/utils.o]
+-src
+-src/SConscript
+-src/bootloader
| +-src/bootloader/BootLoader.asm
| +-src/bootloader/SConscript
+-src/kernel32
+-src/kernel32/SConscript
+-src/kernel32/asmUtils.asm
+-src/kernel32/cpu.cpp
+-src/kernel32/cpu.hpp
+-src/kernel32/entry.s
+-src/kernel32/main.cpp
+-src/kernel32/memory.cpp
+-src/kernel32/memory.hpp
+-src/kernel32/page.hpp
+-src/kernel32/pageManager.cpp
+-src/kernel32/pageManager.hpp
+-src/kernel32/types.hpp
+-src/kernel32/utils.cpp
+-src/kernel32/utils.hpp
scons: building terminated because of errors.
Is there a command line tool that takes lines of delimiter-separated values and arranges them in a SQL-style table? E.g.,
id,name
1,apple
2,banana
3,yogurt
into
 id |  name
----+---------
  1 | apple
  2 | banana
  3 | yogurt
With Perl and a format statement:
Input file:
$ cat file.csv
id,name
1,apple
2,banana
3,yogurt
Code:
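$ cat ./format-STDIN.pl
#!/usr/bin/env perl
use strict; use warnings;

sep();                       # top border
while (<>) {
    $. == 2 and sep();       # rule between the header and the data rows
# The picture line and the closing dot must start in column 0.
format STDOUT =
|@<< | @<<<<<<<<<<<|
split /,/
.
    write;
}
sep();                       # bottom border
sub sep{ print "+----+-------------+\n"; }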
Output:
$ ./format-STDIN.pl file.csv
+----+-------------+
|id  | name        |
+----+-------------+
|1   | apple       |
|2   | banana      |
|3   | yogurt      |
+----+-------------+
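The Perl format hard-codes the column widths. If the widths should follow the data instead, the same layout can be computed in two passes: read all rows, take the widest cell per column, then pad. A minimal Python sketch of that idea, assuming every row has the same number of columns; the function name format_table and the exact spacing are my own choices for illustration, not the output of any existing tool.

```python
import csv
import io


def format_table(text, delimiter=","):
    """Render delimiter-separated text as a simple SQL-style table."""
    # Pass 1: parse all rows (skipping empty lines) to learn column widths.
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return ""
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def render(row):
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        return (" " + " | ".join(cells)).rstrip()

    # Pass 2: header, separator rule, then the data rows.
    rule = "+".join("-" * (width + 2) for width in widths)
    lines = [render(rows[0]), rule] + [render(row) for row in rows[1:]]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table("id,name\n1,apple\n2,banana\n3,yogurt\n"))
```

Reading everything before printing costs memory on very large inputs, but it is what allows the columns to line up without fixed widths.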
I've got a bunch of .txt files that look like this:
# title: I Got Stripes
# artist: Johnny Cash
# metre: 4/4
# tonic: Db
0.000000000 silence
0.348299319 A, intro, | Cb:maj | Db:maj | Db:maj |, (guitar)
3.931269841 B, verse, | Db:maj | Db:maj | Ab:maj | Ab:maj |, (voice
8.662993197 | Ab:maj | Ab:maj | Db:maj | Db:maj |
# tonic: Eb
78.145873015 D, modulation, | Eb:maj | Eb:maj |, (guitar)
80.474625850 B, verse, | Eb:maj | Eb:maj | Bb:maj | Bb:maj |, (voice
85.104784580 | Bb:maj | Bb:maj | Eb:maj | Eb:maj |
and I need to convert them to something like this:
# title: I Got Stripes
# artist: Johnny Cash
# metre: 4/4
# tonic: Db
| Cb:maj | Db:maj | Db:maj |
| Db:maj | Db:maj | Ab:maj | Ab:maj |
| Ab:maj | Ab:maj | Db:maj | Db:maj |
# tonic: Eb
| Eb:maj | Eb:maj |
| Eb:maj | Eb:maj | Bb:maj | Bb:maj |
| Bb:maj | Bb:maj | Eb:maj | Eb:maj |
Specifically, that means:
Every line that starts with # needs to stay exactly the same
Every blank line (such as line 5 in my mock example) needs to stay there
For all other lines, every character that isn't enclosed by pipes ( | ) needs to be removed
I have +/- 700 files, in different subdirectories.
I was thinking of writing a sed script, but can't quite figure out how to do it.
Using sed:
sed '/^ *#/b;s/^[^|]*//;s/[^|]*$//' filename
How it works:
If the line begins with a # (with optional spaces before the #), branch to the next cycle (i.e. don't do anything)
Remove anything from the beginning of the line to |
Remove anything from the end of the line before |
If you are using BSD sed, split it up:
sed -e '/^ *#/b' -e 's/^[^|]*//;s/[^|]*$//;' filename
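Since the files live in different subdirectories, the same three rules can also be applied while walking the tree. A minimal Python sketch, assuming the files are plain text ending in .txt and that overwriting them in place is acceptable (work on a copy first); the names strip_outside_pipes and convert_tree are made up for illustration.

```python
import re
from pathlib import Path


def strip_outside_pipes(line):
    """Apply the sed rules above to one line (without its newline)."""
    # Lines starting with '#' (optionally preceded by spaces) stay as they are.
    if re.match(r" *#", line):
        return line
    # Remove everything before the first pipe ...
    line = re.sub(r"^[^|]*", "", line)
    # ... and everything after the last pipe.
    return re.sub(r"[^|]*$", "", line)


def convert_tree(root, pattern="*.txt"):
    """Rewrite every matching file under `root` in place."""
    for path in Path(root).rglob(pattern):
        lines = path.read_text().splitlines()
        converted = [strip_outside_pipes(line) for line in lines]
        path.write_text("\n".join(converted) + "\n")


if __name__ == "__main__":
    sample = "0.348299319 A, intro, | Cb:maj | Db:maj | Db:maj |, (guitar)"
    print(strip_outside_pipes(sample))
```

As with the sed version, a line that contains no pipe at all (such as the "silence" line) becomes an empty line rather than being deleted.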