Report test summary of Julia package from Travis to my README - julia

I have a Julia project on GitHub which uses Travis for CI tests. At the end of the Pkg.test(coverage=true) command run from .travis.yml, the job log contains a test summary for all my tests, looking like this:
Test Summary:            | Pass  Error  Total
All Tests                |   11      6     19
  Simple Tests           |    3             3
  Other Tests            |    8      6     14
    Some specific test   |    2      2      4
      1.15               |    2      2      4
      5.0                |    2             2
    More tests           |    6             6
Is there a way to report the number of passing/failing tests in my README.md? Maybe something similar to the badges included with [![Build Status](https://travis-ci.com/User/ProjectName.svg?branch=master)](https://travis-ci.com/User/ProjectName)
The basic idea is that my tests will probably keep erroring for the near future, but the number of errors should decrease, and I want that to be visible in the test reports.

Related

Interpreting 'arulesSequences' Results in R

I used 'arulesSequences' in R to analyze users' sequences of actions.
And I got results like this:
#   sequence    support
1   <{A,B}>     0.87
2   <{A},{B}>   0.68
They seem to be the same sequence. What's the difference?
Does this mean that the session has changed?
Sequence 1 means that A and B happened together, at the same time. Sequence 2 means that A happened first, and then B happened.
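To see the distinction in code, here is a minimal sketch (a toy event log built by hand; the support threshold is chosen only for illustration):
library(arulesSequences)
# Toy data: user 1 does A and B in the same event; user 2 does A first, then B later
trans <- as(list(c("A", "B"), "A", "B"), "transactions")
transactionInfo(trans)$sequenceID <- as.integer(c(1, 2, 2))
transactionInfo(trans)$eventID <- as.integer(c(1, 1, 2))
seqs <- cspade(trans, parameter = list(support = 0.5))
as(seqs, "data.frame")
# <{A,B}>   is supported only by user 1 (A and B in the same event)
# <{A},{B}> is supported only by user 2 (A in one event, B in a later one)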

CART Methodology for data with mutually exhaustive rows

I am trying to use CART to analyse a data set in which each row is a segment, for example:
Segment_ID | Attribute_1 | Attribute_2 | Attribute_3 | Attribute_4 | Target
1          | 2           | 3           | 100         | 3           | 0.1
2          | 0           | 6           | 150         | 5           | 0.3
3          | 0           | 3           | 200         | 6           | 0.56
4          | 1           | 4           | 103         | 4           | 0.23
Each segment has a certain population from the base data (irrelevant to my final use).
I want to condense the segments based on the 4 attributes and on the target variable; in the above case, for example, the 4 segments would become 2 big segments. I am currently dealing with 15k segments and want only 10 final segments, each defined by the target and with a sensible attribute distribution.
Now, pardon me if I am wrong, but CHAID in SPSS (if not using autogrow) will generally split the data in a 70:30 ratio, building the tree on 70% of the data and testing it on the remaining 30%. I can't use this approach since I need all of my segments to be included; I essentially want to club these segments into a few big segments, as explained before. My question is whether I can use CART (rpart in R) for the same. There is an explicit option 'subset' in the rpart function in R, but I am not sure whether leaving it out ensures that CART uses 100% of my data. I am relatively new to R, hence a very basic question.
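For what it's worth, a minimal sketch of what that would look like (assuming a hypothetical data frame segments with the columns shown above): by default rpart fits on every row you pass in; there is no automatic 70:30 split, and the internal cross-validation (xval) is only used to guide pruning, not to withhold rows from the tree.
library(rpart)
# Fit on all rows; no hold-out split happens unless you create one yourself
fit <- rpart(Target ~ Attribute_1 + Attribute_2 + Attribute_3 + Attribute_4,
             data = segments, method = "anova",
             control = rpart.control(cp = 0.001))
# Each leaf of the tree is one condensed segment;
# fit$where gives the leaf index for every row used in the fit
segments$big_segment <- fit$where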

Stargazer: omit and omit.yes.no won't work

I tried to use the omit and omit.yes.no options in stargazer() to omit a dummy variable. It seems that there is a bug involving these options.
This is what I expect to get from the stargazer output:
                        logit_1    logit_2    logit_3
covariate 1             21***      20***      21.4***
                        (0.2)      (0.12)     (0.10)
covariate 2             0.5        0.3***     0.31***
                        (0.4)      (0.13)     (0.15)
factor(covariate 3)A               0.123***   0.3***
                                   (0.06)     (0.08)
factor(covariate 3)B               1.5**      1.03***
                                   (0.78)     (0.073)
OM                      No         No         Yes
My stargazer command is the following:
stargazer(logit_1, logit_2, logit_3, omit = "OM", omit.labels = "OM", omit.yes.no = c("Yes", "No"))
When I run the previous command, the OM row of the results reads No Yes Yes.
When I run
stargazer(logit_1, logit_2, omit = "OM", omit.labels = "OM", omit.yes.no = c("Yes", "No"))
I get No No.
And when I run
stargazer(logit_2, logit_3, omit = "OM", omit.labels = "OM", omit.yes.no = c("Yes", "No"))
I get Yes Yes.
Yes, this is a known bug that will be fixed in the next release. For now, you can apply the following fix:
On line 3956 of stargazer-internal.R, please replace:
if (!is.na(.global.coefficients[k,i])) {
with the following:
if (!is.na(.global.coefficients[.global.coefficient.variables[k],i])) {
Then install again from source. You can also e-mail the package's author for a working version of stargazer that corrects this issue.

Friedman test unreplicated complete block design error

I'm having trouble running a Friedman test over my data.
I'm trying to run a Friedman test using this command:
friedman.test(mean ~ isi | expId, data=monoSum)
on the following data set (https://www.dropbox.com/s/2ox0y1b4gwld0ai/monoSum.csv):
> monoSum
expId isi N mean
1 m80B1 1 10 100.000000
2 m80B1 2 10 73.999819
3 m80B1 3 10 45.219362
4 m80B1 4 10 116.566174
. . . . .
18 m80L2 2 10 82.945491
19 m80L2 3 10 57.675480
20 m80L2 4 10 207.169277
. . . . . .
25 m80M2 1 10 100.000000
26 m80M2 2 10 49.752687
27 m80M2 3 10 19.042592
28 m80M2 4 10 150.411035
It gives me back the error:
Error in friedman.test.default(c(100, 73.9998193095267, 45.2193621626293, :
not an unreplicated complete block design
I figure it gives the error because when monoSum$isi==1 the value of mean is always 100. Is this correct?
However, monoSum$isi==1 is always 100 because it is the control group to which all the other monoSum$isi groups are normalized. I cannot assume a normal distribution, so I cannot run a rmANOVA…
Is there a way to run a Friedman test on this data, or am I missing a very essential point here?
Many thanks in advance!
I don't get an error if I run your dataset:
Friedman rank sum test
data: mean and isi and expId
Friedman chi-squared = 17.9143, df = 3, p-value = 0.0004581
However, you have to make sure that expId and isi are coded as factors. Run these commands:
monoSum$expId <- factor(monoSum$expId)
monoSum$isi <- factor(monoSum$isi)
Then run the test again. This has worked for me with a similar problem.
I know this is pretty old, but for future generations (see also: me when I forget and google this again):
You can determine what the missing values are in your dataframe by running table(groups, blocks), or in the case of this question table(monoSum$isi, monoSum$expId). This will return a table of 0s and 1s. The missing records are in the cells with 0s.
I ran into this problem after trying to remove the blocks that had incomplete results; taking a subset of the data did not remove the blocks for some reason.
Just thought I would mention I found this post because I was getting a similar error message, and the above suggestions did not solve it. Strangely, I had to sort my dataframe so that, block by block, the groups appeared in order (i.e. I could not have the following:
Block 1 A
Block 1 B
Block 2 B
Block 2 A
It has to appear as A, B, A, B)
I ran into the same cryptic error message in R, though in my case it was resolved when I applied the as.matrix() function to what was originally a dataframe from the CSV file I had imported using the read.csv() function.
I also had a missing data point in my original data set, and I found that when my data was transformed into a matrix for the friedman.test() call, the entire row containing the missing data point was omitted automatically.
Using the function as.matrix() to transform my dataframe is the magic that got the function to run for me.
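For reference, a sketch of that workaround, assuming the data is in wide format (the file and column names here are hypothetical): friedman.test() also accepts a numeric matrix, with rows treated as blocks and columns as groups.
# Hypothetical wide data: one row per block, one numeric column per condition
wide <- read.csv("scores.csv")           # columns: cond_A, cond_B, cond_C
friedman.test(as.matrix(na.omit(wide)))  # rows = blocks, columns = groups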
I had this exact error too with my dataset.
It turns out that the function friedman.test() accepts data frames (e.g. those created by data.frame()) but not tibbles (those created by dplyr and other tidyverse tools). The solution for me was to convert my dataset to a data frame first.
library(dplyr)  # for the pipe and select()
D_fri <- D_all %>% dplyr::select(FrustrationEpisode, Condition, Participant)
D_fri <- as.data.frame(D_fri)
str(D_fri) # confirm the object is now a 'data.frame'
friedman.test(FrustrationEpisode ~ Condition | Participant, D_fri)
I ran into this problem too. Fixed mine by removing the NAs.
# My data (called layers) looks like:
| resp.no | av.l.all | av.baseem | av.base |
| 1 | 1.5 | 1.3 | 2.3 |
| 2 | 1.4 | 3.2 | 1.4 |
| 3 | 2.5 | 2.8 | 2.9 |
...
| 1088 | 3.6 | 1.1 | 3.3 |
# Remove NAs
library(tidyr)    # for gather() and the pipe
library(rstatix)  # for convert_as_factor()
layers1 <- na.omit(layers)
# Re-organise the data so the scores are stacked, with a column added holding the original column name as a factor
layers2 <- layers1 %>%
  gather(key = "layertype", value = "score", av.l.all, av.baseem, av.base) %>%
  convert_as_factor(resp.no, layertype)
# Data now looks like this
| resp.no | layertype | score |
| 1 | av.l.all | 1.5 |
| 1 | av.baseem | 1.3 |
| 1 | av.base | 2.3 |
| 2 | av.l.all | 1.4 |
...
| 1088 | av.base | 3.3 |
# Then do Friedman test
friedman.test(score ~ layertype | resp.no, data = layers2)
Just want to share what my problem was: my ID factor did not have the correct levels after doing pivot_longer(). Because of this, the same error was given. I re-created the factor with the correct levels and it worked:
df$ID <- as.factor(as.character(df$ID))
Reviving an old thread with new information. I ran into a similar problem after removing NAs. My group and block were factors before the NA removal. However, after removing NAs, the factors retained the levels from before the removal even though some levels were no longer in the data!
Running the friedman.test() with the as.matrix() trick (e.g., friedman.test(a ~ b | c, as.matrix(df))) was fine but running frdAllPairsExactTest() or friedman_effsize() would throw the not an unreplicated complete block design error. I ended up re-factoring the group and block (i.e., dropping the levels that were no longer in the data, df$block <- factor(df$block)) to make things work. After the re-factor, I did not need the as.matrix() trick, either.
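A one-line sketch of that re-factoring, using droplevels() to drop the stale levels from every factor in the data frame at once (names as in the a ~ b | c example above):
df <- droplevels(df)  # remove factor levels that vanished with the NA removal
friedman.test(a ~ b | c, data = df)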

%Rpush >> lists of complex objects (e.g. pandas DataFrames in IPython Notebook)

Once again, I am having a great time with Notebook and the emerging rmagic infrastructure, but I have another question about the bridge between the two. Currently I am attempting to pass several subsets of a pandas DataFrame to R for visualization with ggplot2. Just to be clear upfront, I know that I could pass the entire DataFrame and perform additional subsetting in R. My preference, however, is to leverage the data management capability of Python and the subset-wise operations I am performing are just easier and faster using pandas than the equivalent operations in R. So for the sake of efficiency and morbid curiosity...
I have been trying to figure out if there is a way to push several objects at once. The wrinkle is that sometimes I don't know in advance how many items will need to be pushed. To retain flexibility, I have been populating dictionaries with DataFrames throughout the front end of the script. The following code provides a reasonable facsimile of what I am working through (I have not converted via com.convert_to_r_dataframe for simplicity, but my real code does take this step):
import numpy as np
import pandas as pd
from pandas import DataFrame
%load_ext rmagic

d1 = DataFrame(np.arange(16).reshape(4,4))
d2 = DataFrame(np.arange(20).reshape(5,4))
d_list = [d1, d2]
names = ['n1', 'n2']
d_dict = dict(zip(names, d_list))

# bind each DataFrame to a top-level name taken from its dict key
for name in d_dict.keys():
    exec '%s=d_dict[name]' % name

%Rpush n1
As can be seen, I can assign a static name and push the DataFrame into the R namespace individually (as well as in a 'list' >> %Rpush n1 n2). What I cannot do is something like the following:
for name in d_dict.keys():
    %Rpush d_dict[name]
That snippet raises an exception >> KeyError: u'd_dict[name]'. I also tried to deposit the dynamically named DataFrames in a list, but the list references end up pointing to the data rather than to the object reference:
df_list = []
for name in d_dict.keys():
    exec '%s=d_dict[name]' % name
    exec 'df_list.append(%s)' % name
print df_list

for df in df_list:
    %Rpush df
[ 0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
3 12 13 14 15,
0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
3 12 13 14 15
4 16 17 18 19]
%Rpush did not throw an exception when I looped through the list's contents, but the DataFrames could not be found in the R namespace. I have not been able to find much discussion of this topic beyond talk about the conversion of lists to R vectors. Any help would be greatly appreciated!
Rmagic's push uses the name that you give it both to look up the Python variable, and to name the R variable it creates. So it needs a valid name, not just any expression, on both sides.
There's a trick you can do to get the name from a Python variable:
d1=DataFrame(np.arange(16).reshape(4,4))
name = 'd1'
%Rpush {name}
# equivalent to %Rpush d1
But if you want to do more advanced things, it's best to get hold of the r object and use that to put your objects in. Rmagic is just a convenience wrapper over rpy2, which is a full API. So you can do:
from rpy2.robjects import r
r.assign('a', 1)
You can mix and match which interface you use - rmagic and rpy2 are talking to the same instance of R.
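So for the dictionary in the question, one possibility (a sketch assuming the same pandas.rpy conversion helper the question already mentions) is to loop over the dictionary and assign each DataFrame under its key:
import pandas.rpy.common as com  # the conversion step mentioned in the question
from rpy2.robjects import r

for name, df in d_dict.items():
    r.assign(name, com.convert_to_r_dataframe(df))
# each DataFrame is now visible in R under its dictionary key (n1, n2, ...)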
