XQuery - sum of values of element value in unbounded structure


I am trying to sum the values of a specific field in the structure below, but it is not working: I get an error saying that zero or one value was expected but two or more were supplied.
<v4:CalculateResponse xmlns:v4="http://services.xx.net/mm/va">
<v4:CalculateResponseSizeType>
<v4:CalculateCCs>
<v4:Container>
<v4:GrossBookedWeight>31.6</v4:GrossBookedWeight>
<v4:NetPredictedWeight>50</v4:NetPredictedWeight>
<v4:GrossPredictedWeight>53.6</v4:GrossPredictedWeight>
<v4:TypeOfWeightUsed>P</v4:TypeOfWeightUsed>
</v4:Container>
<v4:Container>
<v4:GrossBookedWeight>31.6</v4:GrossBookedWeight>
<v4:NetPredictedWeight>50</v4:NetPredictedWeight>
<v4:GrossPredictedWeight>53.6</v4:GrossPredictedWeight>
<v4:TypeOfWeightUsed>B</v4:TypeOfWeightUsed>
</v4:Container>
<v4:Container>
<v4:GrossBookedWeight>31.6</v4:GrossBookedWeight>
<v4:NetPredictedWeight>50</v4:NetPredictedWeight>
<v4:GrossPredictedWeight>53.6</v4:GrossPredictedWeight>
<v4:TypeOfWeightUsed>B</v4:TypeOfWeightUsed>
</v4:Container>
<v4:Container>
<v4:GrossBookedWeight>31.6</v4:GrossBookedWeight>
<v4:NetPredictedWeight>50</v4:NetPredictedWeight>
<v4:GrossPredictedWeight>53.6</v4:GrossPredictedWeight>
<v4:TypeOfWeightUsed>P</v4:TypeOfWeightUsed>
</v4:Container>
</v4:CalculateCCs>
</v4:CalculateResponseSizeType>
<v4:Status>P</v4:Status>
<v4:StatusCode>1000</v4:StatusCode>
</v4:CalculateResponse>
I have tried summing these values using the function below, but it looks like it is only expecting one value.
<Weight>
{
sum(
data($calculateResponse1/*:CalculateResponseSizeType/*:CalculateCCs/*:Container[data(*:TypeOfWeightUsed) = "B"]/*:GrossBookedWeight),
data($calculateResponse1/*:CalculateResponseSizeType/*:CalculateCCs/*:Container[data(*:TypeOfWeightUsed) = "P"]/*:GrossPredictedWeight)
)
}
</Weight>
The calculation is simple: if TypeOfWeightUsed = P, I want to use the GrossPredictedWeight element value, and if TypeOfWeightUsed = B, I want to use GrossBookedWeight.
There can be multiple containers in a structure.
Please suggest what is wrong with the above syntax.

The calculation is simple: if TypeOfWeightUsed = P, I want to use the GrossPredictedWeight element value, and if TypeOfWeightUsed = B, I want to use GrossBookedWeight.
You can use a FLWOR expression with an if-then-else construct to collect all the numbers needed for sum():
<Weight>
{
sum(
for $c in $calculateResponse1/*:CalculateResponseSizeType/*:CalculateCCs/*:Container
return
if($c/*:TypeOfWeightUsed = "B") then $c/*:GrossBookedWeight
else $c/*:GrossPredictedWeight
)
}
</Weight>
Output:
<Weight>170.4</Weight>

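For readers who want to check the arithmetic outside an XQuery processor, here is a minimal Python sketch of the same per-container selection, using only the standard library. The helper names `build_sample` and `total_weight` are hypothetical, and the sample document is rebuilt from the values in the question (NetPredictedWeight is omitted because it does not take part in the sum).

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the sample document in the question.
NS = {"v4": "http://services.xx.net/mm/va"}


def build_sample(types):
    """Rebuild the question's sample: one Container per weight type."""
    containers = "".join(
        "<v4:Container>"
        "<v4:GrossBookedWeight>31.6</v4:GrossBookedWeight>"
        "<v4:GrossPredictedWeight>53.6</v4:GrossPredictedWeight>"
        f"<v4:TypeOfWeightUsed>{t}</v4:TypeOfWeightUsed>"
        "</v4:Container>"
        for t in types
    )
    return (
        '<v4:CalculateResponse xmlns:v4="http://services.xx.net/mm/va">'
        "<v4:CalculateResponseSizeType><v4:CalculateCCs>"
        f"{containers}"
        "</v4:CalculateCCs></v4:CalculateResponseSizeType>"
        "</v4:CalculateResponse>"
    )


def total_weight(xml_text):
    """Sum GrossBookedWeight for type B, GrossPredictedWeight otherwise."""
    root = ET.fromstring(xml_text)
    path = "v4:CalculateResponseSizeType/v4:CalculateCCs/v4:Container"
    total = 0.0
    for container in root.findall(path, NS):
        kind = container.findtext("v4:TypeOfWeightUsed", namespaces=NS)
        field = "v4:GrossBookedWeight" if kind == "B" else "v4:GrossPredictedWeight"
        total += float(container.findtext(field, namespaces=NS))
    return total


sample = build_sample(["P", "B", "B", "P"])
print(round(total_weight(sample), 1))
```

With the four containers from the question (P, B, B, P) the terms are 53.6 + 31.6 + 31.6 + 53.6, which matches the 170.4 shown above.
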
When the sum() function has two arguments, the second argument provides a value to be used as the result when the first argument is an empty sequence. (This is a clumsy way of dealing with the fact that without static type checking, the sum() function cannot distinguish an empty sequence of doubles from an empty sequence of durations, and you don't really want an integer-zero result when you are summing durations).
You have called the function with two arguments, but I think you want both sequences to be regarded as inputs to be summed. Just add another pair of parentheses to make it a single argument: replace sum(x, y) by sum((x, y)).
The reason you got an error is that the second argument, if supplied, must be a singleton value, not a sequence.

Related

loop through vector in R to delete objects containing pattern above certain limit

Trying to figure out how to loop through a vector and eliminate components containing a particular pattern above a predetermined limit. For example, in the following vector, I might want to keep just the first two instances of both the "a_a_" and "b_b_" components.
x <- c("a_a_a", "a_a_b", "a_a_c", "a_a_d", "b_b_a", "b_b_b", "b_b_c", "b_b_d")
The resulting vector, after the loop deleting extraneous components, would be like this:
x <- c("a_a_a", "a_a_b", "b_b_a", "b_b_b")
The tricky part is that the code must first detect what is contained in the pattern, then loop through the (extremely long) vector to find all matching patterns, and establish a means of counting instances so that once it hits that given level, it then eliminates all matching components thereafter.
Any help is greatly appreciated.

You can use grep to identify which elements have the patterns and keep only the first two.
patterns <- c("a_a", "b_b")
keep <- NULL
# head() avoids NA indices when a pattern matches fewer than two elements
for (p in patterns) { keep <- c(keep, head(grep(p, x), 2)) }
x <- x[keep]
x
[1] "a_a_a" "a_a_b" "b_b_a" "b_b_b"
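The answer above hard-codes the patterns, while the question also asks for them to be detected. A minimal single-pass sketch in Python follows; it assumes that the "pattern" is simply the first four characters of each element (for example "a_a_"), which is an assumption made here and not something the question states. The function name `keep_first_n` is hypothetical.

```python
from collections import defaultdict


def keep_first_n(items, n=2, prefix_len=4):
    """Keep at most n items per prefix, preserving the original order.

    Assumption: the grouping pattern is the first prefix_len characters.
    """
    counts = defaultdict(int)  # how many items seen so far for each prefix
    kept = []
    for item in items:
        key = item[:prefix_len]
        if counts[key] < n:
            kept.append(item)
        counts[key] += 1
    return kept


x = ["a_a_a", "a_a_b", "a_a_c", "a_a_d", "b_b_a", "b_b_b", "b_b_c", "b_b_d"]
print(keep_first_n(x))  # ['a_a_a', 'a_a_b', 'b_b_a', 'b_b_b']
```

Because it counts as it goes, the vector is traversed once regardless of how many distinct prefixes it contains, which matters for a very long input.
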

R add to a list in a loop, using conditions

I have a data.frame dim = (200,500)
I want to run shapiro.test on each column of my data frame and append the result to a list. This is what I'm trying:
colstoremove <- list();
for (i in range(dim(I.df.nocov)[2])) {
x <- shapiro.test(I.df.nocov[1:200,i])
colstoremove[[i]] <- x[2]
}
However this is failing. Some pointers? (background is mainly python, not much of an R user)

Consider lapply() as any data frame passed into it runs operations on columns and the returned list will be equal to number of columns:
colstoremove <- lapply(I.df.nocov, function(col) shapiro.test(col)[2])

Here is what happens in
for (i in range(dim(I.df.nocov)[2]))
For the sake of example, I assume that I.df.nocov contains 100 rows and 5 columns.
dim(I.df.nocov) is the vector of I.df.nocov dimensions, i.e. c(100, 5)
dim(I.df.nocov)[2] is the 2nd dimension of I.df.nocov, i.e. 5
range(x) is a 2-element vector which contains the minimal and maximal values of x. For example, range(c(4,10,1)) is c(1,10). So range(dim(I.df.nocov)[2]) is c(5,5).
Therefore, the loop iterates twice: the first time with i=5, and the second time also with i=5. Not surprising that it fails!
The problem is that R's function range and Python's function with the same name do completely different things. The equivalent of Python's range is called seq. For example, seq(5)=c(1,2,3,4,5), while seq(3,5)=c(3,4,5), and seq(1,10,2)=c(1,3,5,7,9). You may also write 1:n, which is the same as seq(n), and m:n is the same as seq(m,n) (but the priority of ':' is very high, so 1:2*x is interpreted as (1:2)*x).
Generally, if something does not work in R, you should print the subexpressions from the innermost to the outermost. If some subexpression is too big to be printed, use str(x) (str means "structure"). And never assume that functions in Python and R are the same! If there is a function with the same name, it often does a different thing.
On a side note, instead of dim(I.df.nocov)[2] you could just write ncol(I.df.nocov) (there is also a function nrow).
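Since the asker comes from Python, the difference can be sketched from that side. The helpers `r_range` and `r_seq` below are hypothetical names that only mimic the R behaviour described above; they are not part of any library.

```python
def r_range(values):
    """Mimic R's range(): the (min, max) pair of the values."""
    return (min(values), max(values))


def r_seq(n):
    """Mimic R's seq(n), i.e. 1:n: the integers 1..n inclusive."""
    return list(range(1, n + 1))


ncol = 5  # stands in for dim(I.df.nocov)[2] in the example above

print(r_range([ncol]))  # (5, 5): what the failing loop iterated over
print(r_seq(ncol))      # [1, 2, 3, 4, 5]: what the loop was meant to use
```

Note that R's sequences start at 1 and include the upper bound, whereas Python's range(n) starts at 0 and excludes n.
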

Excel vlookup in Julia

I have two arrays in Julia, X = Array{Float64,2} and Y = Array{Float64,2}. I'd like to perform a vlookup as per Excel functionality. I can't seem to find something like this.

The following code returns the first match from a detail matrix, using the related record from a master matrix.
function vlook(master, detail, val)
val = master[findfirst(x->x==val,master[:,2]),1]
return detail[findfirst(x->x==val,detail[:,1]),2]
end
julia> vlook(a,b,103)
1005
A more general approach is to use DataFrames.jl for working with tabular data.

VLOOKUP is a popular function amongst Excel users, and has signature:
VLOOKUP(lookup_value,table_array,col_index_num,range_lookup)
I've never much liked that last argument range_lookup. First it's not clear to me what "range_lookup" is intended to mean and second it's an optional argument defaulting to the much-less-likely-to-be-what-you-want value of TRUE for approximate matching, rather than FALSE for exact matching.
So in my attempt to write VLOOKUP equivalents in Julia I've dropped the range_lookup argument and added another argument keycol_index_num to allow for searching of other than the first column of table_array.
WARNING
I'm very new to Julia, so there may be some howlers in the code below. But it seems to work for me on Julia 0.6.4. Also, as already commented, using DataFrames might be a better solution for looking up values in an array-like structure.
#=
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Procedures: vlookup and vlookup_withfailsafe
Purpose : Inspired by Excel VLOOKUP. Searches a column of table_array for
lookup_values and returns the corresponding elements from another column of
table_array.
Arguments:
lookup_values: a value or array of values to be searched for inside
column keycol_index_num of table_array.
table_array: An array with two dimensions.
failsafe: a single value. The return contains this value whenever an element
of lookup_values is not found.
col_index_num: the number of the column of table_array from which values
are returned.
keycol_index_num: the number of the column of table_array to be searched.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=#
vlookup = function(lookup_values, table_array::AbstractArray, col_index_num::Int = 2, keycol_index_num::Int = 1)
if ndims(table_array)!=2
error("table_array must have 2 dimensions")
end
if isa(lookup_values,AbstractArray)
indexes = indexin(lookup_values,table_array[:,keycol_index_num])
if any(indexes .== 0) # elementwise comparison; indexes==0 compares the whole array to 0
error("at least one element of lookup_values not found in column $keycol_index_num of table_array")
end
return(table_array[indexes,col_index_num])
else
index = indexin([lookup_values],table_array[:,keycol_index_num])[1]
if(index==0)
error("lookup_values not found in column $keycol_index_num of table_array")
end
return(table_array[index,col_index_num])
end
end
vlookup_withfailsafe = function(lookup_values, table_array::AbstractArray, failsafe, col_index_num::Int = 2, keycol_index_num::Int = 1)
if ndims(table_array)!=2
error("table_array must have 2 dimensions")
end
if !isa(failsafe,eltype(table_array))
error("failsafe must be of the same type as the elements of table_array")
end
if isa(lookup_values,AbstractArray)
indexes = indexin(lookup_values,table_array[:,keycol_index_num])
Result = Array{eltype(table_array)}(size(lookup_values))
for i in 1:length(lookup_values)
if(indexes[i]==0)
Result[i] = failsafe
else
Result[i] = table_array[indexes[i],col_index_num]
end
end
return(Result)
else
index = indexin([lookup_values],table_array[:,keycol_index_num])[1]
if index == 0
return(failsafe)
else
return(table_array[index,col_index_num])
end
end
end
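The lookup logic above (first match in a key column, with an optional fallback) can also be sketched with a dictionary. The following Python version is only an illustration of the idea, not a translation of the Julia code: the name `vlookup`, its parameters, and the use of 0-based column indices are all assumptions made for this sketch.

```python
def vlookup(lookup_values, table, col_index=1, keycol_index=0, failsafe=None):
    """Return table[row][col_index] for the first row whose key column matches.

    table is a list of rows; column indices are 0-based here (Python style).
    A key that is not found yields failsafe.
    """
    index = {}
    for row in table:
        # setdefault keeps the first occurrence of each key
        index.setdefault(row[keycol_index], row[col_index])

    single = not isinstance(lookup_values, (list, tuple))
    values = [lookup_values] if single else lookup_values
    result = [index.get(v, failsafe) for v in values]
    return result[0] if single else result


table = [[101, "a"], [102, "b"], [101, "c"]]
print(vlookup(101, table))                       # a
print(vlookup([102, 999], table, failsafe="?"))  # ['b', '?']
```

Building the dictionary once makes each lookup a constant-time operation, which helps when many values are looked up against the same table.
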

how to not use if else to assign value to vectors iteratively in R

Say I have a vector defined as indexUsed = rep(NA, 10);
I want to give its ith element a value for each iteration.
for(i in 1:10){
indexUsed[i] = largestGradient(X, y, indexUsed[is.na(indexUsed)], score)
}
As you see, I want to use indexUsed[1:(i-1)] to calculate the ith element, but for the first element I want a NULL or some other special value there to let my function know that it is empty (it then handles this case, since assigning the first element differs from the later steps).
I do not know whether my approach is a good way to do that. How would you usually do it?

I don't have a better way of doing this than with a for loop, but would love to see other people's responses. However, it does seem to me that your code should read
indexUsed[i] <- largestGradient(X, y, indexUsed[!is.na(indexUsed)], score)
For i=1, your indexUsed[!is.na(indexUsed)] will be empty, and that should be the base case in your function. For every other iteration, it will retrieve elements 1 through i-1.

Custom function does not work in R 'ddply' function

I am trying to use a custom function inside 'ddply' in order to create a new variable (NormViability) in my data frame, based on values of a pre-existing variable (CelltiterGLO).
The function is meant to create a rescaled (%) value of 'CelltiterGLO' based on the mean 'CelltiterGLO' values at a specific sub-level of the variable 'Concentration_nM' (0.01).
So if the mean of 'CelltiterGLO' at 'Concentration_nM'==0.01 is set as 100, I want to rescale all other values of 'CelltiterGLO' over the levels of other variables ('CTSC', 'Time_h' and 'ExpType').
The normalization function is the following:
normalize.fun = function(CelltiterGLO) {
idx = Concentration_nM==0.01
jnk = mean(CelltiterGLO[idx], na.rm = T)
out = 100*(CelltiterGLO/jnk)
return(out)
}
and this is the code I try to apply to my dataframe:
library("plyr")
df.bis=ddply(df,
.(CTSC, Time_h, ExpType),
transform,
NormViability = normalize.fun(CelltiterGLO))
The code runs, but when I double check (with aggregate or tapply) whether the mean of 'NormViability' equals 100 at 'Concentration_nM'==0.01, I do not get 100, but different numbers. However, if I subset my df by the two levels of the variable 'ExpType', the code returns the correct numbers on each separate subset. I tried making 'ExpType' either character or factor but got similar results. 'ExpType' has two levels/values, "Combinations" and "DoseResponse". I can't figure out why the code is not working on the entire df. I wonder if this is because the two levels of 'ExpType' do not contain the same number of levels for all the other variables, e.g. one of the levels of 'Time_h' is missing for the level "Combinations" of 'ExpType'.
Thanks very much for your help, and I apologize in advance if the answer is already on Stack Overflow and I was not able to find it.
Michele

I (the OP) found out that the function was missing one variable in the arguments, that was used in the statements. Simply adding the variable Concentration_nM to the custom function solved the problem.
THANKS
m.
