walk through all components in model - openmdao

I have a fairly large model with components grouped hierarchically about 3 levels deep. It would be useful for me to be able recursively iterate through my components and list inputs and outputs, as well as all the option values, and format all that data to my liking so I can make a nice report with it.
calling list_inputs() and list_outputs() on a given group sort of does what I want, in that it prints off the inputs and outputs, but if you call it on a large group you can't get the inputs and outputs of single component next to each other on the page.
I could probably reverse engineer how list_inputs() is working itself but was wondering if there is an easy way to do it.

As you noted, list_inputs and list_outputs are both methods defined on the System class. Thought these methods do group their print-outs by component, the challenge is that you get all the inputs first, then all the outputs. You can't easily see the inputs and outputs for a single component together.
Both of these methods can have their printing shut off by setting out_stream=None, and each of them returns a list of variable data that you can manually parse through. That may not give you the format you want though.
If you want to manually recurse over the hierarchy and write your own custom report method, then you should look at the following methods on System (i.e. components and groups):
get_io_metadata
system_iter
Those, combined with the data returned from list_inputs and list_outputs should give you what you need.

Related

Extracting Nested Elements of an R List Generated by Loops

For lists within lists produced by a loop in R (in this example a list of caret models) I get an object with an unpredictable length and names for inner elements, such as list[[1]][[n repeats of 1]][[2]] where the internal [[1]] is repeated multiple times according to the function's input. In some cases, the length of n is not known, when accessing some older stored lists where input was not saved. While there are ways to work within a list index, like with list[length(list)], there appears to be no way to do this with repeated nested elements. This has made accessing them and passing them to various jobs awkward. I assume there is an efficient way to access them that I have missed, so I'm asking for help to do so, with an example case given below.
The function I'm generating gives out a list from a function that creates several outputs. The final list returned for a function having a complicated output structure is produced by returning something like:
return(list(listOfModels, trainingData, testingData))
The listofModels has variable length, depending on input of models given, and potentially other conditions depend on evaluation inside the function. It is made by:
listOfModels <- list(c(listOfModels, list(trainedModel)))
Where the "trainedModel" refers to the most recently trained model generated in the loop. The models used and the number of them may vary each time depending on choice. An unfortunate result is a complicated nested lists within a list.
That is, output[[1]] contains the models I want to access more efficiently, which are themselves list objects, while output[[2]] and output[[3]] are the dataframes used to train and evaluate the models. While accessing the dataframes is simple and has a defined, reproducible structure each time (simply being output[[2]], output[[3]] every time), output[[1]] becomes a mess. E.g., something like the following follows the "output[[1]]":
The only thing I am able to attempt in order to access this is using the fact that [[1]] is attached upon output[[1]] before [[2]]. All of the nested elements except one have a [[2]] at the end. Given the above pattern, there is an ugly solution that works, but is not a desirable format to work with. E.g., after evaluating n models given by a vector of strings called inputList, and a list given as output of the function, "output", I can have [[1]] repeated tens to hundreds of times.
for (i in (1:length(inputList)-1)){
eval(rlang::parse_expr(paste0(c("output", c(rep("[[1]]", 1+i)), "[[2]]" ) , collapse="")) )
}
This could be used to use all models for some downstream task like making predictions on new data, or whatever. In cases where the length of the inputList was not known, this could be found out by attempting to repeat this until finding an error, or something similar. This approach can be modified to call on a specific part of the list, for example, a certain model within inputList, if I know the original list input and can find the number for that model. Besides the bulkiness code working this way, compared to some way where I could just call on output[[1]][[n]] using some predictable format for various length n. One of the big problems is when accessing older runs that have been saved where the input list of models was not saved, leaving the length of n unknown. I don't know of any way of using something like length() or lengths() to count how many nested elements exist within a list. (For my example, output[[1]] is of length 1, no matter how many [[1]] repeat elements there are.)
I believe the simplest solution is to change the way the list is saved by the function, so that I can access it by a systematic reference, however, I have a bunch of old lists which I still want to access and perform some work with, and I'd also like to be able to have better control of working with lists in any case. So any help would be greatly appreciated.
I expected there would be some way to query the structure of nested R lists, which could be used to pass nested elements to separate functions, without having to use very long repetition of brackets.

Properly define the members and invariants of a class in R

We have an R package for a certain purpose. The basic data structure is a correlation function which is a real/complex valued function for a smallish (100) number (T) of time slices. We have multiple measurements (N) of it, so at its core it is a N×T matrix. But then there are more things that it can become:
One can bootstrap it with R samples such that it becomes an R×T matrix. However we want to keep the original data, so there is a field for the R×T matrix and another for the N×T matrix.
It can be symmetrized which will cut T in half and also alter various other functions that work with those objects.
Also it can be shifted which takes the difference between consecutive elements and therefore drops one time slice. The first column in the matrix then corresponds to t = 1 and not t = 0 any more, which becomes important in fits to the data.
Correlation functions may have an imaginary part, this is stored as a second real matrix. But they might not.
When doing non-linear operations with the data, we do that once with the average of the original data and each bootstrap sample. If the result is another correlation function, that object will not have “original data” but only the average.
So basically we have a class that can have various fields and only the average of the original data is really common.
To make things worse, there is no formal documentation for the possible members and the invariants associated with them. Coming from C++ where a concise class definition allows me do encapsulation, The S3 class system in R seems like an invitation for inconsistencies.
This surfaced a few times when some function taking such a correlation function as argument and expected some field to be present while it was not. The code is riddled with lines that just add another field to the class when performing an operation.
Long story short: Is there some automatically enforcable way in the S3 class system to have an exhaustive list of all the fields that a class can have? Right now I only see the possibility to document (in English) in the constructor function and just hope nobody missed a line where fields were added.

How to delete all "Values" in RStudio Environment?

I know that rm(list=ls()) can delete all objects in the current environment.
However, environment has three categories: Data, Values, Functions. I wonder how I can only delete all the objects in one particular category? Something like
rm(list=ls(type="Values"))
You could use ls.str to specify a mode, or lsf.str for functions. The functions have print methods that make it look otherwise, but underneath are just vectors of object names, so
rm(list = lsf.str())
will remove all user-defined functions, and
rm(list = ls.str(mode = 'numeric'))
will remove all numeric vectors (including matrices). mode doesn't correspond exactly to class, though, so there's no way to distinguish between lists and data.frames with this method.
One option is that you can change the view to grid view and check all the boxes next to the ones you want to delete and click the broom button.
So far as I'm aware, Data, Values and Functions are terms used by the RStudio interface. Data = variables with dimensions e.g. data frames, matrices, Values = other variables (e.g. vectors). They are not terms that can be accessed via R code.

How to get R to use a certain dataset for multiple commands without usin attach() or appending data="" to every command

So I'm trying to manipulate a simple Qualtrics CSV, and I want to use colSums on certain columns of data, given a certain filter.
For example: within the .csv file called data, I want to get the sum of a few columns, and print them with certain labels (say choice1, choice2 etc). That is easy enough by itself:
firstqn<-data.frame(choice1=data$Q7_2,choice2=data$Q7_3,choice3=data$Q7_4);
secondqn<-data.frame(choice1=data$Q8_6,choice2=data$Q8_7,choice3=data$Q8_8)
print colSums(firstqn); print colSums(secondqn)
The problem comes when I want to repeat the above steps with different filters, - say, only the rows where gender==2.
The only way I know how is to create a new dataset data2 and replace data$ with data2$ in every line of the above code, such as:
data2<-(data[data$Q2==2,])
firstqn<-data.frame(choice1=data2$Q7_2,choice2=data2$Q7_3,choice3=data2$Q7_4);
however i have 6 choices for each of 5 questions and am planning to apply about 5-10 different filters, and I don't relish the thought of copy/pasting data2 and `data3' etc hundreds of times.
So my question is: Is there any way of getting R to reference data by default without using data$ in front of every variable name?
I can probably use attach() to achieve this, but i really don't want to:
data2<-(data[data$Q2==2,])
attach(data2)
firstqn<-data.frame(choice1=Q7_2,choice2=Q7_3,choice3=Q7_4);
detach(data2)
is there a command like attach() that would allow me to avoid using data$ in front of every variable, for a specified amount of code? Then whenever I wanted to create a new filter, I could just copy/paste the same code and change the first command (defining a new dataset).
I guess I'm looking for some command like with(data2, *insert multiple commands here*)
Alternatively, if anyone has a better way to do the above in an entirely different way please enlighten me - i'm not very proficient at R (yet).

arcmap network analyst iteration over multiple files using model builder

I have 10+ files that I want to add to ArcMap then do some spatial analysis in an automated fashion. The files are in csv format which are located in one folder and named in order as "TTS11_path_points_1" to "TTS11_path_points_13". The steps are as follows:
Make XY event layer
Export the XY table to a point shapefile using the feature class to feature class tool
Project the shapefiles
Snap the points to another line shapfile
Make a Route layer - network analyst
Add locations to stops using the output of step 4
Solve to get routes between points based on a RouteName field
I tried to attach a snapshot of the model builder to show the steps visually but I don't have enough points to do so.
I have two problems:
How do I iterate this procedure over the number of files that I have?
How to make sure that every time the output has a different name so it doesn't overwrite the one form the previous iteration?
Your help is much appreciated.
Once you're satisfied with the way the model works on a single input CSV, you can batch the operation 10+ times, manually adjusting the input/output files. This easily addresses your second problem, since you're controlling the output name.
You can use an iterator in your ModelBuilder model -- specifically, Iterate Files. The iterator would be the first input to the model, and has two outputs: File (which you link to other tools), and Name. The latter is a variable which you can use in other tools to control their output -- for example, you can set the final output to C:\temp\out%Name% instead of just C:\temp\output. This can be a little trickier, but once it's in place it tends to work well.
For future reference, gis.stackexchange.com is likely to get you a faster response.

Resources