Ant colony optimization algorithm - automated-tests

Does ACO read data to generate test cases, and which file formats does ACO accept?
My research so far suggests that ACO can generate test cases and test data.

Related

How can I load a PDB file with a varying number of atoms?

I'm trying to visualize a droplet simulation.
During the simulation, I deleted water molecules that were far from the system of interest.
Therefore, the number of atoms decreases as the simulation proceeds.
However, VMD could not read such a PDB file because of the way it works.
I found a way to use the multimolanim plugin with separate files, but I have more than 1000 frames...
Is there another way to visualize such a PDB file?

How to run specific segments of R code without running the rest first?

I'm writing an R script for colleagues to use. They load their original data as DF (always with the same name), and then they can run different statistical calculations and analyses on it. My colleagues are not used to working with R, so I want to make it as simple as possible for them. My idea was to write the statistical calculations and analyses in different sections of the code. At the beginning of the R script I would tell them to run lines 10 to 20 to calculate the average wage of the people in DF, run lines 20 to 30 to calculate the correlation between level of education and wages, and so on. They would just have to go to the lines mentioned, select them, and press Run.
In my real code the calculations are more complex than what I wrote here, but this gives you an example of what I want to do.
Then I started to wonder whether I could write it so that running a single command tells R to run the code on lines 10 to 20, and a different command tells R to run the code on lines 20 to 30. I could not figure out how to do it. I looked at functions, but this would require my colleagues to run all of the code first anyway, which would defeat the purpose of the function in this case.
Is there a way to do it, and if so, how?
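One possible sketch, assuming the analyses are wrapped in functions kept in a separate file (the file name analyses.R, the function names, and the columns wage and education are all hypothetical). A single source() call only defines the functions; nothing is computed until a colleague calls the one they need.

```r
# Hypothetical helper file; in practice it would be written once and saved as analyses.R.
helper_file <- file.path(tempdir(), "analyses.R")
writeLines(c(
  "average_wage <- function(df) mean(df$wage, na.rm = TRUE)",
  "wage_education_cor <- function(df) cor(df$education, df$wage, use = 'complete.obs')"
), helper_file)

# What a colleague would run: load the definitions once...
source(helper_file)

# Made-up data standing in for the colleagues' data frame DF.
DF <- data.frame(wage = c(1000, 1500, 2000, 2500),
                 education = c(1, 2, 3, 4))

# ...then call only the analysis they need.
avg <- average_wage(DF)
rho <- wage_education_cor(DF)
print(avg)
print(rho)
```

With this layout the colleagues never have to look at line numbers; each analysis is one named command.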

How to save my trained Random Forest model and apply it to test data files one by one?

This is a long shot and more of a code-design question from a rookie like me, but I think it has real value for real-world applications.
The core questions are:
Can I save a trained ML model, such as Random Forest (RF), in R and call/use it later without the need to reload all the data used for training it?
When, in real life, I have a massive folder with hundreds or thousands of data files to be tested, can I load the model I saved somewhere in R and ask it to read the unknown files one by one (so I am not limited by RAM size), perform regression/classification etc. on each file read in, and store ALL the output together in one file?
For example,
Suppose I have 100,000 CSV files in a folder, and I want to use 30% of them as the training set and the rest as the test set for a Random Forest (RF) classification.
I can select the files of interest and call them "control files". Then I use fread() to read them, randomly sample 50% of the data in those files, and call the caret or randomForest library to train my "model":
model <- train(x, y, method = "rf")  # caret; x = predictors, y = outcome
Now, can I save the model somewhere so I don't have to load all the control files each time I want to use the model?
Then I want to apply this model to all the remaining CSV files in the folder, and I want it to read those CSV files one by one when applying the model, instead of reading them all in, because of RAM limits.
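A minimal sketch of the save-then-score-one-file-at-a-time pattern. saveRDS() and readRDS() work on any R object, so the same calls should apply to a fitted randomForest or caret model (the package that created the model still has to be loaded before predict() is called). To keep the sketch runnable without extra packages, lm() stands in for the Random Forest here, and all file names and data are made up.

```r
# lm() is a stand-in for the Random Forest model; data and file names are invented.
set.seed(1)
work_dir <- file.path(tempdir(), "rf_demo")
dir.create(work_dir, showWarnings = FALSE)

# Train once and save the fitted model object to disk.
train_data <- data.frame(x = 1:20)
train_data$y <- 2 * train_data$x + rnorm(20, sd = 0.1)
model <- lm(y ~ x, data = train_data)
model_path <- file.path(work_dir, "model.rds")
saveRDS(model, model_path)

# Create a few made-up "unknown" CSV files to score.
for (i in 1:3) {
  write.csv(data.frame(x = c(i, i + 10)),
            file.path(work_dir, sprintf("unknown_%d.csv", i)),
            row.names = FALSE)
}

# Later session: reload the model without touching the training data.
model <- readRDS(model_path)
files <- list.files(work_dir, pattern = "^unknown_.*\\.csv$", full.names = TRUE)
out_path <- file.path(work_dir, "predictions.csv")
if (file.exists(out_path)) file.remove(out_path)

# Read one file at a time, predict, and append the result to a single output file.
for (f in files) {
  new_data <- read.csv(f)
  result <- data.frame(file = basename(f),
                       prediction = predict(model, newdata = new_data))
  first_write <- !file.exists(out_path)
  write.table(result, out_path, sep = ",", row.names = FALSE,
              col.names = first_write, append = !first_write)
}

all_predictions <- read.csv(out_path)
print(nrow(all_predictions))
```

Only one input file is held in memory at any time, and the output file grows by appending, so the number of files is not limited by RAM.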

How to implement data mining with unstructured data?

I have unstructured data (screenshots of an app) and semi-structured data (screen dump files), and I chose to store them in HBase. My goal is to find defects or issues in the app (meaningful data). Now I'd like to apply data mining to these. Is that a kind of text mining? And how can I apply data mining techniques to this data?
To begin with, you can use a rule-based approach, where you define a set of rules that detect defect scenarios.
Then you can prepare a training data set that has many instances of defect and non-defect scenarios. In this step, you manually tag each screenshot or screen dump file you collect as defect or non-defect.
Then you can train a classifier using this training data. The classifier would try to generalize from the training samples to predict the output label for samples it has not seen before.
Since your input is non-standard, you might need some preprocessing to convert it to a standard form. For example, to process screenshots you might need image processing, OCR, or computer vision libraries.
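The two stages above can be sketched roughly as follows, assuming the screen dumps have already been reduced to plain text. The example texts, the keyword rules, and the single keyword feature are all invented for illustration; a real setup would need richer features and far more tagged samples.

```r
# Made-up text from screen dump files, manually tagged (1 = defect, 0 = non-defect).
dumps <- data.frame(
  text = c("NullPointerException at startup", "Error: timeout while loading",
           "Fatal error in view", "Exception in renderer", "No error found",
           "Blank screen after login", "Login OK", "Page loaded",
           "Settings saved", "Welcome screen"),
  defect = c(1, 1, 1, 1, 0, 1, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Stage 1: rule-based detection. A dump is flagged if it matches any rule keyword.
pattern <- paste(c("exception", "error", "timeout"), collapse = "|")
dumps$has_keyword <- as.integer(grepl(pattern, dumps$text, ignore.case = TRUE))
rule_accuracy <- mean(dumps$has_keyword == dumps$defect)

# Stage 2: train a classifier on the tagged samples (logistic regression here).
clf <- glm(defect ~ has_keyword, data = dumps, family = binomial)

# Score dumps the classifier has not seen before.
new_dumps <- data.frame(text = c("Unhandled exception in parser", "Profile updated"),
                        stringsAsFactors = FALSE)
new_dumps$has_keyword <- as.integer(grepl(pattern, new_dumps$text, ignore.case = TRUE))
new_dumps$p_defect <- as.numeric(predict(clf, newdata = new_dumps, type = "response"))
new_dumps$predicted <- ifelse(new_dumps$p_defect > 0.5, "defect", "non-defect")
print(rule_accuracy)
print(new_dumps)
```

The rules give a quick baseline, and the manually tagged samples let you measure how often the rules are wrong before investing in a trained model.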

How to export an R Random Forest model for use in Excel VBA without API calls

Problem:
I have a Random Forest model trained in R. I need to deploy this model in a standalone Excel tool that will be used by 350 people across a sales network to perform real-time predictions based on data entered into the spreadsheet by users.
How can I do this?
Constraints:
It is not an option to require users to install R on their local machines.
It is not an option to have a server (physical or cloud) providing a scoring API.
What have I done so far?
1. PMML
I can export the model in PMML (XML structure). From research I can see there are libraries for loading and executing PMML inputs in Python and Java. However I haven't found anything implemented in VBA / VB.
2. Zementis
I looked into a solution called Zementis which offers an Excel add-in to deploy PMML models. However from my understanding this requires web-service calls to a cloud server (e.g. AWS) where the actual model execution happens. My IT security department will not allow this.
3. Others
The most common recommendation seems to be to call R to load the model and run the predict function. As noted above, this is not a viable option.
Detailed Context:
The Random Forest model is trained in R, with c. 30 variables. The model is used to recommend "personalised" prices for products as part of a sales process.
The model needs to be distributed to the sales network, with about 350 users. The business's preference is to integrate the model into an existing spreadsheet tool that sales teams currently use to calculate deal profitability.
This means that I need to be able to export the model in a way that it can be implemented in Excel VBA.
Given timescales, the implementation needs to be self-contained with no IT infrastructure or additional application installs. We are working with the organisation's IT team on a server-based solution; however, their deployment timescales are 12 months+, which means we need a tactical solution in the short term.
Here's one approach to get the "rules" for the trees (example using the mtcars dataset):
# Install the packages once, then load them
install.packages(c("randomForest", "rattle"))
library(randomForest)
library(rattle)
head(mtcars)
set.seed(1)  # make the fit reproducible
# Fit a regression forest predicting mpg from all other columns
fit <- randomForest(mpg ~ ., data = mtcars, importance = TRUE, proximity = TRUE)
print(fit)
# Look at variable importance
importance(fit)
# Print the rules for each tree in the forest
printRandomForests(fit)
It is probably unrealistic to use the rules for 500 trees, but maybe you could implement 100 trees in your VBA and then take an average of the results (for a continuous response) or predict the class with the most votes across the trees (for a categorical response).
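The aggregation step is simple enough to sketch in a few lines. This is written in R rather than VBA, only to show the arithmetic the spreadsheet would have to reproduce; the per-tree outputs below are invented.

```r
# Hypothetical per-tree outputs: one row per observation, one column per tree.
tree_preds_numeric <- matrix(c(20.1, 21.4, 19.8,
                               30.2, 29.5, 31.0),
                             nrow = 2, byrow = TRUE)

# Continuous response: the forest prediction is the mean over the trees.
forest_mean <- rowMeans(tree_preds_numeric)

tree_votes <- matrix(c("high", "low", "high",
                       "low",  "low", "high"),
                     nrow = 2, byrow = TRUE)

# Categorical response: pick the class with the most votes in each row.
# Ties go to the alphabetically first class in this sketch.
majority_vote <- function(votes) names(which.max(table(votes)))
forest_class <- apply(tree_votes, 1, majority_vote)

print(forest_mean)
print(forest_class)
```

In a worksheet, the same two operations would correspond to an average or a most-frequent-value calculation over the cells holding each tree's output.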
Maybe you could recreate the model on a worksheet.
As far as I know, Excel can import XML structures (on the Developer tab of the ribbon).
Edit: 1) Save the PMML structure in a plain-text editor as an .xml file.
2) Open the file in Excel 2013 (other versions may also do it).
3) Click through the error message and open the file anyway. The trees open as a table, a bit odd-looking, but recognizable.
4) Create a prediction calculation (a generic function in VBA) that operates on the tree.
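As a rough illustration of what the generic function in step 4 has to do, here is the tree walk sketched in R rather than VBA. The table layout is an assumption modelled on the columns documented for randomForest::getTree() (daughter row numbers, split variable, split point, status, prediction), not the layout Excel produces from a PMML file, and the tree values are invented.

```r
# Hand-made tree table (values invented). status -1 marks a terminal node;
# daughters are row numbers; numeric splits send value <= split point to the left.
tree <- data.frame(
  left_daughter  = c(2, 0, 4, 0, 0),
  right_daughter = c(3, 0, 5, 0, 0),
  split_var      = c("wt", NA, "hp", NA, NA),
  split_point    = c(3.0, NA, 150, NA, NA),
  status         = c(1, -1, 1, -1, -1),
  prediction     = c(NA, 25.0, NA, 18.0, 14.0),
  stringsAsFactors = FALSE
)

# Walk from the root (row 1) down to a terminal node for one observation.
predict_tree <- function(tree, obs) {
  node <- 1
  while (tree$status[node] != -1) {
    value <- obs[[tree$split_var[node]]]
    node <- if (value <= tree$split_point[node]) {
      tree$left_daughter[node]
    } else {
      tree$right_daughter[node]
    }
  }
  tree$prediction[node]
}

light_car <- list(wt = 2.5, hp = 110)
heavy_car <- list(wt = 3.5, hp = 200)
print(predict_tree(tree, light_car))
print(predict_tree(tree, heavy_car))
```

A VBA version would follow the same loop over worksheet rows: read the split variable and threshold of the current row, jump to the left or right daughter row, and stop at a terminal row.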
