I am working with cirq and generate a random unitary for testing purposes with:
random_matrix = cirq.testing.random_unitary(dim=4)
Where can the random seed for this function be set? Using random.seed(a=1) does not seem to do it.
Cirq relies on NumPy for its random functions, so calling:
np.random.seed(2)
sets the seed that cirq uses.
I'm using the R package randomForest, version 4.6-14. The function randomForest takes a parameter localImp, and if that parameter is set to TRUE the function computes local explanations for the predictions. However, these explanations are only for the provided training set. I want to fit a random forest model on a training set and use that model to compute local explanations for a separate test set. As far as I can tell, the predict.randomForest function in the same package provides no such functionality. Any ideas?
Can you explain more about what it means to have a local explanation for a test set?
According to this answer, along with the package documentation, the variable importance (i.e. the casewise importance implied by localImp) evaluates how a variable affects the prediction accuracy. For a test set, where there are no labels against which to assess prediction accuracy, such a variable importance is therefore unavailable.
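For illustration, here is a minimal sketch of what localImp actually produces, using the built-in iris data rather than the original data, and assuming the fitted object exposes the casewise matrix as $localImportance as described in the package documentation:

library(randomForest)

set.seed(1)
fit <- randomForest(Species ~ ., data = iris, localImp = TRUE)

# localImportance is a p-by-n matrix: one row per predictor and one column
# per *training* case, so there is no counterpart for unseen test rows.
dim(fit$localImportance)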
While reading An Introduction to Statistical Learning, I was puzzled by the following passage:
We set a random seed before we apply knn() because if several
observations are tied as nearest neighbors, then R will randomly break
the tie. Therefore, a seed must be set in order to ensure
reproducibility of results.
Could anyone please tell me why the result of KNN is random?
The result of knn() is not always deterministic: when several observations are tied as nearest neighbours, R breaks the tie at random, which draws from the random number generator. Calling set.seed() before knn() fixes the state of that generator, so the ties are broken the same way on every run and the results are reproducible.
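As a minimal sketch of that behaviour (using class::knn on an arbitrary split of iris, chosen purely for illustration), setting the seed immediately before each call makes the tie-breaking, and hence the predictions, repeatable:

library(class)

train  <- iris[seq(1, 150, by = 2), 1:4]
test   <- iris[seq(2, 150, by = 2), 1:4]
labels <- iris$Species[seq(1, 150, by = 2)]

set.seed(1)  # ties among equally distant neighbours are broken reproducibly
pred1 <- knn(train, test, labels, k = 3)

set.seed(1)
pred2 <- knn(train, test, labels, k = 3)

identical(pred1, pred2)  # TRUE with the same seed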
I am using RStudio and I created random data like this:
n<-500
u<-runif(n)
This data is now stored, but obviously once I run the code again it will change. How can I store it so that I can use it again? If the number of points were small, I would just define a vector and write the numbers manually, like
DATA<-c(1,2,3,4)
But obviously doing this for 500 points is not very practical. Thank you.
In such cases, i.e. when using pseudo-random number generators, a common approach is to set the seed:
set.seed(12345)
You only have to store the seed that you used for the simulation, so that in future you can set the same seed and get the same sequence of numbers. The seed is a reminder that the numbers are not truly random; they are pseudo-random, and the same seed will always generate the same numbers. There are services such as RANDOM which attempt to generate truly random numbers.
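A minimal sketch of both options: re-seeding regenerates the identical draws, and saveRDS/readRDS (base R, mentioned here only as an alternative) stores the exact numbers on disk:

# Same seed, same pseudo-random sequence
set.seed(12345)
u1 <- runif(500)

set.seed(12345)
u2 <- runif(500)

identical(u1, u2)  # TRUE

# Alternatively, save the exact numbers and reload them later
saveRDS(u1, "u.rds")
u3 <- readRDS("u.rds")
identical(u1, u3)  # TRUE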
How do I perform knn cross-validation with input data that has been clustered using k-means?
I seem to be unable to find a function which is able to do so.
predict.strengt from fpc seems to be able to compute some form of prediction rate for a given classifier method, but it appears to test against the training set, which to my mind is not that useful.
Isn't there any function which can perform cross-validation?
Example:
library("datasets")
library("stats")
iris_c3 = kmeans(iris$Sepal.Length, centers = 10, iter.max = 30)
How do I provide iris_c3 as training data for some form of knn which also performs CV, if a test set were provided in the same manner?
I am working on a random forest in R and I would like to add 10-fold cross-validation to my model, but I am quite stuck there.
This is a sample of my code.
install.packages('randomForest')
library(randomForest)
set.seed(123)
fit <- randomForest(as.factor(sickrabbit) ~ Feature1 + ... + FeatureN, data = training1, importance = TRUE, sampsize = c(200,300), ntree = 500)
I found the function rfcv online, but I am not sure I understand how it works. Can anyone help with this function or propose an easier way to implement cross-validation? Can you do it using the randomForest package instead of caret?
You don't need to cross-validate a random forest model. You are getting stuck with the randomForest package because it wasn't designed to do this.
Here is a snippet from Breiman's official documentation:
In random forests, there is no need for cross-validation or a separate test set to get an unbiased estimate of the test set error. It is estimated internally, during the run, as follows:
Each tree is constructed using a different bootstrap sample from the original data. About one-third of the cases are left out of the bootstrap sample and not used in the construction of the kth tree.
Put each case left out in the construction of the kth tree down the kth tree to get a classification. In this way, a test set classification is obtained for each case in about one-third of the trees. At the end of the run, take j to be the class that got most of the votes every time case n was oob. The proportion of times that j is not equal to the true class of n averaged over all cases is the oob error estimate. This has proven to be unbiased in many tests.
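To see this in practice, here is a minimal sketch on the built-in iris data (not the rabbit data above), assuming the fitted classification object's err.rate component as documented in the randomForest package; the OOB estimate comes with the fit, with no explicit cross-validation loop:

library(randomForest)

set.seed(123)
fit <- randomForest(Species ~ ., data = iris, ntree = 500)

# Cumulative OOB error per tree; the last entry is the estimate after all 500 trees
tail(fit$err.rate[, "OOB"], 1)

# print(fit) also reports the OOB error rate and an OOB-based confusion matrix
print(fit)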