I was hoping someone might be able to help me make sense of a homework question. I am not looking for a solution, mind you, just wondering if anyone would be able to explain the question a bit more simply for me, as I am new to data analysis and enrolled in an R class which had no prerequisites, but feel a bit lost with some of the language. Any help would be greatly appreciated!
So, the first part of the question was to create an array and fill it with random numeric data, which I did here:
question <- array( 1:1000, dim= c(25,4,1000))
colnames(question)<- c('x1','x2','x3','x4')
Now, the second part asks me to "write a function to create y-values," which should be a "linear combination" of the four variables. The example given is
y = 2 ∗ x1 + 5 ∗ x2 − 3 ∗ x3 + 0.7 ∗ x4 + RandomError.
The question adds that the result should be a matrix with dimensions of 25 × 1000. I am not sure what exactly this is asking or how to approach this problem. All I have so far, which I know is very little is
apply(question,c(1,3),sum)
function (y){ ...
Can anyone offer any guidance or clarification? Thank you so much!
First of all, to make (pseudo)random numbers, you can use the rnorm function. That is, if you want to make 1000 random numbers that are normally distributed with a mean of 0 and an sd of 1, you can do rnorm(1000). However, your array has 25 x 4 x 1000 = 100000 elements, so you probably want rnorm(100000) instead. Note that 1:1000 is not random; it is just the integers from 1 to 1000, recycled to fill the array.
Now, you should have an array question with dimensions 25 x 4 x 1000. You want to create a matrix y which combines four "slices" in question of size 25 x 1000 to create a matrix y of size 25 x 1000. You want to write a function f that will take all four "slices" of array question and combine them into one slice. You also want to incorporate random error, which again can be accomplished with the rnorm function.
For a simple example, let's make an array x with dimensions (10, 2, 10):
x = array(rnorm(200), dim = c(10,2,10))
And now let's write a function f that will add the two "slices" of x together.
f = function(my_array){
my_array[,1,] + my_array[,2,]
}
Let's execute the function on our array
y = f(x)
dim(y)
Hopefully you can expand this basic example to fit your case.
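As a rough sketch of the next step (still on the small toy array, not the homework array), here is one way the two-slice sum could be turned into a weighted sum plus random error. The function name f_weighted, its arguments weights and error_sd, and the weights 2 and 5 are made up for illustration; the random error is assumed to be normal with mean 0.

```r
set.seed(1)
x <- array(rnorm(200), dim = c(10, 2, 10))

# Hypothetical helper: weighted sum of the slices along the 2nd dimension,
# plus normally distributed noise (an assumption about "RandomError").
f_weighted <- function(my_array, weights, error_sd = 1) {
  n_row <- dim(my_array)[1]
  n_rep <- dim(my_array)[3]
  y <- matrix(0, nrow = n_row, ncol = n_rep)
  for (j in seq_along(weights)) {
    y <- y + weights[j] * my_array[, j, ]   # add the j-th weighted slice
  }
  y + matrix(rnorm(n_row * n_rep, sd = error_sd), nrow = n_row)
}

y <- f_weighted(x, weights = c(2, 5))
dim(y)   # one row per observation, one column per replicate
```

The same pattern should carry over to four slices by passing four weights.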
Consider this two‐dimensional random walk:
where Zt, Wt, t = 1, 2, 3, … are independent and identically distributed standard normal random variables.
I am having problems in finding a way to simulate and plot the sample path of (X,Y) for t = 0,1, … ,100. I was given a sample:
The following code is an example of the way I am used to plot random walks in R:
set.seed(13579)
r<-sample(c(-1,1),size=100,replace=T,prob=c(0.5,0.5))
r<-c(10,r)
(w<-cumsum(r))
w<-as.ts(w)
plot(w,main="random walk")
I am not very sure of how to achieve this.
The problem I am having is that this kind of code has a "simpler" result, with a line that goes either up or down, -1 or +1:
while the plot I need to create also goes from left to right and vice versa.
Would you help me correct the code I know so that it fits my task, or suggest a smarter way to go about it? It would be greatly appreciated.
Cheers!
Instead of using sample, you need to use rnorm(100) to draw 100 samples from a standard normal distribution. Since the walk starts at [0, 0], we need to append a 0 at the start and do a cumsum on the result, i.e. cumsum(c(0, rnorm(100))).
We want to do this for both the x and y variables, then plot. The whole thing can be done in a single line of code in base R:
plot(x = cumsum(c(0, rnorm(100))), y = cumsum(c(0, rnorm(100))), type = 'l')
So I've probably referenced the entire internet trying to make this problem work, and haven't. However, I found stack overflow. Like I said I've been learning for not even 2 weeks yet.
So this is the problem
Let
f(x)=sqrt((x^3+3x^2+1)/(x^4+5x^3+7x+9))
(x ≥ 0)
(a) Draw a line graph of (x, f(x)) for 0 ≤ x ≤ 10 with increments of 0.01
(b) Find numerically the maximum value of f(x) and the maximizer x (report x to the
second decimal place. For instance, x = 1.23)
So I've basically been saying x=x and y= the sqrt....., and then I write plot(x,y,type="l") and usually it just doesn't even work.
Also how do I do the increment part. I'm sorry for lack of explanation, but I have no idea what most of this means.
First thing to do would be to define the function:
equation <- function(x){
sqrt((x^3+3*x^2+1)/(x^4+5*x^3+7*x+9))
}
Then, define the values you want to apply the function to, and store them in vector input
input<-seq(0,10,0.01)
Apply the equation function to input, and store the values in vector results
results<-sapply(input,equation)
Produce a line plot:
plot(input,results,type="l")
Print the value of x which maximises f(x)
maxx<-input[which.max(results)]
maxx
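As a cross-check on the grid search, base R's optimize() can refine the maximizer in a small window around the grid maximum. This is only a sketch: the window of plus or minus 0.5 and the names grid_max and refined are my own choices, and it relies on equation being vectorized (so sapply is not strictly needed here).

```r
equation <- function(x) {
  sqrt((x^3 + 3*x^2 + 1) / (x^4 + 5*x^3 + 7*x + 9))
}

# Coarse answer from the grid with increments of 0.01
input <- seq(0, 10, 0.01)
grid_max <- input[which.max(equation(input))]

# Refine within an assumed window around the grid maximum
refined <- optimize(equation,
                    interval = c(grid_max - 0.5, grid_max + 0.5),
                    maximum = TRUE)

round(refined$maximum, 2)   # maximizer x, to two decimal places
refined$objective           # maximum value of f(x)
```

If the two approaches disagree by more than the grid spacing, that would be a hint that the window or the grid needs another look.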
I would suggest a ggplot2 approach. First you have to create a random x variable and then compute y. I will add the code for those variables and the plot.
To find the maximum of f(x), you either need calculus or you can use a visual approach. Here is the code:
library(ggplot2)
library(dplyr)
set.seed(123)
First we create a random variable x with the limits you mentioned:
#Data
x <- runif(100,0,10)
Now, we save the variable in a dataframe and compute y:
#Allocate data in a dataframe
df <- data.frame(x=x)
#Compute variable
df$y <- sqrt(((df$x^3)+3*((df$x)^2)+1)/((df$x^4)+5*(df$x)^3+7*(df$x)+9))
Finally, we plot:
#Plot
ggplot(df,aes(x=x,y=y))+
geom_point()+
scale_x_continuous(limits = c(0,10))
Output:
Values for x are randomly generated, if you have real values for x you should use those values.
How do I plot decision boundary from weight vector?
My original data is 2-dimensional but non-linearly separable so I used a polynomial transformation of order 2 and therefore I ended up with a 6-dimensional weight vector.
Here's the code I used to generate my data:
polar2cart <- function(theta,R,x,y){
x = x+cos(theta) * R
y = y+sin(theta) * R
c=matrix(x,ncol=1000)
c=rbind(c,y)
}
cart2polar <- function(x, y)
{
r <- sqrt(x^2 + y^2)
t <- atan(y/x)
c(r,t)
}
R=5
eps=5
sep=-5
c1<-polar2cart(pi*runif(1000,0,1),runif(1000,0,eps)+R,0,0)
c2<-polar2cart(-pi*runif(1000,0,1),runif(1000,0,eps)+R,R+eps/2,-sep)
data <- data.frame("x" = append(c1[1,], c2[1,]), "y" = append(c1[2,], c2[2,]))
labels <- append(rep(1,1000), rep(-1, 1000))
and here's how it is displayed (using ggplot2):
Thank you in advance.
EDIT: I'm sorry if I didn't provide enough information about the weight vector. The algorithm I'm using is pocket which is a variation of perceptron, which means that the output weight vector is the perpendicular vector that determines the hyper-plane in the feature space plus the bias . Therefore, the hyper-plane equation is , where are the variables. Now, since I used a polynomial transformation of order 2 to go from a 2-dimensional space to a 5-dimensional space, my variables are : and thus the equation for my decision boundary is:
So basically, my question is how do I go about drawing my decision boundary given
PS: I've found a solution while waiting, it might not be the best approach but, it gives the expected results. I'll share it as soon as I finish my project if anyone is interested. Meanwhile, I'd love to hear a better alternative.
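For what it's worth, one common way to draw such a boundary is to evaluate the decision function on a grid and draw its zero-level contour. The sketch below is not the asker's solution. It assumes the 6-dimensional weight vector is ordered as (bias, x, y, x^2, x*y, y^2), which the question does not confirm, and the weights w and the helper boundary_value are made-up placeholders (here they describe a circle of radius 5).

```r
# Hypothetical weights, ASSUMED order: (bias, x, y, x^2, x*y, y^2)
w <- c(-1, 0, 0, 0.04, 0, 0.04)

# Value of the decision function at (x, y); the boundary is where this is 0
boundary_value <- function(x, y, w) {
  w[1] + w[2]*x + w[3]*y + w[4]*x^2 + w[5]*x*y + w[6]*y^2
}

# Evaluate on a grid covering the data range, then draw the zero contour
xs <- seq(-15, 20, length.out = 200)
ys <- seq(-15, 15, length.out = 200)
z <- outer(xs, ys, boundary_value, w = w)
contour(xs, ys, z, levels = 0, drawlabels = FALSE, xlab = "x", ylab = "y")
```

With ggplot2, the same grid could probably be passed to geom_contour with breaks = 0, but I have only sketched the base R version here.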
I'm a new user to R, and I am trying to create a function that will simulate a random walk. The issue for me is trying to integrate some initial values smoothly. Say I have this basic function.
y(t) = y(t-2) + eps(t)
Epsilon (or eps(t)) will be the randomness factor. I want to define y(-1)=0, and y(0)=0.
Here is my code:
ran.walk=function(n){ # 'n' steps will be the input
eps=rnorm(n) # creates a vector taking random values from N(0,1)
y= c(eps[1], eps[2]) # this will set up my initial vector
for (i in 3:n){
ytemp = y[i-2] + eps[i] ## !!! problem is here. Details below !!!
y= c(y, ytemp)
}
return(y)
}
I'm trying to get this start adding y3, y4, y5, etc, but I think there is a flaw in this design... I'm not sure if I should just set up two separate lines, with an if statement: testing if n is even or odd, perhaps with:
if (i %% 2 == 1) #using modulus
Since,
y1= eps1,
y2= eps2,
y3= y1 + eps3,
y4= y2 + eps4,
y5= y3 + eps5 and so on...
Currently, I see the error in my code.
I have y1, and y2 concatenated, but I don't think it knows how to incorporate y[1]
Can I define beforehand somehow y[-1]=0, and y[0]=0 ? I tried this also and got an error.
Thank you kindly in advance for any assistance. This is my first time attempting a for loop with recursion.
-N (sorry for any formatting issues, I had a lot of problems getting this question to go through)
I noticed that your odd and even series are independent of each other. Assuming that is the case, I just split the problem into two series and used cumsum to get the random walk for each. The final data frame includes both the random numbers and the random walk, so you can check that it is working properly.
Hope it helps.
ran.walk=function(n) {
eps=rnorm(ceiling(n / 2)*2)
dim(eps) <- c(2,ceiling(n/2))
# since each series is independent, we can tally each one in its own
eps2 <- apply(eps, 1, cumsum)
# and just reorganize it
eps2 <- as.numeric(t(eps2))
rndwlk <- data.frame(rnd=as.numeric(eps), walk=eps2)
# remove the extra value if needed
rndwlk <- rndwlk[1:n,]
return(rndwlk)
}
ran.walk(13)
After taking a break with my piano, it came to me. It's funny how simple the answer becomes once you discover it... almost trivial.
Setting the initial value to be a vector, that is:
[y(1) = y(-1) + eps(1), y(2)= y(0) + eps(2)]
everything works out. It is still true that the evens and odds don't interact, but there is no reason to specify any of that.
The method to split the iterations with modulus, then concatenating it back into the main vector would also work, but is unnecessary and more complicated. Shorter is better for users and computers. As Einstein said, make it as simple as possible, but no simpler.
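For anyone who wants to see the idea spelled out, here is a minimal sketch of one way to do it (not necessarily the exact code I ended up with). The name ran.walk2 is made up. R vectors start at index 1, so y(-1) and y(0) are stored in the first two positions and dropped at the end.

```r
ran.walk2 <- function(n) {
  eps <- rnorm(n)          # random steps from N(0, 1)
  y <- numeric(n + 2)      # y[1] and y[2] stand in for y(-1) = 0 and y(0) = 0
  for (t in seq_len(n)) {
    y[t + 2] <- y[t] + eps[t]   # y(t) = y(t-2) + eps(t), shifted by 2
  }
  y[-(1:2)]                # return y(1), ..., y(n)
}

set.seed(42)
ran.walk2(6)
```

Preallocating y also avoids growing the vector with c() on every iteration.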
How should I interpret log n^log n?
How do I interpret this? One way is to take it as logn(logn) and the other is (log n)^(log n). Both would give different answers.
For eg:
Taking base 2 and n=1024, in first case we get 10*10 as ans. In the second case, we get 10^10 as ans or am I doing something wrong?
From a programmer's viewpoint, a good way to better understand a function is to plot it at different parts of its domain.
But what is the domain of f(x) := ln(x)^ln(x)? Well, given that the exponent is not an integer, the base ln(x) cannot be negative, so x cannot be smaller than 1. Why? Because ln(x) is negative for 0 < x < 1 and it is not even defined for x <= 0.
But what about x = 1? Given that ln(1) = 0, we would get 0^0, which is not defined either. So, let's plot f(x) for x between 1.000001 and 1.1. We get:
The plot reveals that there would be no harm in extending the definition of f(x) at 1 in this way (let me use pseudocode here):
f(x) := if x = 1 then 1 else ln(x)^ln(x)
Now, let's see what happens for larger values of x. Here is a plot between 1 and 10:
This plot is also interesting because it exposes a singular behavior between 1 and 3, so let's plot that part of the domain to see it better:
There are a couple of questions that one could ask by looking at this plot. For instance, what is the value of x such that f(x)=1? Mm... this value is visibly between 2.7 and 2.8 (much closer to 2.7). And what number do we know that is a little bit larger than 2.7? This number should be related to the ln function, right? Well, ln is logarithm in base e and the number e is something like 2.71828182845904.... So, it looks like a good candidate, doesn't it? Let's see:
f(e) = ln(e)^ln(e) = 1^1 = 1!
So, yes, the answer to our question is e.
Another interesting value of x is the one where the curve has a minimum, which lies somewhere between 1.4 and 1.5. But since this answer is getting too long, I will stop here. Of course, you can keep plotting and answering your own questions as you happen to encounter them. And remember, you can use iterative numeric algorithms to find values of x or f(x) that, for whatever reason, appear interesting to you.
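To make that last remark concrete, here is a small R sketch (R is my choice here; any language with a plotting and a 1-D minimization routine would do) that plots f between 1 and 3 and locates the minimum numerically. The search interval c(1.01, 3) is an assumption read off the plot.

```r
# f extended at x = 1 as suggested above
f <- function(x) ifelse(x == 1, 1, log(x)^log(x))

# Plot the interesting part of the domain, with a reference line at 1
curve(f, from = 1, to = 3, n = 501, ylab = "f(x)")
abline(h = 1, lty = 2)

f(exp(1))                                   # check the candidate x = e

# Numerically locate the minimum on an assumed bracket
m <- optimize(f, interval = c(1.01, 3))
m$minimum                                   # lies between 1.4 and 1.5
```

Substituting u = ln(x) turns f into u^u, whose minimum is at u = 1/e, so the numeric result can be compared against exp(exp(-1)).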
Because log(n^log n)=(log n)^2, I would assume that log n^log n should be interpreted as (log n)^(log n). Otherwise, there's no point in the exponentiation. But whoever wrote that down for you should have clarified.
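A quick numeric check of the two readings, using the base 2 and n = 1024 example from the question (R used just as a calculator here; the variable name L is mine):

```r
n <- 1024
L <- log2(n)    # 10

L * L           # (log n)(log n): 100
log2(n^L)       # log(n^log n): also 100, up to floating-point rounding
L^L             # (log n)^(log n): 1e+10
```

So the product reading and log(n^log n) coincide, while (log n)^(log n) grows much faster.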