"Too few positive probabilities" error in R [closed] - r
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 9 years ago.
Improve this question
I wrote some R code, but it fails with an error. The first error is "too few positive probabilities", which then produces NAs, so the code does not run to completion. Can you please take a look and tell me what is wrong? Here are the headings and the first five rows of the data (I do not know how to upload a text file; please tell me how if you do):
year month day n_cases n_controls weekd leapyr
1999 1 1 127 62 6 0
1999 1 2 88 46 7 0
1999 1 3 26 15 1 0
1999 1 4 606 275 2 0
1999 1 5 479 252 3 0
and here is the R code:
##########
a<-read.table("e29.txt",header=T)
attach(a)
cases<-a[,4]# fourth column in data "Cases"
data<-cases[1:2555]
weeklydata<-matrix(data,7,365)
y=apply(weeklydata,2,sum)
#
T<-length(y)
N<-1000
a<-0.98
pfstate<-matrix(0,T+1,N)
pfomega<-matrix(0,T+1,N)
pfphi<-matrix(0,T+1,N)# storage for phi
pfb<-matrix(0,T+1,N)# storage for b
wts<-matrix(0,T+1,N)
wnorm<-matrix(0,T+1,N)
set.seed(046)
pfstate[1,]<-rnorm(N,0,100)#rep(0,N)#
pfomega[1,]<-runif(N,0,1)
pfb[1,]<-runif(N,0,5)
wts[1,]<-rep(1/N,N)
for(t in 2:(T+1)){
  ## compute means and variances of the particle clouds for omega and b
  meanomega<-weighted.mean(pfomega[t-1,],wts[t-1,])
  varomega<-weighted.mean((pfomega[t-1,]-meanomega)^2,wts[t-1,])
  meanb<-weighted.mean(pfb[t-1,],wts[t-1,])
  varb<-weighted.mean((pfb[t-1,]-meanb)^2,wts[t-1,])
  ## compute the parameters of the gamma kernel
  muomega<-a*pfomega[t-1,]+(1-a)*meanomega
  var2omega<-(1-a^2)*varomega
  alphaomega<-muomega^2/var2omega
  betaomega<-muomega/var2omega
  mub<-a*pfb[t-1,]+(1-a)*meanb
  var2b<-(1-a^2)*varb
  alphab<-mub^2/var2b
  betab<-mub/var2b
  ## 1.1 draw the auxiliary indicator variables
  probs<-wts[t-1,]*dpois(y[t-1],exp(pfstate[t-1,]))
  auxInd<-sample(N,N,replace=TRUE,prob=probs)
  ## 1.2 draw new values of omega and b
  pfomega[t,]<-rgamma(N,shape=alphaomega[auxInd],rate=betaomega[auxInd])
  pfb[t,]<-rgamma(N,shape=alphab[auxInd],rate=betab[auxInd])
  pfphi[t,]<-(pfb[t,]-1)/(1+pfb[t,])
  ## 1.3 draw the states
  pfstate[t,]<-rnorm(N,mean=pfphi[t,]*pfstate[t-1,auxInd],sd=sqrt(pfomega[t,]))
  ## compute the weights
  wts[t,]<-exp(dpois(y[t-1],exp(pfstate[t,]),log=TRUE)-
               dpois(y[t-1],exp(pfstate[t-1,auxInd]),log=TRUE))
  #print(wts)
  wnorm[t,]<-wts[t,]/sum(wts[t,])
  #print(wnorm)
}
### The first error occurs here
Error in sample.int(x, size, replace, prob) :
too few positive probabilities
ESS<-rep(0,T+1)
ESSthr<-N/2
for(t in 2:(T+1)){
  ESS[t]<-1/sum(wnorm[t,]^2)
  if(ESS[t]<ESSthr){
    pfstate[t,]<-sample(pfstate[t,],N,replace=T,prob=wnorm[t,])
    wnorm[t,]<-1/N
  }
}
# The second error occurs here
#Error in if (ESS[t] < ESSthr) { : missing value where TRUE/FALSE needed
The problem seems to be here:
probs<-wts[t-1,]*dpois(y[t-1],exp(pfstate[t-1,]))
auxInd<-sample(N,N,replace=TRUE,prob=probs)
It looks like your vector of probabilities becomes all zeros at some point. This can happen when y[t-1] is much larger than the Poisson mean implied by the particles: for instance, dpois(300, 3) underflows and evaluates to exactly 0.
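You can verify the underflow directly at the console (a minimal illustration, separate from the code above):

dpois(300, 3)              # 0 -- the density underflows double precision
dpois(300, 3, log = TRUE)  # about -1088, still perfectly representable on the log scale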
By the way, this problem could be an indication that something is wrong conceptually in your experiment design. Since I don't know what you are doing, I can't help here.
Anyway, if you are confident that the algorithm is correct but you want to avoid this error, one solution is to use the log form of dpois and shift the result by a constant before exponentiating, since all that matters for the call to sample is the relative size of the weights. Something like this might work:
lprobs <- dpois(y[t-1], exp(pfstate[t-1,]), log = TRUE)
lprobs <- lprobs - max(lprobs)
probs <- wts[t-1,] * exp(lprobs)
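The sampling step itself then stays the same (a sketch of how this slots into your loop): after subtracting max(lprobs), the largest entry of exp(lprobs) is exactly 1, so probs cannot be all zeros as long as wts[t-1,] has at least one positive entry.

auxInd <- sample(N, N, replace = TRUE, prob = probs)  # sample() rescales the weights internally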
Related
How to fix linear model fitting error in S-plus
I am trying to fit values in my algorithm so that I can predict next month's number. I am getting a "No data for variable" error even though I have clearly defined the objects that go into the equation. I have tried placing them in vectors so that one vector can be used as a training set to predict the new values. The current script has worked for me on a different dataset but for some reason isn't working here. The data is small, so I was wondering if that has anything to do with it. The data is:

Month io obs Units Sold
12 in 1 114
1 in 2 29
2 in 3 105
3 in 4 30
4 in 5

I'm trying to predict Units Sold with the code below:

matt <- TEST1
isdf <- matt[matt$month <= 3,]
isdf <- na.omit(isdf)
osdf <- matt[matt$Units.Sold == 4,]
lmfit <- lm(Units.Sold ~ obs + Month, data = isdf, na.action = na.omit)
predict(lmFit, osdf[1,1])

I am expecting to be able to place lmfit in predict and get an output.
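Two details in the snippet above likely explain the failure (a guess, since the full dataset isn't shown): R and S-PLUS are case-sensitive, so the model stored as lmfit is never found under the name lmFit, and predict expects its new observations as a data frame rather than a single cell. A minimal sketch of the corrected calls, reusing isdf and osdf from the question:

lmfit <- lm(Units.Sold ~ obs + Month, data = isdf, na.action = na.omit)
predict(lmfit, newdata = osdf[1, , drop = FALSE])  # one-row data frame; name matches the fitted object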
how to increase size of character so as not to truncate the imported field [closed]
Input data is of the format A=integer, B=text (word count max 500). Importing this data set into R truncates the second column to fit chr. Is there a different class that will ensure no truncation, or a method to increase the size of chr to accommodate the entire text? (Conceptually equivalent to TEXT vs VARCHAR in SQL.)

xdoc <- read.csv("./data/abtest2.csv", header = TRUE, sep = ",", as.is = TRUE)
head(xdoc)
A
1 601004351600
B
1 adsfj al;ds fj;sd jf;klsdj f dsfdfsdf sdf sdf sdf as a dag dfgh tyutr erigkdj fajklsdf ...
I think it's something about the way in which you're viewing the files.

longwords <- replicate(10, paste(sample(letters, 600, replace=TRUE), collapse=""))
nchar(longwords)
## 600 600 600 600 ...
dd <- data.frame(n=1:10, w=longwords)
write.csv(dd, file="tmp.csv", row.names=FALSE)

Now read the data file back in -- it's the same as when it was written out:

xdoc <- read.csv("tmp.csv", as.is=TRUE)
nchar(xdoc$w)
## [1] 600 600 600 600 600 ...

I don't know what kind of limits there are on string length in R other than memory size, but they're long. Perhaps this note from ?as.character is relevant:

'as.character' breaks lines in language objects at 500 characters, and inserts newlines. Prior to 2.15.0 lines were truncated.

So something else, either in your viewing procedure or in the way you've processed the data, is messing you up.
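One display-side gotcha that may be what the asker is seeing (an illustrative aside, not from the original answer): some printing helpers abbreviate long strings even though the stored value is intact. str(), for example, cuts long character values off when printing:

x <- paste(rep("a", 600), collapse = "")
str(x)    # chr "aaaaaaaaaa"| __truncated__   -- display only
nchar(x)  # 600 -- the stored string is complete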
finding the most frequent item using bigmemory techniques and parallel computing? [closed]
How can I find which months have the most frequent delays without using regression? The following CSV is a sample of a 100 MB file. I know I should use bigmemory techniques but wasn't sure how to approach this. Here months are stored as integers, not factors.

Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
2006,1,11,3,743,745,1024,1018,US,343,N657AW,281,273,223,6,-2,ATL,PHX,1587,45,13,0,,0,0,0,0,0,0
2006,1,11,3,1053,1053,1313,1318,US,613,N834AW,260,265,214,-5,0,ATL,PHX,1587,27,19,0,,0,0,0,0,0,0
2006,1,11,3,1915,1915,2110,2133,US,617,N605AW,235,258,220,-23,0,ATL,PHX,1587,4,11,0,,0,0,0,0,0,0
2006,1,11,3,1753,1755,1925,1933,US,300,N312AW,152,158,126,-8,-2,AUS,PHX,872,16,10,0,,0,0,0,0,0,0
2006,1,11,3,824,832,1015,1015,US,765,N309AW,171,163,132,0,-8,AUS,PHX,872,27,12,0,,0,0,0,0,0,0
2006,1,11,3,627,630,834,832,US,295,N733UW,127,122,108,2,-3,BDL,CLT,644,6,13,0,,0,0,0,0,0,0
2006,1,11,3,825,820,1041,1021,US,349,N177UW,136,121,111,20,5,BDL,CLT,644,4,21,0,,0,0,0,20,0,0
2006,1,11,3,942,945,1155,1148,US,356,N404US,133,123,121,7,-3,BDL,CLT,644,4,8,0,,0,0,0,0,0,0
2006,1,11,3,1239,1245,1438,1445,US,775,N722UW,119,120,103,-7,-6,BDL,CLT,644,4,12,0,,0,0,0,0,0,0
2006,1,11,3,1642,1645,1841,1845,US,1002,N104UW,119,120,105,-4,-3,BDL,CLT,644,4,10,0,,0,0,0,0,0,0
2006,1,11,3,1836,1835,NA,2035,US,1103,N425US,NA,120,NA,NA,1,BDL,CLT,644,0,17,0,,1,0,0,0,0,0
2006,1,11,3,NA,1725,NA,1845,US,69,0,NA,80,NA,NA,NA,BDL,DCA,313,0,0,1,A,0,0,0,0,0,0
Let's say your data.frame is called dd. If you want to see the total number of weather delays for each month across all years, you can do:

delay <- aggregate(WeatherDelay ~ Month, dd, sum)
delay[order(-delay$WeatherDelay),]
Is this closer to what you want? I don't know R well enough to sum the rows, but this at least aggregates them. I am learning, too!

delays <- read.csv("tmp.csv", stringsAsFactors = FALSE)
delay <- aggregate(cbind(ArrDelay, DepDelay, WeatherDelay, NASDelay, SecurityDelay, LateAircraftDelay) ~ Month, delays, sum)
delay

It outputs:

  Month ArrDelay DepDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay
1     1       10      -16            0        0             0                 0
2     2      -31       -2            0        0             0                 0
3     3        9       -4            0       20             0                 0

Note: I changed your document a bit to provide some diversity in the Months column:

Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
2006,1,11,3,743,745,1024,1018,US,343,N657AW,281,273,223,6,-2,ATL,PHX,1587,45,13,0,,0,0,0,0,0,0
2006,1,11,3,1053,1053,1313,1318,US,613,N834AW,260,265,214,-5,0,ATL,PHX,1587,27,19,0,,0,0,0,0,0,0
2006,2,11,3,1915,1915,2110,2133,US,617,N605AW,235,258,220,-23,0,ATL,PHX,1587,4,11,0,,0,0,0,0,0,0
2006,2,11,3,1753,1755,1925,1933,US,300,N312AW,152,158,126,-8,-2,AUS,PHX,872,16,10,0,,0,0,0,0,0,0
2006,1,11,3,824,832,1015,1015,US,765,N309AW,171,163,132,0,-8,AUS,PHX,872,27,12,0,,0,0,0,0,0,0
2006,1,11,3,627,630,834,832,US,295,N733UW,127,122,108,2,-3,BDL,CLT,644,6,13,0,,0,0,0,0,0,0
2006,3,11,3,825,820,1041,1021,US,349,N177UW,136,121,111,20,5,BDL,CLT,644,4,21,0,,0,0,0,20,0,0
2006,1,11,3,942,945,1155,1148,US,356,N404US,133,123,121,7,-3,BDL,CLT,644,4,8,0,,0,0,0,0,0,0
2006,3,11,3,1239,1245,1438,1445,US,775,N722UW,119,120,103,-7,-6,BDL,CLT,644,4,12,0,,0,0,0,0,0,0
2006,3,11,3,1642,1645,1841,1845,US,1002,N104UW,119,120,105,-4,-3,BDL,CLT,644,4,10,0,,0,0,0,0,0,0
2006,3,11,3,1836,1835,NA,2035,US,1103,N425US,NA,120,NA,NA,1,BDL,CLT,644,0,17,0,,1,0,0,0,0,0
2006,1,11,3,NA,1725,NA,1845,US,69,0,NA,80,NA,NA,NA,BDL,DCA,313,0,0,1,A,0,0,0,0,0,0
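Neither answer touches the bigmemory part of the question, so as a pointer (a sketch only; it assumes the data.table package and uses the made-up file name flights.csv): at around 100 MB the file still fits comfortably in memory on most machines, and fread reads it far faster than read.csv:

library(data.table)
flights <- fread("flights.csv")  # fast CSV reader for large files
## total weather delay per month, largest first
delay <- flights[, .(total = sum(WeatherDelay, na.rm = TRUE)), by = Month]
delay[order(-total)]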
Fisher test more than 2 groups
Major Edit: I decided to rewrite this question since my original was poorly put. I will leave the original question below to maintain a record.

Basically, I need to run Fisher's exact test on tables as big as 4 x 5 with around 200 observations. This often turns out to be a major computational challenge, as explained here (I think; I can't follow it completely). As I use both R and Stata, I will frame the question for both with some made-up data.

Stata:

tabi 1 13 3 27 46 \ 25 0 2 5 3 \ 22 2 0 3 0 \ 19 34 3 8 1 , exact(10)

You can increase exact() to a maximum of 1000 (but it may take a day or more before returning an error).

R:

Job <- matrix(c(1,13,3,27,46, 25,0,2,5,3, 22,2,0,3,0, 19,34,3,8,1), 4, 5,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", ">40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS", "exstatic")))
fisher.test(Job)

For me, at least, it errors out in both programs. So the question is: how can I do this calculation in either Stata or R?

Original question: I have Stata and R to play with. I have a dataset with various categorical variables, some of which have multiple categories. I would therefore like to do Fisher's exact test with more than 2 x 2 categories, i.e. apply it to a 2 x 6 or a 4 x 4 table. Can this be done with either R or Stata?

Edit: whilst this can be done in Stata, it will not work for my dataset, as I have too many categories. Stata goes through endless iterations, and even being left for a day or more does not produce a solution. My question is really: can R do this, and can it do it quickly?
Have you studied the documentation of the R function fisher.test? Quoting from help("fisher.test"):

For 2 by 2 cases, p-values are obtained directly using the (central or non-central) hypergeometric distribution. Otherwise, computations are based on a C version of the FORTRAN subroutine FEXACT which implements the network developed by Mehta and Patel (1986) and improved by Clarkson, Fan and Joe (1993).

This is an example given in the documentation:

Job <- matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", "> 40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS")))
fisher.test(Job)
# Fisher's Exact Test for Count Data
#
# data: Job
# p-value = 0.7827
# alternative hypothesis: two.sided
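If the exact computation runs out of room on the asker's 4 x 5 table, fisher.test documents two escape hatches, shown here as a sketch using the Job matrix from the question: enlarge the workspace available to the network algorithm, or switch to a Monte Carlo p-value, which stays fast even for large, sparse tables:

fisher.test(Job, workspace = 2e7)                   # more room for the exact network algorithm
fisher.test(Job, simulate.p.value = TRUE, B = 1e5)  # Monte Carlo p-value from 100,000 simulated tables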
As far as Stata is concerned, your original statement was totally incorrect. search fisher leads quickly to help tabulate twoway, where the help for the exact option explains that it may be applied to r x c as well as to 2 x 2 tables, and the very first example there of Fisher's exact test underlines that Stata is not limited to 2 x 2 tables. It's a minimal expectation anywhere on this site that you try to read the basic documentation. Please!
Poisson Table in R
I am trying to generate a Poisson table in R for two events, one with mean 1.5 (lambda1) and the other with mean 1.25 (lambda2). I would like to generate the probabilities in both cases for x = 0 to x = 7+ (7 or more). This is probably quite simple, but I can't seem to figure out how to do it! I've managed to create a data frame for the table, but I don't really know how to input the parameters, as I've never written a function before:

name <- c("0","1","2","3","4","5","6","7+")
zero <- mat.or.vec(8,1)
C <- data.frame(row.names=name, "0"=zero, "1"=zero, "2"=zero, "3"=zero,
                "4"=zero, "5"=zero, "6"=zero, "7+"=zero)

I am guessing I will need some for loops and that dpois(x, lambda1) will be involved at some point. Can somebody help, please?
I'm assuming these events are independent. Here's one way to generate a table of the joint PMF. First, here are the names you've defined, along with the lambdas:

name <- c("0","1","2","3","4","5","6","7+")
lambda1 <- 1.5
lambda2 <- 1.25

We can get the marginal probabilities for 0-6 using dpois, and the marginal probability for 7+ using ppois with lower.tail=FALSE. Note that P(X >= 7) is 1 - P(X <= 6), so the first argument to ppois is 6, not 7:

p.x <- c(dpois(0:6, lambda1), ppois(6, lambda1, lower.tail=FALSE))
p.y <- c(dpois(0:6, lambda2), ppois(6, lambda2, lower.tail=FALSE))

An even better way might be to create a function that does this given any lambda. Then you just take the outer product (really, the same thing you would do by hand outside of R) and set the names:

p.xy <- outer(p.x, p.y)
rownames(p.xy) <- colnames(p.xy) <- name

Now you're done. The top-left corner of the resulting table looks like this:

             0            1            2            3
0 6.392786e-02 7.990983e-02 4.994364e-02 2.080985e-02
1 9.589179e-02 1.198647e-01 7.491546e-02 3.121478e-02
2 7.191884e-02 8.989855e-02 5.618660e-02 2.341108e-02
...

You could have also used a loop, as you originally suspected, but that's a more roundabout way to the same solution.
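Following the answer's own suggestion, here is a small helper (a sketch; the name pois_probs is made up for illustration) that returns the eight probabilities for any lambda, so both marginals come from a single definition:

pois_probs <- function(lambda, kmax = 7) {
  # P(X = 0), ..., P(X = kmax - 1), then the lumped tail P(X >= kmax)
  c(dpois(0:(kmax - 1), lambda), ppois(kmax - 1, lambda, lower.tail = FALSE))
}

p.xy <- outer(pois_probs(1.5), pois_probs(1.25))
rownames(p.xy) <- colnames(p.xy) <- c(0:6, "7+")
sum(p.xy)   # 1, as a joint PMF should be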