Extracting rows based on the value


I have a tab-delimited text file which contains the following columns:
Probe A_sig A_Pval
ILMN_122 12.31 0.04
ILMN_456 56.12 0
ILMN_198 981.2 0.06
ILMN_980 876.0 0.001
ILMN_542 123.9 0.16
ILMN_567 134.1 0
ILMN_452 213.4 0.98
ILMN_142 543.8 0.04
ILMN_765 187.4 0.05
Now I want to extract the rows which have Pval < .05. The output should look like:
Probe A_sig A_Pval
ILMN_122 12.31 0.04
ILMN_980 876.0 0.001
ILMN_142 543.8 0.04
Can anyone please help me?

I'll answer this but it's a basic question that is probably repeated elsewhere on this list.
Load data.
DAT <- read.table(text="Probe A_sig A_Pval
ILMN_122 12.31 0.04
ILMN_456 56.12 0
ILMN_198 981.2 0.06
ILMN_980 876.0 0.001
ILMN_542 123.9 0.16
ILMN_567 134.1 0
ILMN_452 213.4 0.98
ILMN_142 543.8 0.04
ILMN_765 187.4 0.05", header = TRUE)
You can use indexing as in:
DAT[DAT$A_Pval <.05, ]
However, this returns the zero values as well, which isn't what your output looks like. If you don't want the zeros, add a second condition with the logical operator &, as in:
DAT[DAT$A_Pval <.05 & DAT$A_Pval!=0, ]
I suggest you take a look at some manuals and this (LINK) reference card to help get you started.

my_dataframe[my_dataframe$A_Pval < 0.05,]
The trailing comma is important: the condition before it selects rows, and the empty slot after it keeps all columns.
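The two answers above can be combined into one runnable sketch. It reuses the sample data from the question; the names keep, by_index and by_subset are made up for illustration, and subset() is shown only as an alternative spelling of the same filter, not as something either answer proposed.

```r
# Sample data copied from the question.
DAT <- read.table(text = "Probe A_sig A_Pval
ILMN_122 12.31 0.04
ILMN_456 56.12 0
ILMN_198 981.2 0.06
ILMN_980 876.0 0.001
ILMN_542 123.9 0.16
ILMN_567 134.1 0
ILMN_452 213.4 0.98
ILMN_142 543.8 0.04
ILMN_765 187.4 0.05", header = TRUE, stringsAsFactors = FALSE)

# Logical mask: TRUE for rows where 0 < A_Pval < 0.05.
keep <- DAT$A_Pval < 0.05 & DAT$A_Pval != 0

# Row selection by indexing: mask before the comma, all columns after it.
by_index <- DAT[keep, ]

# The same filter written with subset(); column names are looked up in DAT.
by_subset <- subset(DAT, A_Pval < 0.05 & A_Pval != 0)

print(by_index)
```

Dropping the `!= 0` condition keeps the two rows whose p-value is exactly 0, which is what the plain `DAT[DAT$A_Pval < .05, ]` call returns.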

Related

Reading non-uniform data into R

I am struggling with reading non-uniform data into R.
I've achieved the following:
Used "readLines" to read the text file data in
Used "grep" to find the block of data that I want
Used the index from grep to create a variable (named "block") that contains only that block of data
All good so far - I now have the data I want. But it's a character variable with only one column that contains all the data.
This creates a sample of the variable I have made called "block" (first 3 rows):
line1 = c(" 114.24 -0.39 0.06 13.85 -0.06 1402.11 -1.48 0.0003 0.0000 35.468 1.02 -0.02 0.00 0 1 1 1 0 49.87 4 -290 0 0 -0.002 -0.010 0.155 999.00 11482.66 999.00 11482.66 16:52:24:119 255 13.89 50.00 0.00 -5.49 0.00")
line2 = c(" 114.28 -0.39 0.08 13.84 -0.06 1402.57 -1.48 0.0004 0.0000 35.479 1.29 -0.02 0.00 0 1 1 1 0 49.82 4 -272 0 0 -0.002 -0.011 0.124 999.00 11482.66 999.00 11482.66 16:52:24:150 255 13.89 50.00 0.00 -5.49 0.00")
line3 = c(" 114.31 -0.39 0.09 13.83 -0.06 1403.03 -1.47 0.0005 0.0000 35.492 1.42 -0.02 0.00 0 1 1 1 0 49.78 4 -263 0 0 -0.002 -0.011 0.046 999.00 11482.66 999.00 11482.66 16:52:24:197 255 13.89 50.00 0.00 -5.49 0.00")
block = c(line1,line2,line3)
My goal is to have this data as a data.frame with separate columns for each data point.
My attempts at using strsplit haven't helped (does the solution involve strsplit?)- what is the best approach here? Any suggestions/feedback welcome.
strsplit(block,"\s",fixed=F)

Either of the following should work for you:
## Creates a "data.table"
library(splitstackshape)
cSplit(data.table(x = block), "x", " ")
## Creates a "data.frame"
read.table(text = block, header = FALSE)
## Creates a character matrix
do.call(rbind, strsplit(block, "\\s+"))
## Like the above, but likely to be faster
library(stringi)
stri_split_regex(block, "\\s+", simplify = TRUE)
Note the "\\s+" for the last two options. The "+" is to match multiple spaces.
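A small sketch of how two of these options behave, assuming shortened stand-in lines rather than the real data (the two-line block below is hypothetical and only mimics the shape of the original: a leading space, space-separated numbers and one time stamp). It also trims the leading space before splitting, since otherwise strsplit returns an empty first field for each line.

```r
# Hypothetical shortened lines with the same shape as `block` in the question.
block <- c(" 114.24 -0.39 16:52:24:119 255",
           " 114.28 -0.39 16:52:24:150 255")

# read.table guesses a type per column, so the numbers stay numeric and the
# time stamp becomes a character column.
df <- read.table(text = block, header = FALSE, stringsAsFactors = FALSE)
print(sapply(df, class))

# strsplit keeps everything as character. Note the doubled backslash: a single
# "\s" is not a valid escape in an R string literal.
parts <- strsplit(trimws(block), "\\s+")
mat <- do.call(rbind, parts)
print(dim(mat))
```

If the goal is a data.frame with usable numeric columns, the read.table route needs the least follow-up work; the matrix route leaves every value as text.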

Actually - this looks like it might work.
Import raw data into R
But wanted to check if this was the best approach to this situation...?

How to efficiently grow large data in R

The product of one simulation is a large data.frame, with fixed columns and rows. I ran several hundreds of simulations, with each result stored in a separate RData file (for efficient reading).
Now I want to gather all those files together and create statistics for each field of this data.frame in the "cells" structure, which is basically a list of vectors. This is how I do it:
#colscount, rowscount - number of columns and rows from each simulation
#simcount - number of simulation.
#colnames - names of columns of simulation's data frame.
#simfilenames - vector with filenames with each simulation
cells <- as.list(rep(NA, colscount))
for (i in 1:colscount)
{
  cells[[i]] <- as.list(rep(NA, rowscount))
  for (j in 1:rowscount)
  {
    cells[[i]][[j]] <- rep(NA, simcount)
  }
}
names(cells) <- colnames
addcells <- function(simnr)
# This function reads and appends simdata to "simnr" position in each cell in the "cells" structure
{
  simdata <- readRDS(simfilenames[[simnr]])
  for (i in 1:colscount)
  {
    for (j in 1:rowscount)
    {
      if (!is.na(simdata[j, i]))
      {
        cells[[i]][[j]][simnr] <- simdata[j, i]
      }
    }
  }
}
library(plyr)
a_ply(1:simcount,1,addcells)
The problem is that this is very slow. Reading one file takes:
> system.time(dane<-readRDS(path.cat(args$rdatapath,pliki[[simnr]]))$dane)
user system elapsed
0.088 0.004 0.093
While
> system.time(addcells(1))
user system elapsed
147.328 0.296 147.644
I would expect both commands to have comparable execution times (or at least the latter be max 10 x slower). I guess I am doing something very inefficient there, but what? The whole cells data structure is rather big, it takes around 1GB of memory.
I need to transpose data in this way, because later I do many descriptive statistics on the results (like computing means, sd, quantiles, and maybe histograms), so it is important, that the data for each cell is stored as a (single-dimensional) vector.
Here is profiling output:
> summaryRprof('/tmp/temp/rprof.out')
$by.self
self.time self.pct total.time total.pct
"[.data.frame" 71.98 47.20 129.52 84.93
"names" 11.98 7.86 11.98 7.86
"length" 10.84 7.11 10.84 7.11
"addcells" 10.66 6.99 151.52 99.36
".subset" 10.62 6.96 10.62 6.96
"[" 9.68 6.35 139.20 91.28
"match" 6.06 3.97 11.36 7.45
"sys.call" 4.68 3.07 4.68 3.07
"%in%" 4.50 2.95 15.86 10.40
"all" 4.28 2.81 4.28 2.81
"==" 2.34 1.53 2.34 1.53
".subset2" 1.28 0.84 1.28 0.84
"is.na" 1.06 0.70 1.06 0.70
"nargs" 0.62 0.41 0.62 0.41
"gc" 0.54 0.35 0.54 0.35
"!" 0.42 0.28 0.42 0.28
"dim" 0.34 0.22 0.34 0.22
".Call" 0.12 0.08 0.12 0.08
"readRDS" 0.10 0.07 0.12 0.08
"cat" 0.10 0.07 0.10 0.07
"readLines" 0.04 0.03 0.04 0.03
"strsplit" 0.04 0.03 0.04 0.03
"addParaBreaks" 0.02 0.01 0.04 0.03
It looks like indexing the list structure takes a lot of time. But I can't make it an array, because not all cells are numeric, and R doesn't easily support hash maps...
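The profile above attributes most of the time to "[.data.frame", that is, to the per-cell simdata[j,i] lookups. A minimal sketch of one way to reduce those calls is to pull each column out once as a plain vector and index that instead. Everything here is an assumption for illustration: the tiny simdata stand-in, the function name addcells_fast, and the choice to pass cells in and return it rather than modify a global.

```r
# Hypothetical stand-in for one simulation result (not the asker's data).
simdata <- data.frame(x = c(1.5, NA, 3),
                      y = c("a", "b", NA),
                      stringsAsFactors = FALSE)
rowscount <- nrow(simdata)
colscount <- ncol(simdata)
simcount  <- 4

# Same shape as the "cells" structure in the question:
# one list per column, one NA vector of length simcount per row.
cells <- lapply(seq_len(colscount), function(i) {
  replicate(rowscount, rep(NA, simcount), simplify = FALSE)
})
names(cells) <- names(simdata)

addcells_fast <- function(cells, simdata, simnr) {
  for (i in seq_along(simdata)) {
    column <- simdata[[i]]              # one data.frame access per column
    for (j in which(!is.na(column))) {  # skip missing values up front
      cells[[i]][[j]][simnr] <- column[j]
    }
  }
  cells                                 # return the updated structure
}

cells <- addcells_fast(cells, simdata, 1)
print(cells$x[[1]])
```

Whether this is fast enough for the real data has not been measured here; the point is only that column[j] on an atomic vector avoids the data.frame method dispatch that dominates the profile.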

R rownames(foo[bar]) prints as null but can be successfully changed - why?

I've written a script that works on a set of gene-expression data.
I'll try to separate my post into the short question and the rather lengthy explanation (sorry about the long text block). I hope the short question makes sense in itself. The long explanation is simply there to clarify things if I don't get the point across in the short question.
I tried to acquire basic R skills, and something that puzzles me occurred; I didn't find any enlightenment via Google. I really don't understand this. I hope that by clarifying what is happening here I can better understand R.
That said, I'm not a programmer, so please bear with my bad code.
SHORT QUESTION:
When I have rownames(foo) e.g.
> print(rownames(foo))
"a" "b" "c" "d"
and I try to access it via print(rownames(foo[bar])), it prints NULL.
E.g
> print(rownames(foo[2]))
NULL
Here in the second answer Richie Cotton explains this as "[...] that where there aren't any names, [...]"
This would indicate to me, that either rownames(foo) is empty - which is clearly not the case as I can print it with "print(rownames(foo))" - or that this method of access fails.
However, when I try to change the value at position bar, I get a warning message that the replacement length doesn't match. The operation nevertheless succeeds, which pretty much proves that this method of access is indeed successful. E.g.
> bar = 2
> rownames(foo[bar]) = some.vector(rab)
> print(rownames(foo[bar]))
NULL
> print(rownames(foo))
"a" "something else" "c" "d"
Why is this working? Obviously the function can't properly access the position of bar in foo, as it prints it as empty.
Why the heck does it still replace the value successfully and not fail in a horrific way?
Or asked the other way around: When it successfully replaces the value at this position why is the print function not returning the value properly?
LONG BACKGROUND EXPLANATION:
The data source contains the number in the list, the Entrez ID of the gene, the official gene symbol, the Affymetrix probe ID and then the increase or decrease values. It looks something like this:
No Entrez Symbol Probe_id Sample1_FoldChange Sample2_FoldChange
1 690244 Sumo2 1367452_at 1.02 0.19
Later when displaying the data I want it to print out only the gene symbol and the increases.
Now if there is no gene symbol in the data set, it is printed as "n/a". This is obviously of no value to me, as I can't determine which one of many genes it is.
So I added a first processing step that, only for these cases, exchanges the "n/a" result with "n/a (12345)", where 12345 is the Entrez ID.
I've written the following script to do this. (Note that as I'm not a programmer and I am new to R, I doubt that it is pretty code. But that's not the point I want to discuss.)
no.symbol.idx <-which(rownames(expr.table) == "n/a")
c1 <- character (length(rownames(expr.table)))
c2 <- c1
for (x in 1:length(c1))
{
c1[x] <- "n/a ("
}
for (x in 1:length(c2))
{
c2[x] <- ")"
}
rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx , "Entrez"]),c2, sep="")
The script works and it does what it should do. However, I get the following warning message.
Warning message:
In rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx, :
number of items to replace is not a multiple of replacement length
To find out what is happening here, I put some text output into the script.
no.symbol.idx <-which(rownames(expr.table) == "n/a")
c1 <- character (length(rownames(expr.table)))
c2 <- c1
for (x in 1:length(c1))
{
c1[x] <- "n/a ("
}
for (x in 1:length(c2))
{
c2[x] <- ")"
}
print("print(rownames(expr.table)):")
print(rownames(expr.table))
print("print(no.symbol.idx):")
print(no.symbol.idx)
print("print(rownames(expr.table[no.symbol.idx])):")
print(rownames(expr.table[no.symbol.idx]))
print("print(rownames(expr.table[14])):")
print(rownames(expr.table[14]))
print("print(rownames(expr.table[15])):")
print(rownames(expr.table[15]))
cat("print(expr.table[no.symbol.idx,\"Entrez\"]):\n")
print(expr.table[no.symbol.idx,"Entrez"])
rownames(expr.table)[no.symbol.idx] <- paste(c1, (expr.table[no.symbol.idx , "Entrez"]),c2, sep="")
print("print(rownames(expr.table)):")
print(rownames(expr.table))
print("print(rownames(expr.table[no.symbol.idx])):")
print(rownames(expr.table[no.symbol.idx]))
And I get the following output in the console.
[1] "print(rownames(expr.table)):"
[1] "Sumo2" "Cdc37" "Copb2" "Vcp" "Ube2d3" "Becn1" "Lypla2" "Arf1" "Gdi2" "Copb1" "Capns1" "Phb2" "Puf60" "Dad1" "n/a"
[1] "print(no.symbol.idx):"
[1] 15
[1] "print(rownames(expr.table[no.symbol.idx])):"
NULL
[1] "print(rownames(expr.table[14])):"
NULL
[1] "print(rownames(expr.table[15])):"
NULL
... (to be continued)
So obviously no.symbol.idx gets the right position for the n/a value.
When I try to print it however it claims that rownames for this position was empty and returns NULL.
When I try to access this position "by hand" and use expr.table[15] it also returns NULL.
This however has nothing to do with the n/a value as the same holds true for the value stored at position 14.
... (the continuation)
print(expr.table[no.symbol.idx,"Entrez"]):
[1] "116727"
[1] "print(rownames(expr.table)):"
[1] "Sumo2" "Cdc37" "Copb2" "Vcp" "Ube2d3" "Becn1" "Lypla2" "Arf1" "Gdi2"
[10] "Copb1" "Capns1" "Phb2" "Puf60" "Dad1" "n/a (116727)"
[1] "print(rownames(expr.table[no.symbol.idx])):"
NULL
and this is the result that surprises me. Despite this it is working. It claims everything would be NULL but the operation is successful.
I don't understand this.
EDIT:
Here are the results of the functions you wanted me to run.
str(expr.table)
chr [1:15, 1:17] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "401" "690244" "114562" "60384" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:15] "Sumo2" "Cdc37" "Copb2" "Vcp" ...
..$ : chr [1:17] "No" "Entrez" "Symbol" "Probe_id" ...
head(expr.table)
dput(head(expr.table,10))
structure(c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
"690244", "114562", "60384", "116643", "81920", "114558", "83510",
"64310", "29662", "114023", "Sumo2", "Cdc37", "Copb2", "Vcp",
"Ube2d3", "Becn1", "Lypla2", "Arf1", "Gdi2", "Copb1", "1367452_at",
"1367453_at", "1367454_at", "1367455_at", "1367456_at", "1367457_at",
"1367458_at", "1367459_at", "1367460_at", "1367461_at", "1.02000",
"-1.04000", "1.03000", "-0.12000", "-0.02000", "-0.03000", "0.09000",
"0.05000", "-0.09000", "0.16000", "0.19000", "0.11000", "-0.00425",
"0.52000", "0.46000", "0.42000", "0.20000", "0.05000", "0.21000",
"0.37000", "0.26000", "0.19000", "-0.03000", "0.35000", "0.34000",
"0.07000", "0.00156", "0.12000", "0.08000", "0.16000", "0.59000",
"0.20000", "-0.16000", "0.28000", "0.46000", "-0.15000", "0.00168",
"0.23000", "-0.01000", "0.10000", "0.05000", "0.12000", "-0.00522",
"0.58000", "0.23000", "0.06000", "0.01000", "0.07000", "-0.11000",
"0.23000", "-0.03", "0.08", "0.09", "0.08", "0.11", "0.03", "-0.08",
"0.02", "-0.05", "0.06", "0.03000", "-0.06000", "0.09000", "0.00940",
"0.11000", "-0.09000", "0.04000", "-0.04000", "-0.09000", "0.01000",
"0.04000", "-0.02000", "0.21000", "0.27000", "0.08000", "0.12000",
"0.06000", "0.26000", "0.04000", "0.40000", "0.05000", "0.05000",
"0.00897", "0.09000", "0.20000", "0.09000", "0.13000", "-0.03000",
"-0.08000", "-0.01000", "0.050000", "0.020000", "0.050000", "-0.005390",
"0.020000", "0.008080", "0.060000", "-0.030000", "-0.020000",
"-0.000406", "0.50", "0.11", "0.06", "0.19", "0.21", "0.32",
"0.15", "0.17", "0.14", "0.03", "-0.08000", "-0.11000", "-0.07000",
"0.03000", "-0.04000", "0.02000", "-0.00444", "-0.07000", "-0.13000",
"-0.11000", "0.25000", "0.15000", "0.22000", "0.74000", "0.39000",
"0.36000", "-0.08000", "0.18000", "0.00865", "0.43000"), .Dim = c(10L,
17L), .Dimnames = list(c("Sumo2", "Cdc37", "Copb2", "Vcp", "Ube2d3",
"Becn1", "Lypla2", "Arf1", "Gdi2", "Copb1"), c("No", "Entrez",
"Symbol", "Probe_id", "AA_HD_24h_FoldChange", "AAF_HD_24h_FoldChange",
"APAP_HD_24h_FoldChange", "BBZ_HD_24h_FoldChange", "BCT_HD_24h_FoldChange",
"BEA_HD_24h_FoldChange", "CBP_HD_24h_FoldChange", "CCL4_HD_24h_FoldChange",
"CPA_HD_24h_FoldChange", "CSP_HD_24h_FoldChange", "DEN_HD_24h_FoldChange",
"LS_HD_24h_FoldChange", "PCT_HD_24h_FoldChange")))
And here I added the file I use for debugging. This is the data it reads into expr.table.
No Entrez Symbol Probe_id AA_HD_24h_FoldChange AAF_HD_24h_FoldChange APAP_HD_24h_FoldChange BBZ_HD_24h_FoldChange BCT_HD_24h_FoldChange BEA_HD_24h_FoldChange CBP_HD_24h_FoldChange CCL4_HD_24h_FoldChange CPA_HD_24h_FoldChange CSP_HD_24h_FoldChange DEN_HD_24h_FoldChange LS_HD_24h_FoldChange PCT_HD_24h_FoldChange
1 690244 Sumo2 1367452_at 1.02 0.19 0.26 0.59 0.05 -0.03 0.03 0.04 0.05 0.05 0.5 -0.08 0.25
2 114562 Cdc37 1367453_at -1.04 0.11 0.19 0.2 0.12 0.08 -0.06 -0.02 0.05 0.02 0.11 -0.11 0.15
3 60384 Copb2 1367454_at 1.03 -4.25E-003 -0.03 -0.16 -5.22E-003 0.09 0.09 0.21 8.97E-003 0.05 0.06 -0.07 0.22
4 116643 Vcp 1367455_at -0.12 0.52 0.35 0.28 0.58 0.08 9.40E-003 0.27 0.09 -5.39E-003 0.19 0.03 0.74
5 81920 Ube2d3 1367456_at -0.02 0.46 0.34 0.46 0.23 0.11 0.11 0.08 0.2 0.02 0.21 -0.04 0.39
6 114558 Becn1 1367457_at -0.03 0.42 0.07 -0.15 0.06 0.03 -0.09 0.12 0.09 8.08E-003 0.32 0.02 0.36
7 83510 Lypla2 1367458_at 0.09 0.2 1.56E-003 1.68E-003 0.01 -0.08 0.04 0.06 0.13 0.06 0.15 -4.44E-003 -0.08
8 64310 Arf1 1367459_at 0.05 0.05 0.12 0.23 0.07 0.02 -0.04 0.26 -0.03 -0.03 0.17 -0.07 0.18
9 29662 Gdi2 1367460_at -0.09 0.21 0.08 -0.01 -0.11 -0.05 -0.09 0.04 -0.08 -0.02 0.14 -0.13 8.65E-003
10 114023 Copb1 1367461_at 0.16 0.37 0.16 0.1 0.23 0.06 0.01 0.4 -0.01 -4.06E-004 0.03 -0.11 0.43
11 29156 Capns1 1367462_at -0.23 0.32 0.11 0.13 -0.38 -0.15 -0.08 0.15 -0.18 0.2 0.13 -0.18 0.09
12 114766 Phb2 1367463_at 1.01E-003 0.29 0.41 0.59 0.05 -0.07 -0.13 -0.18 -0.28 -0.21 -0.22 -0.2 0.39
13 84401 Puf60 1367464_at -0.05 0.33 0.14 0.3 0.03 0.02 8.96E-003 2.96E-003 -8.63E-003 -0.13 0.07 -0.15 0.44
14 192275 Dad1 1367465_at 0.22 -0.21 -0.19 -0.24 -0.47 -0.01 -0.09 0.68 -0.06 -0.08 0.02 -0.29 -0.25
401 116727 n/a 1367852_s_at -0.34 -0.12 -0.06 -0.11 0.13 0.03 0.07 -0.18 0.08 -0.2 0.04 -0.04 0.06
Rownames is filled with the gene symbols, e.g. Sumo2 for No 1.
What the script should do (and does) is change the name for entry No 401 from n/a to n/a (116727). However, the aforementioned warning occurs and I want to understand what's going on here.

I assume you are using a data.frame called foo. Under the hood, a data.frame is a list of vectors, each of the same length.
So foo[2] refers to the second column of foo as a data.frame, while foo[,2] refers to the second column of foo as a vector. rownames(foo) is a vector, and its second element is rownames(foo)[2].
If you want the second column of foo as a data.frame, you can use foo[2] or foo[,2,drop=FALSE]; print(rownames(foo[2])) will then give you the same result as print(rownames(foo)).
If you want the second row of foo as a data.frame, you need a comma, as in foo[2,]; print(rownames(foo[2,])) will then give you the same result as print(rownames(foo)[2]).
If you want to change the name of the second row of foo in the original foo dataframe then try something like:
rownames(foo)[2] = "example of new name for row 2"
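A short sketch of the indexing forms discussed above, using made-up toy objects (foo_df and foo_mat are hypothetical names). The str() output in the question suggests expr.table is actually a character matrix rather than a data.frame, so the matrix case is included as well: there, foo_mat[3] is a single element with no dimnames, which would explain the NULL. The last lines show one possible way to avoid the length-mismatch warning, by building the replacement only from the selected rows.

```r
# Data.frame case, as assumed in the answer.
foo_df <- data.frame(x = 1:4, y = 5:8, row.names = c("a", "b", "c", "d"))
rownames(foo_df[2])     # column 2 as a data.frame: "a" "b" "c" "d"
rownames(foo_df[2, ])   # row 2 as a data.frame: "b"

# Matrix case, mirroring the structure shown by str(expr.table).
foo_mat <- matrix(c("1", "2", "3", "10", "20", "30"), nrow = 3,
                  dimnames = list(c("Sumo2", "Dad1", "n/a"),
                                  c("No", "Entrez")))
rownames(foo_mat[3])    # NULL: a single element carries no dimnames
rownames(foo_mat)[3]    # "n/a": index the vector of row names instead

# Replacement with matching lengths: paste0 is vectorised, so the new
# names are built only for the rows selected by idx.
idx <- which(rownames(foo_mat) == "n/a")
rownames(foo_mat)[idx] <- paste0("n/a (", foo_mat[idx, "Entrez"], ")")
print(rownames(foo_mat))
```

The key distinction is where the index sits: rownames(foo)[i] subsets the names, while rownames(foo[i]) first subsets the object and then asks the result for its names.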

transpose 250,000 rows into columns in R

I always transpose by using the t(file) command in R.
But it is not running properly (not running at all) on a big data file (250,000 rows and 200 columns). Any ideas?
I need to calculate the correlation between the 2nd row (PTBP1) and all other rows (except 8 rows, including the header). In order to do this I transpose rows to columns and then use the cor function.
But I am stuck at the transpose step. Any help would be really appreciated!
I copied an example from one of the posts on Stack Overflow (they discuss almost the same problem, but there seems to be no answer yet!)
ID A B C D E F G H I [200 columns]
Row0$-1 0.08 0.47 0.94 0.33 0.08 0.93 0.72 0.51 0.55
Row02$1 0.37 0.87 0.72 0.96 0.20 0.55 0.35 0.73 0.44
Row03$ 0.19 0.71 0.52 0.73 0.03 0.18 0.13 0.13 0.30
Row04$- 0.08 0.77 0.89 0.12 0.39 0.18 0.74 0.61 0.57
Row05$- 0.09 0.60 0.73 0.65 0.43 0.21 0.27 0.52 0.60
Row06-$ 0.60 0.54 0.70 0.56 0.49 0.94 0.23 0.80 0.63
Row07$- 0.02 0.33 0.05 0.90 0.48 0.47 0.51 0.36 0.26
Row08$_ 0.34 0.96 0.37 0.06 0.20 0.14 0.84 0.28 0.47
........
250,000 rows

Use a matrix instead. The only advantage of a dataframe over a matrix is the capacity to have different classes in the columns and you clearly do not have that situation, since a transposed dataframe could not support such a result.

I don't get why you want to transpose the data.frame. If you just use cor, it doesn't matter if your data is in rows or columns.
Actually, it is one of the major advantages of R that it doesn't matter if your data fits the classical row-column pattern that SPSS and other programs require.
There are numerous ways to correlate the first row with all other rows (I don't get which rows you want to exclude). One is using a loop (here the loop is implicit in the call to one of the *apply family functions):
lapply(2:(dim(fn)[1]), function(x) cor(fn[1,],fn[x,]))
Note that I expect your data.frame to be called fn. To skip some rows, change the 2 to the number you want. Furthermore, I would probably use vapply here.
I hope this answer points you in the right direction, which is to not use t() if you don't absolutely need it.
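A minimal sketch combining both answers, assuming the numeric part of the data has already been converted to a matrix (here a small random matrix m stands in for it; m, row_cor and the sizes are made up for illustration). It shows the vapply form mentioned above and a vectorised variant that computes the Pearson correlation of one reference row against every row from centered row sums, without calling t().

```r
set.seed(1)
# Hypothetical stand-in for the real data: 6 rows, 10 numeric columns.
m <- matrix(runif(60), nrow = 6,
            dimnames = list(paste0("Row", 1:6), LETTERS[1:10]))

# Loop form: correlate row 1 with each remaining row; vapply returns a
# plain numeric vector instead of a list.
by_loop <- vapply(2:nrow(m), function(i) cor(m[1, ], m[i, ]), numeric(1))

# Vectorised form: Pearson correlation of row `ref` with every row.
row_cor <- function(m, ref = 1) {
  centered <- m - rowMeans(m)              # subtract each row's mean
  ref_c <- centered[ref, ]
  num <- as.vector(centered %*% ref_c)     # row-wise cross products
  den <- sqrt(rowSums(centered^2) * sum(ref_c^2))
  num / den
}

by_matrix <- row_cor(m, ref = 1)[-1]       # drop the reference row itself
print(round(by_matrix, 3))
```

Rows to be excluded can be dropped by negative indexing on m before either call. The vectorised form does a single matrix-vector product, which should scale better to 250,000 rows than one cor call per row, though that has not been timed here.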

Ordering Table A based on Rank of Table B in R

pretty newb question here, but I have not been able to track down a solution for some time:
I have an XTS object of trading indicators (indicate) for stock data that looks like
A XOM MSFT
2000-11-30 -0.59 0.22 0.10
2000-12-29 0.55 -0.23 0.05
2001-01-30 -0.52 0.09 -0.10
And a table with an identical index for the corresponding period returns (return) that looks like
A XOM MSFT
2000-11-30 -0.15 0.10 0.03
2000-12-29 0.03 -0.05 0.02
2001-01-30 -0.04 0.02 -0.05
I have sorted the indicator table and had it return the column name with the following code:
indicate.label <- colnames(indicate)
indicate.rank <- t(apply(indicate, 1, function(x) indicate.label[order(-x)]))
indicate.rank <- xts(indicate.rank, order.by = index(returns))
Which gives the table (indicate.rank) of the symbol names ranked by their trading indicator:
1 2 3
2000-11-30 XOM MSFT A
2000-12-29 A MSFT XOM
2001-01-30 XOM A MSFT
I would like to also have a table that gives the period returns based on the indicator rank:
2000-11-30 0.10 0.03 -0.15
2000-12-29 0.03 0.02 -0.05
2001-01-30 0.02 -0.04 -0.05
I cannot figure out how to call the correct symbol for all rows or just sort the table return based on the order of indicate.
Thank you for any suggestions.
Trevor J

I'm not particularly satisfied with this solution, but it works.
row.rank <- t(apply(indicate, 1, order, decreasing=TRUE))
indicate.rank <- return.rank <- indicate # pre-allocate
for(i in 1:NROW(indicate.rank)) {
  indicate.rank[i,] <- colnames(indicate)[row.rank[i,]]
  return.rank[i,] <- return[i,row.rank[i,]]
}
It would probably be easier to handle this if the returns and the indicators for each symbol were in the same object, but I don't know how that would fit with the rest of your workflow.
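The loop above can also be written without an explicit loop by indexing with a two-column (row, column) matrix. This is only a sketch under assumptions: it uses plain numeric matrices built from the sample values in the question instead of xts objects, and the names returns, idx and return.rank are chosen for illustration.

```r
dates <- c("2000-11-30", "2000-12-29", "2001-01-30")
syms  <- c("A", "XOM", "MSFT")

# Sample values from the question, entered column by column.
indicate <- matrix(c(-0.59, 0.55, -0.52,
                      0.22, -0.23, 0.09,
                      0.10, 0.05, -0.10),
                   nrow = 3, dimnames = list(dates, syms))
returns <- matrix(c(-0.15, 0.03, -0.04,
                     0.10, -0.05, 0.02,
                     0.03, 0.02, -0.05),
                  nrow = 3, dimnames = list(dates, syms))

# For each date, the column positions sorted by decreasing indicator.
row.rank <- t(apply(indicate, 1, order, decreasing = TRUE))

# Pair every row number with the column position chosen for it, then pull
# all values in one indexing call and reshape to the original layout.
idx <- cbind(as.vector(row(row.rank)), as.vector(row.rank))
return.rank <- matrix(returns[idx], nrow = nrow(returns),
                      dimnames = list(dates, NULL))
indicate.rank <- matrix(syms[row.rank], nrow = nrow(indicate),
                        dimnames = list(dates, NULL))
print(return.rank)
```

With xts inputs, the same indexing could be applied to the underlying matrix (for example via coredata()) and the result wrapped back into an xts object with the original index; that step is left out here to keep the sketch free of package dependencies.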
