It sounds stupid, but I could not find a proper way to read this text file.
I have tried the read.table and fread functions, but with no success: the columns do not match the data:
m = fread(meq,fill = T,sep = " ")
m = read.table(meq,fill = T,comment.char="-",sep = "")
That file contains some metadata at the top and isn't in a standard format that can easily be parsed into a data frame. One solution is to read it in as a character vector, do some manipulations, and then read the cleaned result back in:
meq <- "LS_FLS_YUN_IECED3_Tower_100pctAv.meq"
lines <- readLines(meq)
lines <- lines[-(1:5)]
lines <- gsub("\\|", "", lines)
lines <- gsub(" +", " ", lines)
file <- tempfile()
writeLines(lines, file)
data.table::fread(file, sep = " ", fill = TRUE)
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
# 1: 1 t 279.1 307.1 351.1 429.2 524.8 539.1 540.5 550.3 558.5 NA
# 2: 2 v_hor_size 6.6 6.8 7.3 9.6 11.8 13.3 13.4 14.9 16.4 NA
# 3: 3 Elevation 4.1 4.1 4.3 5.1 6.2 6.7 6.7 7.3 7.9 NA
# 4: 4 Mf_x e.1n1 43.8 61.4 106.6 270.0 330.2 447.4 461.3 573.9 689.9
# 5: 5 Mf_y e.1n1 107.1 148.0 236.8 493.8 603.8 746.3 762.0 881.7 994.3
# ---
#766: 766 Mt_030 e13n2 5694.5 5524.4 5559.4 6850.9 8377.6 9381.7 9510.6 10592.5 11771.5
#767: 767 Mt_060 e13n2 9223.2 8757.3 8448.3 9210.8 11263.4 11821.5 11901.7 12606.0 13398.2
#768: 768 Mt_090 e13n2 11582.0 10912.8 10380.1 10898.0 13326.5 13686.1 13745.4 14298.0 14960.4
#769: 769 Mt_120 e13n2 11658.8 11015.8 10529.9 11142.4 13625.4 13989.1 14046.3 14564.8 15166.2
#770: 770 Mt_150 e13n2 9386.4 8973.3 8741.8 9551.2 11679.6 12116.0 12177.5 12712.0 13312.7
This is my solution:
read.table(text = mgsub::mgsub(readLines(meq), c("\\|", " e", " e"), c("", "e", "e")),
           fill = TRUE, comment.char = "-", sep = "", na.strings = "",
           stringsAsFactors = FALSE, skip = 2)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
1 W<f6>hler slope: 3.5 4.0 5.0 8.0 8.0 10.0 10.25 12.4 14.95
2 Reference cycles: 10000000.0 10000000.0 10000000.0 10000000.0 2000000.0 2000000.0 2000000.00 2000000.0 2000000.00
3 1 t 279.1 307.1 351.1 429.2 524.8 539.1 540.50 550.3 558.50
4 2 v_hor_size 6.6 6.8 7.3 9.6 11.8 13.3 13.40 14.9 16.40
5 3 Elevation 4.1 4.1 4.3 5.1 6.2 6.7 6.70 7.3 7.90
6 4 Mf_xe.1n1 43.8 61.4 106.6 270.0 330.2 447.4 461.30 573.9 689.90
7 5 Mf_ye.1n1 107.1 148.0 236.8 493.8 603.8 746.3 762.00 881.7 994.30
8 6 Mf_xye.1n1 71.7 98.6 156.9 324.7 397.0 488.6 498.60 574.4 644.80
9 7 Mf_000e.1n1 107.1 148.0 236.8 493.8 603.8 746.3 762.00 881.7 994.30
10 8 Mf_030e.1n1 99.6 137.2 219.7 460.4 563.0 698.3 713.30 828.3 938.10
11 9 Mf_060e.1n1 72.4 100.6 165.6 371.9 454.8 581.1 595.50 706.6 814.60
12 10 Mf_090e.1n1 43.8 61.4 106.6 270.0 330.2 447.4 461.30 573.9 689.90
13 11 Mf_120e.1n1 56.1 78.5 130.2 302.2 369.5 486.4 500.10 610.4 722.30
14 12 Mf_150e.1n1 90.6 125.8 202.9 429.6 525.4 653.7 668.00 777.4 882.00
15 13 Mf_xe.2n1 2591.1 2521.6 2510.2 2854.9 3491.1 3777.4 3819.20 4199.4 4652.10
16 14 Mf_ye.2n1 1407.2 1385.4 1606.3 2993.9 3661.1 4509.0 4603.60 5327.9 6016.10
17 15 Mf_xye.2n1 2337.5 2239.9 2157.8 2262.2 2766.3 3002.7 3045.10 3432.6 3845.10
18 16 Mf_000e.2n1 1407.2 1385.4 1606.3 2993.9 3661.1 4509.0 4603.60 5327.9 6016.10
19 17 Mf_030e.2n1 1692.9 1668.5 1798.8 2846.6 3480.9 4238.0 4325.60 5008.6 5674.50
20 18 Mf_060e.2n1 2298.1 2247.6 2275.5 2781.8 3401.7 3849.5 3909.90 4430.1 5003.00
21 19 Mf_090e.2n1 2591.1 2521.6 2510.2 2854.9 3491.1 3777.4 3819.20 4199.4 4652.10
22 20 Mf_120e.2n1 2410.8 2334.5 2300.9 2561.2 3132.0 3411.9 3457.50 3902.0 4461.90
23 21 Mf_150e.2n1 1862.8 1799.9 1810.7 2631.1 3217.5 3955.5 4040.70 4700.3 5339.30
24 22 Mf_xe.3n1 6414.2 6261.0 6252.6 7146.9 8739.6 9466.1 9570.50 10506.0 11580.60
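The cleanup step can be sketched with base R's gsub alone and tested without the file itself; the synthetic lines below stand in for readLines(meq), and the pipe separators plus the space before the "e.1n1"-style element labels are assumptions about the file's layout:

```r
# Synthetic lines standing in for readLines(meq)
lines <- c("Header line 1", "Header line 2",
           "4  Mf_x e.1n1 |  43.8 |  61.4",
           "5  Mf_y e.1n1 | 107.1 | 148.0")
lines <- gsub("\\|", "", lines)      # drop the pipe separators
lines <- gsub(" e\\.", "e.", lines)  # glue "Mf_x e.1n1" into "Mf_xe.1n1"
df <- read.table(text = lines, fill = TRUE, na.strings = "",
                 stringsAsFactors = FALSE, skip = 2)
df
#   V1        V2    V3    V4
# 1  4 Mf_xe.1n1  43.8  61.4
# 2  5 Mf_ye.1n1 107.1 148.0
```

Gluing the label fragments together before parsing is what keeps the row names and numeric columns aligned.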
I'm a bit new to R and want to remove a column from a matrix by the name of that column. I know that X[, 2] gives the second column and X[, -2] gives every column except the second one. What I really want to know is whether there's a similar command using column names. I've got a matrix and want to remove the "sales" column, but X[, -"sales"] doesn't work. How should I do this? I would use the column number, except that I want to reuse the code on other matrices later, which have different dimensions. Any help would be much appreciated.
I'm not sure why all the answers are solutions for data frames and not matrices.
Per @Sotos's and @Moody_Mudskipper's comments, here is an example with the built-in state.x77 data matrix.
dat <- head(state.x77)
dat
#> Population Income Illiteracy Life Exp Murder HS Grad Frost Area
#> Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
#> Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
#> Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
#> Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
#> California 21198 5114 1.1 71.71 10.3 62.6 20 156361
#> Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
# for removing one column
dat[, colnames(dat) != "Area"]
#> Population Income Illiteracy Life Exp Murder HS Grad Frost
#> Alabama 3615 3624 2.1 69.05 15.1 41.3 20
#> Alaska 365 6315 1.5 69.31 11.3 66.7 152
#> Arizona 2212 4530 1.8 70.55 7.8 58.1 15
#> Arkansas 2110 3378 1.9 70.66 10.1 39.9 65
#> California 21198 5114 1.1 71.71 10.3 62.6 20
#> Colorado 2541 4884 0.7 72.06 6.8 63.9 166
# for removing more than one column
dat[, !colnames(dat) %in% c("Area", "Life Exp")]
#> Population Income Illiteracy Murder HS Grad Frost
#> Alabama 3615 3624 2.1 15.1 41.3 20
#> Alaska 365 6315 1.5 11.3 66.7 152
#> Arizona 2212 4530 1.8 7.8 58.1 15
#> Arkansas 2110 3378 1.9 10.1 39.9 65
#> California 21198 5114 1.1 10.3 62.6 20
#> Colorado 2541 4884 0.7 6.8 63.9 166
# be sure to use `colnames` and not `names`
names(state.x77)
#> NULL
Created on 2020-06-27 by the reprex package (v0.3.0)
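A related pattern uses negative indexing via which(), but it has a sharp edge worth knowing about; this sketch on the same state.x77 data shows both the pattern and a small guard (drop_cols is a hypothetical helper name):

```r
dat <- head(state.x77)

# negative indexing via which() works when the name exists...
dat2 <- dat[, -which(colnames(dat) == "Area")]
colnames(dat2)
# ...but if the name is absent, which() returns integer(0), and
# dat[, -integer(0)] silently drops EVERY column. A small guard:
drop_cols <- function(m, nms) m[, !colnames(m) %in% nms, drop = FALSE]
colnames(drop_cols(dat, "Area"))
```

This is why the logical-comparison form shown above (colnames(dat) != "Area") is the safer default.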
My favorite way:
# create data
df <- data.frame(x = runif(100),
                 y = runif(100),
                 remove_me = runif(100),
                 remove_me_too = runif(100))
# remove column
df <- df[,!names(df) %in% c("remove_me", "remove_me_too")]
so this dataframe:
> df
x y remove_me remove_me_too
1 0.731124508 0.535219259 0.33209113 0.736142042
2 0.612017350 0.404128030 0.84923974 0.624543223
3 0.415403559 0.369818154 0.53817387 0.661263087
4 0.199780006 0.679946936 0.58782429 0.085624708
5 0.343304259 0.892128112 0.02827132 0.038203599
becomes this:
> df
x y
1 0.731124508 0.535219259
2 0.612017350 0.404128030
3 0.415403559 0.369818154
4 0.199780006 0.679946936
5 0.343304259 0.892128112
As always in R, there are many potential solutions. You can use the dplyr package and select() to easily remove or select columns in a data frame.
df <- data.frame(x = runif(100),
                 y = runif(100),
                 remove_me = runif(100),
                 remove_me_too = runif(100))
library(dplyr)
select(df, -remove_me, -remove_me_too) %>% head()
#> x y
#> 1 0.35113636 0.134590652
#> 2 0.72545356 0.165608839
#> 3 0.81000067 0.090696049
#> 4 0.29882204 0.004602398
#> 5 0.93492918 0.256870750
#> 6 0.03007377 0.395614901
You can read more about dplyr and its verbs in the dplyr documentation.
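A more defensive variant worth knowing: select() errors if a named column is missing, while any_of() (from tidyselect, re-exported by dplyr) silently skips absent names. A small sketch:

```r
library(dplyr)

df <- data.frame(x = runif(5), y = runif(5), remove_me = runif(5))

# select(df, -remove_me, -remove_me_too)  # would error: column doesn't exist
# drops whichever of the names are present, ignores the rest:
df2 <- select(df, -any_of(c("remove_me", "remove_me_too")))
names(df2)
# [1] "x" "y"
```

This makes the removal robust when the same code runs against data frames that may or may not contain all the columns.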
As a general case, if you remove so many columns that only one column remains, R will drop the dimension and return a vector. You can prevent this by setting drop = FALSE.
(df <- data.frame(x = runif(6),
                  y = runif(6),
                  remove_me = runif(6),
                  remove_me_too = runif(6)))
# x y remove_me remove_me_too
# 1 0.4839869 0.18672217 0.0973506 0.72310641
# 2 0.2467426 0.37950878 0.2472324 0.80133920
# 3 0.4449471 0.58542547 0.8185943 0.57900456
# 4 0.9119014 0.12089776 0.2153147 0.05584816
# 5 0.4979701 0.04890334 0.7420666 0.44906667
# 6 0.3266374 0.37110822 0.6809380 0.29091746
df[, -c(3, 4)]
# x y
# 1 0.4839869 0.18672217
# 2 0.2467426 0.37950878
# 3 0.4449471 0.58542547
# 4 0.9119014 0.12089776
# 5 0.4979701 0.04890334
# 6 0.3266374 0.37110822
# Result is a numeric vector
df[, -c(2, 3, 4)]
# [1] 0.4839869 0.2467426 0.4449471 0.9119014 0.4979701 0.3266374
# Keep the data frame type
df[, -c(2, 3, 4), drop = FALSE]
# x
# 1 0.4839869
# 2 0.2467426
# 3 0.4449471
# 4 0.9119014
# 5 0.4979701
# 6 0.3266374
I need to generate bins from a data.frame based on the values of one column. I have tried the cut function.
For example, I want to bin the air temperature values in the column "AirTDay" of a data frame:
AirTDay (oC)
8.16
10.88
5.28
19.82
23.62
13.14
28.84
32.21
17.44
31.21
I need the bin intervals to cover all values in steps of 2 degrees centigrade starting from the initial value (i.e. 8-9.99, 10-11.99, 12-13.99, ...), to be labelled with the average value of the range (i.e. 9.5, 10.5, 12.5, ...), and to respect blank cells, returning NA in the bins column.
The output should look as:
Air_T (oC) TBins
8.16 8.5
10.88 10.5
5.28 NA
NA
19.82 20.5
23.62 24.5
13.14 14.5
NA
NA
28.84 28.5
32.21 32.5
17.44 18.5
31.21 32.5
I've gotten as far as:
setwd('C:/Users/xxx')
temp_data <- read.csv("temperature.csv", sep = ",", header = TRUE)
TAir <- temp_data$AirTDay
Tmin <- round(min(TAir, na.rm = FALSE), digits = 0) # start at minimum value
Tmax <- round(max(TAir, na.rm = FALSE), digits = 0)
int <- 2 # bin ranges 2 degrees
mean_int <- int/2
int_range <- seq(Tmin, Tmax + int, int) # generate bin sequence
bin_label <- seq(Tmin + mean_int, Tmax + mean_int, int) # generate labels
temp_data$TBins <- cut(TAir, breaks = int_range, ordered_result = FALSE, labels = bin_label)
The output table looks correct, but for some reason it shows an additional sequential column, shifts the column names, and collapses all values, eliminating blank cells. Something like this:
Air_T (oC) TBins
1 8.16 8.5
2 10.88 10.5
3 5.28 NA
4 19.82 20.5
5 23.62 24.5
6 13.14 14.5
7 28.84 28.5
8 32.21 32.5
9 17.44 18.5
10 31.21 32.5
Any ideas on where I am failing and how to solve it?
v <- ceiling(max(dat$V1, na.rm = TRUE))
breaks <- seq(8, v, 2)
labels <- seq(8.5, length.out = length(breaks) - 1, by = 2)
transform(dat, Tbins = cut(V1, breaks, labels))
V1 Tbins
1 8.16 8.5
2 10.88 10.5
3 5.28 <NA>
4 NA <NA>
5 19.82 18.5
6 23.62 22.5
7 13.14 12.5
8 NA <NA>
9 NA <NA>
10 28.84 28.5
11 32.21 <NA>
12 17.44 16.5
13 31.21 30.5
This result follows the logic given: we have
paste(seq(8,v,2),seq(9.99,v,by=2),sep="-")
[1] "8-9.99" "10-11.99" "12-13.99" "14-15.99" "16-17.99" "18-19.99" "20-21.99"
[8] "22-23.99" "24-25.99" "26-27.99" "28-29.99" "30-31.99"
From this we can tell that 19.82 lies between 18 and 20 and is thus given the value 18.5, just as 10.88, being between 10 and 11.99, is assigned the value 10.5.
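The hard-coded lower bound of 8 can also be derived from the data; here is a sketch generalizing the approach above, using true interval midpoints as labels (an assumption about the labelling rule; swap in whichever rule you prefer):

```r
x <- c(8.16, 10.88, 5.28, NA, 19.82, 23.62, 32.21)
lo <- 2 * floor(min(x, na.rm = TRUE) / 2)    # round down to an even lower bound: 4
hi <- 2 * ceiling(max(x, na.rm = TRUE) / 2)  # round up to an even upper bound: 34
breaks <- seq(lo, hi, by = 2)
mids   <- head(breaks, -1) + 1               # midpoint labels: 5, 7, 9, ...
bins   <- cut(x, breaks, labels = mids)
bins
```

Because the breaks now span the full data range, no non-missing value falls outside a bin, and NA inputs still yield NA bins.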
I am trying to calculate the pseudo-median (Hodges-Lehmann estimator) using the wilcox.test function in R, across eight separate columns in my dataset.
> head(fantasyproj2)
V1 V2 V3 V4 V5 V6 V7 V8
1 25 25.87 20.35 26.65 27.20 27.0 17.970 27.70
2 24 27.48 19.81 19.57 22.20 27.5 16.350 20.04
3 22 19.89 17.62 21.99 19.12 26.0 22.484 23.70
4 21 17.72 16.09 15.55 18.60 18.5 17.450 14.59
5 21 21.56 17.90 18.46 20.80 23.0 16.540 19.76
6 20 NA 17.73 15.84 19.00 20.5 19.080 15.05
They are all numeric and include some missing values (cells with no data, which have been changed to NA in R). When I run the code:
fantasyproj$hodges.lehmann <- apply(
  fantasyproj2[, 1:8], 1,
  function(x) wilcox.test(x, conf.int = TRUE, na.action = na.exclude)$estimate
)
I get the error:
Error in uniroot(wdiff, c(mumin, mumax), tol = 1e-04, zq = qnorm(alpha/2, : f() values at end points not of opposite sign
There is not much literature on this error out there, except where it says to add the argument exact = TRUE to the call, but that does not help. Any help would be greatly appreciated!
I would like to generate a summary of a histogram in table format. With plot = FALSE, I am able to get the histogram object:
> hist(y,plot=FALSE)
$breaks
[1] 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8
$counts
[1] 48 1339 20454 893070 1045286 24284 518 171 148
[10] 94 42 42 37 25 18 21 14 5
$density
[1] 0.00012086929 0.00337174962 0.05150542703 2.24884871999 2.63214538964
[6] 0.06114978928 0.00130438111 0.00043059685 0.00037268032 0.00023670236
[11] 0.00010576063 0.00010576063 0.00009317008 0.00006295276 0.00004532598
[16] 0.00005288032 0.00003525354 0.00001259055
$mids
[1] 0.3 0.5 0.7 0.9 1.1 1.3 1.5 1.7 1.9 2.1 2.3 2.5 2.7 2.9 3.1 3.3 3.5 3.7
$xname
[1] "y"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
Is there a way to summarize this object like a Pareto chart summary? (The summary below is for different data; I am including it as an example of the format.)
Pareto chart analysis for counts
Frequency Cum.Freq. Percentage Cum.Percent.
c 2294652 2294652 33.689225770 33.68923
f 1605467 3900119 23.570868362 57.26009
g 896893 4797012 13.167848880 70.42794
i 464220 5261232 6.815505091 77.24345
b 365399 5626631 5.364651985 82.60810
j 332239 5958870 4.877809219 87.48591
h 215313 6174183 3.161145249 90.64705
l 129871 6304054 1.906717637 92.55377
e 107001 6411055 1.570948818 94.12472
k 104954 6516009 1.540895526 95.66562
d 103648 6619657 1.521721321 97.18734
m 56172 6675829 0.824696377 98.01203
o 51093 6726922 0.750128391 98.76216
n 49320 6776242 0.724097865 99.48626
p 32321 6808563 0.474524881 99.96079
q 1334 6809897 0.019585291 99.98037
r 620 6810517 0.009102609 99.98947
s 247 6810764 0.003626362 99.99310
u 182 6810946 0.002672056 99.99577
t 162 6811108 0.002378424 99.99815
z 126 6811234 0.001849885 100.00000
You can write a wrapper function that will convert the relevant parts of the hist output into a data.frame:
myfun <- function(x) {
  h <- hist(x, plot = FALSE)
  data.frame(Frequency = h$counts,
             Cum.Freq = cumsum(h$counts),
             Percentage = h$density / sum(h$density),
             Cum.Percent = cumsum(h$density) / sum(h$density))
}
Here's an example on the built-in iris dataset:
myfun(iris$Sepal.Width)
# Frequency Cum.Freq Percentage Cum.Percent
# 1 4 4 0.026666667 0.02666667
# 2 7 11 0.046666667 0.07333333
# 3 13 24 0.086666667 0.16000000
# 4 23 47 0.153333333 0.31333333
# 5 36 83 0.240000000 0.55333333
# 6 24 107 0.160000000 0.71333333
# 7 18 125 0.120000000 0.83333333
# 8 10 135 0.066666667 0.90000000
# 9 9 144 0.060000000 0.96000000
# 10 3 147 0.020000000 0.98000000
# 11 2 149 0.013333333 0.99333333
# 12 1 150 0.006666667 1.00000000
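If you also want the bin intervals shown next to each row, they can be rebuilt from h$breaks; a sketch extending the wrapper above (myfun2 is a hypothetical name):

```r
myfun2 <- function(x) {
  h <- hist(x, plot = FALSE)
  # pair each lower break with the next break to label the interval
  data.frame(Bin       = paste(head(h$breaks, -1), h$breaks[-1], sep = "-"),
             Frequency = h$counts,
             Cum.Freq  = cumsum(h$counts))
}
out <- myfun2(iris$Sepal.Width)
head(out, 3)
```

The interval labels make the table self-describing, much like the letter categories in the Pareto summary shown above.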
I have data that looks like this.
Name|ID|p72|p78|p51|p49|c36.1|c32.1|c32.2|c36.2|c37
hsa-let-7a-5p|MIMAT0000062|9.1|38|12.7|185|8|4.53333333333333|17.9|23|63.3
hsa-let-7b-5p|MIMAT0000063|11.3|58.6|27.5|165.6|20.4|8.5|21|30.2|92.6
hsa-let-7c|MIMAT0000064|7.8|40.2|9.6|147.8|11.8|4.53333333333333|15.4|17.7|62.3
hsa-let-7d-5p|MIMAT0000065|4.53333333333333|27.7|13.4|158.1|8.5|4.53333333333333|14.2|13.5|50.5
hsa-let-7e-5p|MIMAT0000066|6.2|4.53333333333333|4.53333333333333|28|4.53333333333333|4.53333333333333|5.6|4.7|12.8
hsa-let-7f-5p|MIMAT0000067|4.53333333333333|4.53333333333333|4.53333333333333|78.2|4.53333333333333|4.53333333333333|6.8|4.53333333333333|8.9
hsa-miR-15a-5p|MIMAT0000068|4.53333333333333|70.3|10.3|147.6|4.53333333333333|4.53333333333333|21.1|30.2|100.8
hsa-miR-16-5p|MIMAT0000069|9.5|562.6|60.5|757|25.1|4.53333333333333|89.4|142.9|613.9
hsa-miR-17-5p|MIMAT0000070|10.5|71.6|27.4|335.1|6.3|10.1|51|51|187.1
hsa-miR-17-3p|MIMAT0000071|4.53333333333333|4.53333333333333|4.53333333333333|17.2|4.53333333333333|4.53333333333333|9.5|4.53333333333333|7.3
hsa-miR-18a-5p|MIMAT0000072|4.53333333333333|14.6|4.53333333333333|53.4|4.53333333333333|4.53333333333333|9.5|25.5|29.7
hsa-miR-19a-3p|MIMAT0000073|4.53333333333333|11.6|4.53333333333333|42.8|4.53333333333333|4.53333333333333|4.53333333333333|5.5|17.9
hsa-miR-19b-3p|MIMAT0000074|8.3|93.3|15.8|248.3|4.53333333333333|6.3|44.7|53.2|135
hsa-miR-20a-5p|MIMAT0000075|4.53333333333333|75.2|23.4|255.7|6.6|4.53333333333333|43.8|38|130.3
hsa-miR-21-5p|MIMAT0000076|6.2|19.7|18|299.5|6.8|4.53333333333333|49.9|68.5|48
hsa-miR-22-3p|MIMAT0000077|40.4|128.4|65.4|547.1|56.5|33.4|104.9|84.1|248.3
hsa-miR-23a-3p|MIMAT0000078|58.3|99.3|58.6|617.9|36.6|21.4|107.1|125.5|120.9
hsa-miR-24-1-5p|MIMAT0000079|4.53333333333333|4.53333333333333|4.53333333333333|9.2|4.53333333333333|4.53333333333333|4.53333333333333|4.9|4.53333333333333
hsa-miR-24-3p|MIMAT0000080|638.2|286.9|379.5|394.4|307.8|240.4|186|234.2|564
What I want to do is simply pick rows where all the values are greater than 10.
The data clearly shows that more rows satisfy this condition.
> dat<-read.delim("http://dpaste.com/1215552/plain/",sep="|",na.strings="",header=TRUE,blank.lines.skip=TRUE,fill=FALSE)
But why does my code report only the last one?
> dat[apply(dat[, -1], MARGIN = 1, function(x) all(x > 10)), ]
Name ID p72 p78 p51 p49 c36.1 c32.1 c32.2 c36.2 c37
19 hsa-miR-24-3p MIMAT0000080 638.2 286.9 379.5 394.4 307.8 240.4 186 234.2 564
What is the right way to do it?
Update:
alexwhan's solution works, but I wonder how I can generalize his approach so that it can handle data with missing values (NA):
dat<-read.delim("http://dpaste.com/1215354/plain/",sep="\t",na.strings="",header=FALSE,blank.lines.skip=TRUE,fill=FALSE)
Since you're including your ID column (which is a factor) in the all(), it's getting messed up. Try:
dat[apply(dat[, -c(1,2)], MARGIN = 1, function(x) all(x > 10)), ]
# Name ID p72 p78 p51 p49 c36.1 c32.1 c32.2 c36.2 c37
# 16 hsa-miR-22-3p MIMAT0000077 40.4 128.4 65.4 547.1 56.5 33.4 104.9 84.1 248.3
# 17 hsa-miR-23a-3p MIMAT0000078 58.3 99.3 58.6 617.9 36.6 21.4 107.1 125.5 120.9
# 19 hsa-miR-24-3p MIMAT0000080 638.2 286.9 379.5 394.4 307.8 240.4 186.0 234.2 564.0
EDIT
For the case where you have NA, you can just use the na.rm argument of all(). Using your new data (from the comment):
dat<-read.delim("http://dpaste.com/1215354/plain/",sep="\t",na.strings="",header=FALSE,blank.lines.skip=TRUE,fill=FALSE)
dat[apply(dat[, -c(1,2)], MARGIN = 1, function(x) all(x > 10, na.rm = T)), ]
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
# 7 hsa-miR-15a-5p MIMAT0000068 NA 70.3 10.3 147.6 NA NA 21.1 30.2 100.8
# 16 hsa-miR-22-3p MIMAT0000077 40.4 128.4 65.4 547.1 56.5 33.4 104.9 84.1 248.3
# 17 hsa-miR-23a-3p MIMAT0000078 58.3 99.3 58.6 617.9 36.6 21.4 107.1 125.5 120.9
# 19 hsa-miR-24-3p MIMAT0000080 638.2 286.9 379.5 394.4 307.8 240.4 186.0 234.2 564.0
# 20 hsa-miR-25-3p MIMAT0000081 19.3 78.6 25.6 84.3 14.9 16.9 19.1 27.2 113.8
# 21 hsa-miR-26a-5p MIMAT0000082 NA 22.8 31.0 561.2 12.4 NA 67.0 55.8 48.9
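An apply-free alternative uses rowSums() on the logical matrix: counting the values that are NOT greater than 10 avoids the row-wise loop, and na.rm gives the same NA tolerance as above. A sketch on a toy frame with the same shape (the values are made up):

```r
dat <- data.frame(Name = c("a", "b", "c"),
                  ID   = c("id1", "id2", "id3"),
                  p72  = c(40.4, 9.1, NA),
                  p78  = c(128.4, 38.0, 70.3))
# keep rows where no value is <= 10; NAs are ignored, as with all(..., na.rm = TRUE)
keep <- rowSums(dat[, -c(1, 2)] <= 10, na.rm = TRUE) == 0
dat[keep, ]
```

rowSums works on the whole logical matrix at once, which is typically faster than apply on wide data.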
Another idea is to transform your data to long ("molten") format. I think it is even better for avoiding the missing-values problem:
library(reshape2)
dat.m <- melt(dat,id.vars=c('Name','ID'))
dat.m$value <- as.numeric(dat.m$value)
library(plyr)
res <- ddply(dat.m,.(Name,ID), summarise, keepme = all(value > 10))
res[res$keepme,]
# Name ID keepme
# 16 hsa-miR-22-3p MIMAT0000077 TRUE
# 17 hsa-miR-23a-3p MIMAT0000078 TRUE
# 19 hsa-miR-24-3p MIMAT0000080 TRUE
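In current dplyr the same row filter can be written without reshaping at all, using if_all(); a sketch assuming a recent dplyr (if_all was added around version 1.0.4), again on made-up toy data:

```r
library(dplyr)

dat <- data.frame(Name = c("a", "b"), ID = c("id1", "id2"),
                  p72  = c(40.4, 9.1), p78 = c(128.4, 38.0))

# keep rows where every non-ID column exceeds 10
res <- dat %>% filter(if_all(-c(Name, ID), ~ .x > 10))
res
```

Note that filter() drops rows where the predicate is NA; to tolerate missing values like the na.rm variant above, use a predicate such as `~ is.na(.x) | .x > 10`.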