reading gctx file in R

I am trying to read a gctx file extracted from the LINCS source for gene expression analysis. The code for reading the file is provided at the link below:
https://github.com/cmap/l1ktools
I have sourced the provided script. However, when I call the function parse.gctx, it gives me the following error:
ds <- parse.gctx("../L1000 Data/zspc_n40172x22268.gctx")
reading ../L1000 Data/zspc_n40172x22268.gctx
Error in h5checktypeOrOpenLoc(file, readonly = TRUE) :
Error in h5checktypeOrOpenLoc(). Cannot open file. File 'C:\L1000 Data\zspc_n40172x22268.gctx' does not exist.
How can I resolve this issue and read my gctx file?

Since you're getting a 'file does not exist' error, I think the problem is the space in the path to the file you're trying to read (specifically, in "L1000 Data"). If you remove the space from the path, it should parse properly.
In other words, try renaming your "L1000 Data" folder so that instead of:
ds <- parse.gctx("../L1000 Data/zspc_n40172x22268.gctx")
you have something along the lines of:
ds <- parse.gctx("../L1000_Data/zspc_n40172x22268.gctx")
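
Before or after renaming, it may help to confirm where the relative path actually points. The error message shows "../L1000 Data" resolving to C:\L1000 Data, so it is worth checking that the folder really sits one level above the working directory. Below is a minimal sketch in base R; `check_path` is a hypothetical helper name (not part of l1ktools), and the folder and file are temporary stand-ins created only for the demonstration.

```r
# Hypothetical helper (not part of l1ktools): report how R resolves a path
# and whether a file exists at that location.
check_path <- function(path) {
  list(
    working_dir = getwd(),
    resolved = normalizePath(path, winslash = "/", mustWork = FALSE),
    exists = file.exists(path)
  )
}

# Stand-in folder and empty placeholder file, created in a temporary location.
demo_dir <- file.path(tempdir(), "L1000_Data")
dir.create(demo_dir, showWarnings = FALSE)
demo_file <- file.path(demo_dir, "example.gctx")
file.create(demo_file)

found <- check_path(demo_file)
missing <- check_path(file.path(demo_dir, "no_such_file.gctx"))

print(found$exists)
print(missing$exists)
```

If `exists` is FALSE for your own path, comparing `working_dir` and `resolved` with the real location of the file should show which part of the path is off.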

Related

How to get past the following error "Error in readLines(filestocopy) : 'con' is not a connection"?

I am new to coding and very new to this forum, so I hope my request makes sense.
I am trying to select images listed in a .csv file and copy them to a new folder. The pictures and the .csv file are both in the folder GRA04. The .csv file contains only one column with the picture names.
I used the following code:
#set working directory
setwd("E:/2019/GRA04")
#create and identify a new folder in R
targetdir <- dir.create("GRA04_age")
#find the files you want to copy
filestocopy <- read.csv("age.csv", header=FALSE) #read csv as data table (only one column, each row being a file name)
filestocopy_v <- readLines(filestocopy) #convert data table to character vector
filestocopy_v #shows the character vector
#copy the files to the new folder
file.copy(filestocopy_v, targetdir, recursive = TRUE)
When reaching the line
filestocopy_v <- readLines(filestocopy)
I get this error message:
Error in readLines(filestocopy) : 'con' is not a connection
I looked online for solutions with no luck. I ran this code before (or else something similar... didn't back it up...) and it worked fine, so I am not sure what is happening...
Thanks!

Out of interest, would the following now do what you're trying to achieve?
filestocopy_v <- filestocopy[[1]]
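
For context, readLines() expects a file name or connection, whereas read.csv() returns a data frame, which is why the error mentions 'con'. Note also that dir.create() returns TRUE or FALSE rather than the folder path, so targetdir in the question would not hold a path. The following is a minimal sketch of the whole workflow under those observations; it runs in a temporary folder, and the image names and age.csv contents are made up for illustration.

```r
# Work in a throwaway folder so the sketch is self-contained; the image
# names and the contents of age.csv are made up for illustration.
workdir <- file.path(tempdir(), "GRA04_demo")
dir.create(workdir, showWarnings = FALSE)
file.create(file.path(workdir, c("img1.jpg", "img2.jpg", "img3.jpg")))
writeLines(c("img1.jpg", "img3.jpg"), file.path(workdir, "age.csv"))

# Keep the folder path in its own variable: dir.create() returns TRUE/FALSE,
# not the path.
targetdir <- file.path(workdir, "GRA04_age")
dir.create(targetdir, showWarnings = FALSE)

# read.csv() returns a data frame; take its first column as a character vector.
filestocopy <- read.csv(file.path(workdir, "age.csv"), header = FALSE,
                        stringsAsFactors = FALSE)
filestocopy_v <- filestocopy[[1]]

# Copy each listed file into the new folder.
copied <- file.copy(file.path(workdir, filestocopy_v), targetdir,
                    overwrite = TRUE)
print(list.files(targetdir))
```

With a real folder, `workdir` would be the GRA04 directory and the file.create() and writeLines() lines would be dropped.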

Use of wildcards with readtext()

A basic question. I have a bunch of transcripts (.docx files) I want to read into a corpus. I use readtext() to read in single files no problem.
dat <- readtext("~/ownCloud/NLP/interview_1.docx")
As soon as I put "*.docx" in my readtext statement it spits an error.
dat <- readtext("~/ownCloud/NLP/*.docx")
Error: '/var/folders/bl/61g7ngh55vs79cfhfhnstd4c0000gn/T//RtmpWD6KSx/readtext-aa71916b691c0cf3cabc73a2e04a45f7/word/document.xml' does not exist.
In addition: Warning message:
In utils::unzip(file, exdir = path) : error 1 in extracting from zip file
Why the reference to a zip file? I have only .docx files in the directory.

I was able to reproduce the same problem. The issue was that there were some hidden/temporary .docx files in that folder; if you delete them and then run the code again, it works.
To see the hidden files, go to the folder you are reading the .docx files from and use whichever method your OS provides to show them. On my Mac I used
CMD + SHIFT + .
Once you have deleted them, try the code again and it should work:
library(readtext)
dat <- readtext("~/ownCloud/NLP/*.docx")
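
The stray files can also be located from R instead of the file browser. The sketch below assumes they are either Word lock files (names starting with "~$") or dot-files; that naming is an assumption, since the answer above does not say what the hidden files were called. The folder and file names are made-up placeholders, and the readtext() call is left as a comment so the block runs without the package.

```r
# Stand-in folder: one transcript name plus two stray files.
# The names are made up and the files are empty placeholders, not real .docx.
docdir <- file.path(tempdir(), "NLP_demo")
dir.create(docdir, showWarnings = FALSE)
file.create(file.path(docdir, c("interview_1.docx",
                                "~$terview_1.docx",
                                ".hidden.docx")))

# all.files = TRUE also lists dot-files, which list.files() skips by default.
all_docx <- list.files(docdir, pattern = "\\.docx$", all.files = TRUE)

# Assumption: stray files start with "~$" (Word lock files) or with ".".
is_stray <- grepl("^(~\\$|\\.)", all_docx)
stray <- all_docx[is_stray]
keep <- file.path(docdir, all_docx[!is_stray])

print(stray)
# The remaining paths could then be read instead of the wildcard, e.g.
# dat <- readtext::readtext(keep)
```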

Deleting old output files to avoid errors when re-running R code

I am using a package called SNPRelate. It uses the plinkhapmap.bed, plinkhapmap.fam and plinkhapmap.bim files, which can be downloaded here. My problem is this: when I run the code below, it creates a test1.gds file. When I delete test1.gds and re-run the code, it generates this error:
Error in createfn.gds(out.gdsfn) :
The file 'mypath/Dropbox/Public/SNPRelate/test2.gds' has been created or opened.
When I change the output file name from test1.gds to test2.gds or some other new name, I don't get this error. How do I close this file completely so I can re-run the code and get the output file again, keeping the same file name?
library("SNPRelate")
bed.fn <- "plinkhapmap.bed"
fam.fn <- "plinkhapmap.fam"
bim.fn <- "plinkhapmap.bim"
snpgdsBED2GDS(bed.fn, fam.fn, bim.fn, "test1.gds")
snpgdsSummary("test1.gds")
genofile <- snpgdsOpen("test1.gds")
snpset <- snpgdsLDpruning(genofile, ld.threshold=0.2)
snpset.id <- unlist(snpset)

Open a dta file in R

I am trying to open in R a Stata .dta file that is compressed in a WinRAR archive. Here is my code:
library(foreign)
setwd("C:/Users/ASUS/Desktop/Data on oil/Oil discovery")
data <- read.dta("oil_discovery")
and I get:
Error in read.dta("oil_discovery") : unable to open file: 'No such file or directory'
I think my problem comes from how I set my working directory, but I don't know how to fix it.

You need to pass the full file name to read.dta, including the file extension. That is, instead of
data <- read.dta("oil_discovery")
you need to write
data <- read.dta("oil_discovery.dta")
If there is an additional problem with the compression, I would imagine that the error message will be different. However, Error in read.dta("oil_discovery") : unable to open file: 'No such file or directory' very explicitly points out that the current error is that the file oil_discovery is not found.
A good way to check whether the name or path is causing the error is to use choose.files() (available on Windows only; file.choose() works on all platforms). That is, run the following line:
data <- read.dta(choose.files())
This will open a pop-up window where you can manually select the file. If this works, then the name of the file was misspecified.
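
A non-interactive way to make the same check is to list the .dta files R can see and compare the exact names. This is a minimal sketch in base R using a temporary stand-in folder with made-up, empty files; with real data, `datadir` would be the working directory from the question.

```r
# Stand-in folder with made-up, empty files so the sketch runs anywhere.
datadir <- file.path(tempdir(), "oil_demo")
dir.create(datadir, showWarnings = FALSE)
file.create(file.path(datadir, c("oil_discovery.dta", "oil_discovery.rar")))

# The bare name is not found; the name with its extension is.
bare_found <- file.exists(file.path(datadir, "oil_discovery"))
full_found <- file.exists(file.path(datadir, "oil_discovery.dta"))

# List the exact names of all .dta files in the folder.
dta_files <- list.files(datadir, pattern = "\\.dta$", ignore.case = TRUE)
print(dta_files)
```

If `dta_files` comes back empty for the real folder, the .dta file is not there under that name, for example because it has not yet been extracted from the archive.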

library(haven)
data <- read_dta("**.dta")
View(data)

Importing manual gating from FlowJo to R (flow cytometry analysis)

I am having trouble reading .xls or .wspt files into R. Each file is a table describing a flow cytometry manual gating scheme. My code is as follows:
flowData = system.file("extdata",package="flowWorkspace")
file = list.files(flowData,pattern="manual.xls"/"manual.wspt", full=TRUE)
ws = openWorkspace(file)
When I try to read with openWorkspace, the .xls file gives an error:
Start tag expected. "(" not found.
I have seen this error in another post, but it doesn't seem to explain my case.
When opening the .wspt file, I receive this error:
error in data.frame...arguments imply differing number of rows:1,0.
Both of these files (.xls and .wspt) contain the same information. I just wanted to try to read both of them.
