Convert BED file to WIG file

Is there an easy way to convert a BED file to WIG, either in R or with another program?
Can you give me some guidelines?

Take a look at the rtracklayer package and specifically at the following man pages:
?import
?export

Here is a specific example of how I would convert .bed to .wig. As @Paolo implied, it's a straightforward procedure:
library(rtracklayer) #bioconductor
bed_loaded <- import(con="~/Downloads/my_bed.bed.gz", format="bed") #no need to unzip .gz
# bed_loaded <- import.bed(con="~/Downloads/my_bed.bed") #if you unzip
export.wig(object=bed_loaded, con="~/Downloads/bed2wig.wig")
Note that import and export both have format-specific methods (wig, bed, bigwig or bw, etc.). You can also call the generic import() and export() directly and select the format with the format argument instead.
This GitHub tutorial will be helpful.
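To illustrate the generic form mentioned above, here is a minimal sketch that calls export() with format = "wig" instead of export.wig(). The toy GRanges object is hypothetical stand-in data, not the asker's file; it carries a numeric score column, which, as far as I know, is the value that ends up in the WIG file.

```r
library(rtracklayer)  # Bioconductor; attaches GenomicRanges and IRanges as well

# Hypothetical toy data: three non-overlapping intervals with a numeric score.
gr <- GRanges(seqnames = "chr1",
              ranges   = IRanges(start = c(1, 11, 21), width = 10),
              score    = c(0.5, 1.2, 3.0))

# Generic export(): the writer is chosen by the format argument.
wig_path <- tempfile(fileext = ".wig")
export(gr, con = wig_path, format = "wig")

# Inspect what was written.
wig_lines <- readLines(wig_path)
print(wig_lines)
```

If your own BED file has no score column, you would probably need to add one before exporting.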

Related

Use magrittr to pipe download.file() to another function

How can I use magrittr to pipe the output of download.file() directly to read_excel() from readxl without first saving to a temporary location?
For example, I have the following code:
download.file(www, method="curl") %>%
read_excel(x, sheet ="List 1", range="A3:L1902") -> cw
This gives me an error because I am missing the destfile= argument... any ideas?

I tried using connections, but from what I found, readxl doesn't support reading from URLs (you can look here and here). However, I found here something that might help you.
The rio package has a wrapper around read_excel that allows the use of URLs.
You can even add the sheet argument to choose which sheet to load. In addition, in my experience, if you know the file extension in advance, add the format argument.
install.packages("rio") # if needed
df <- rio::import("https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls",
format = "xls", sheet = "SDTM Terminology 2018-03-30")
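If you would rather stay with readxl, one option is to hide the temporary file inside a small helper. This is only a sketch: read_remote() is a hypothetical name of my own, not part of any package, and it still downloads to a temp file behind the scenes, because download.file() always needs a destfile.

```r
# Hypothetical helper: download a URL to a temp file, hand the file to
# `reader`, and delete the temp file afterwards.
read_remote <- function(url, reader, ..., fileext = "") {
  tmp <- tempfile(fileext = fileext)
  on.exit(unlink(tmp), add = TRUE)
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  reader(tmp, ...)
}

# Self-contained demo using a local file:// URL and read.csv as the reader.
src <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), src, row.names = FALSE)
demo <- read_remote(paste0("file://", normalizePath(src)), read.csv)

# With the question's variables it might look like this (the .xlsx
# extension is an assumption about the remote file):
# cw <- read_remote(www, readxl::read_excel,
#                   sheet = "List 1", range = "A3:L1902", fileext = ".xlsx")
```

The reader must load the data eagerly, since the temp file is removed as soon as the helper returns.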

LIWC2015 import in r

I use LIWC2015 as a student.
I would like to use it with R.
I found the package LIWCalike, which makes it possible to use the LIWC dictionary.
I have installed the dictionary on my computer.
However, I can't find which file I should point to in order to use it. There is the executable version and also a jar file, and I extracted the dictionaries, but they are only available in PDF format.
What file from the LIWC2015 dictionary should I use in R?
This example code is from the package, but I don't have a .cat file:
liwc2007dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2007.cat",
format = "wordstat")
tail(liwc2007dict, 1)

You need to change the file and the format argument in the R code.
Try this:
liwc2015dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2015_English_Flat.dic",
format = "LIWC")
It's documented here.

How to import multiple matlab files into R (Using package R.Matlab)

Thank you in advance for your help. I am using R to analyse some data that was initially created in Matlab. I am using the package "R.matlab" and it is fantastic for one file, but I am struggling to import multiple files.
The working script for a single file is as follows...
install.packages("R.matlab")
library(R.matlab)
x<-("folder_of_files")
path <- system.file("/home/ashley/Desktop/Save/2D Stream", package="R.matlab")
pathname <- file.path(x, "Test0000.mat")
data1 <- readMat(pathname)
And this works fantastically. The format of my file names is 'Name_0000.mat', where the name is constant between files and the 4 digits increase, but not necessarily by 1.
My attempt to load multiple files at once was along these lines...
for (i in 1:length(temp))
data1<-list()
{data1[[i]] <- readMat((get(paste(temp[i]))))}
And also in multiple other ways that included and excluded path and pathname from the loop, all of which give me the same error:
Error in get(paste(temp[i])) :
object 'Test0825.mat' not found
Where 0825 is my final file name. If you change the length of the loop, it is always just the name of the final one.
I think the issue is that when it pastes the name it looks for that object, which does not exist yet, so I need to have the pasted text in quotation marks, but I don't know how to do that.
Sorry this was such a long post... Many thanks
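For what it's worth, two things seem to go wrong in the loop above. The body of the for loop is only the statement data1<-list(), so the block in braces runs once after the loop with i at its last value, which is why the error always names the final file. And get() looks up an R object with that name, whereas readMat() just wants the file path as a string. A minimal sketch of one way around both, assuming all files sit in one folder (the temp folder and the two toy files below are stand-ins so the example runs anywhere):

```r
library(R.matlab)

# Stand-in data: two small .mat files with a gap in the numbering.
mat_dir <- file.path(tempdir(), "mat_demo")
dir.create(mat_dir, showWarnings = FALSE)
writeMat(file.path(mat_dir, "Test0000.mat"), A = matrix(1:4, nrow = 2))
writeMat(file.path(mat_dir, "Test0825.mat"), A = matrix(5:8, nrow = 2))

# Full paths of every .mat file in the folder; the numbering does not matter.
mat_files <- list.files(mat_dir, pattern = "\\.mat$", full.names = TRUE)

# Read each file; the result is a list with one element per file.
data_list <- lapply(mat_files, readMat)
names(data_list) <- basename(mat_files)
```

With your own data you would set mat_dir to your folder, for example "/home/ashley/Desktop/Save/2D Stream", and drop the writeMat() lines.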

save file in XYZ format as vector (GML or shp)

I am using QGIS. I would like to show the value of each raster cell as a label.
My idea (I don't know of any plugin or QGIS functionality that makes this easier) is to export the raster with gdal2xyz.py into coordinates-value format and then save it as a vector (GML or shapefile). For this second task, I tried to use gdal_polygonize.py:
gdal_polygonize.py rainfXYZ.txt rainf.shp
Creating output rainf.shp of format GML.
0...10...20...30...40...50...60...70...80...90...100 - done.
Unfortunately, I am unable to load the created file (even if I change the extension to .gml).
The ogr2ogr tool doesn't even recognize this format.
Yes, sorry, I forgot to add this information.
In general, after preparing the CSV file (using gdal2xyz.py with the -csv option),
I need to add one line at the beginning of it:
"Longitude,Latitude,Value" (without the quotes)
Then I need to create a VRT file which contains:
<OGRVRTDataSource>
  <OGRVRTLayer name="Shapefile_name">
    <SrcDataSource>Shapefile_name.csv</SrcDataSource>
    <GeometryType>wkbPoint</GeometryType>
    <GeometryField encoding="PointFromColumns" x="Longitude" y="Latitude"/>
  </OGRVRTLayer>
</OGRVRTDataSource>
Run the command "ogr2ogr -select Value Shapefile_name.shp Shapefile_name.vrt". I got the file evap_OBC.shp and two other associated files.
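The manual steps above (add the header line, write the VRT) can be scripted. Here is a rough sketch in R; make_point_vrt() is a hypothetical helper of my own, and the two-row input is made-up data standing in for real gdal2xyz.py -csv output.

```r
# Hypothetical helper: prepend the header to an x,y,value CSV and write a
# matching OGR VRT file next to it. Returns the two output paths.
make_point_vrt <- function(xyz_csv, layer = "Shapefile_name",
                           out_dir = dirname(xyz_csv)) {
  xyz_lines <- readLines(xyz_csv)
  csv_out <- file.path(out_dir, paste0(layer, ".csv"))
  vrt_out <- file.path(out_dir, paste0(layer, ".vrt"))
  writeLines(c("Longitude,Latitude,Value", xyz_lines), csv_out)
  vrt <- c(
    "<OGRVRTDataSource>",
    sprintf("  <OGRVRTLayer name=\"%s\">", layer),
    sprintf("    <SrcDataSource>%s.csv</SrcDataSource>", layer),
    "    <GeometryType>wkbPoint</GeometryType>",
    "    <GeometryField encoding=\"PointFromColumns\" x=\"Longitude\" y=\"Latitude\"/>",
    "  </OGRVRTLayer>",
    "</OGRVRTDataSource>"
  )
  writeLines(vrt, vrt_out)
  invisible(c(csv = csv_out, vrt = vrt_out))
}

# Demo with made-up coordinates and values.
xyz <- tempfile(fileext = ".txt")
writeLines(c("10.5,50.5,1.2", "11.5,50.5,0.7"), xyz)
out <- make_point_vrt(xyz)

# Then, from the folder that holds both files, run in a shell:
#   ogr2ogr -select Value Shapefile_name.shp Shapefile_name.vrt
```

The layer name is kept identical to the CSV file name without its extension, as in the VRT shown above.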

For the sake of archive completeness, this question has also been asked on the GDAL mailing list in the thread "save raster as point-vector file". It seems Chaitanya provided a solution for it.

Read SPSS file into R

I am trying to learn R and want to bring in an SPSS file, which I can open in SPSS.
I have tried using read.spss from foreign and spss.get from Hmisc. Both error messages are the same.
Here is my code:
## install.packages("Hmisc")
library(foreign)
## change the working directory
getwd()
setwd('C:/Documents and Settings/BTIBERT/Desktop/')
## load in the file
## ?read.spss
asq <- read.spss('ASQ2010.sav', to.data.frame=T)
And the resulting error:
Error in read.spss("ASQ2010.sav", to.data.frame = T) :
  error reading system-file header
In addition: Warning message:
In read.spss("ASQ2010.sav", to.data.frame = T) :
  ASQ2010.sav: position 0: character `\000' (
Also, I tried saving out the SPSS file as a SPSS 7 .sav file (was previously using SPSS 18).
Warning messages:
1: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 14 encountered in system file
2: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 18 encountered in system file

I had a similar issue and solved it following a hint in read.spss help.
Using package memisc instead, you can import a portable SPSS file like this:
data <- as.data.set(spss.portable.file("filename.por"))
Similarly, for .sav files:
data <- as.data.set(spss.system.file('filename.sav'))
although in this case I seem to miss some string values, while the portable import works seamlessly. The help page for spss.portable.file claims:
The importer mechanism is more flexible and extensible than read.spss and read.dta of package "foreign", as most of the parsing of the file headers is done in R. They are also adapted to load efficiently large data sets. Most importantly, importer objects support the labels, missing.values, and descriptions, provided by this package.

read.spss seems to be a little outdated, so I used a package called memisc.
To get this to work, do this:
install.packages("memisc")
library(memisc)
data <- as.data.set(spss.system.file('yourfile.sav'))

You may also try this:
setwd("C:/Users/rest of your path")
library(haven)
data <- read_sav("data.sav")
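and if you want to read all files from one folder:
temp <- list.files(pattern = "\\.sav$")  # pattern is a regular expression, not a glob
read.all <- sapply(temp, read_sav, simplify = FALSE)  # named list, one data frame per file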

I know this post is old, but I also had problems loading a Qualtrics SPSS file into R. R's read.spss code came from PSPP a long time ago, and hasn't been updated in a while. (And Hmisc's code uses read.spss(), too, so no luck there.)
The good news is that PSPP 0.6.1 should read the files fine, as long as you specify a "String Width" of "Short - 255 (SPSS 12.0 and earlier)" on the "Download Data" page in Qualtrics. Read it into PSPP, save a new copy, and you should be in business. Awkward, but free.

You can read an SPSS file into R using the above solutions or the one you are currently using. Just make sure the command is given a file that it can read properly. I had the same error, and the problem was that SPSS could not access that file. Make sure the file path is correct, the file is accessible, and it is in the correct format.
library(foreign)
asq <- read.spss('ASQ2010.sav', to.data.frame=TRUE)
As far as the warning message is concerned, it does not affect the data. Record type 7 is used to store features of newer SPSS versions so that older SPSS software can still read the new data. I have used this numerous times and no data was lost.
You can also read about this at http://r.789695.n4.nabble.com/read-spss-warning-message-Unrecognized-record-type-7-subtype-18-encountered-in-system-file-td3000775.html#a3007945
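The checks mentioned above (path, accessibility, format) can be done from R before calling read.spss. A minimal sketch, assuming that an uncompressed .sav system file starts with the four bytes "$FL2"; check_sav() is a hypothetical helper, not part of foreign:

```r
# Hypothetical helper: report whether a file exists, is readable, and
# starts with the "$FL2" signature expected at the top of a .sav file.
check_sav <- function(path) {
  exists   <- file.exists(path)
  readable <- exists && unname(file.access(path, mode = 4)) == 0
  # Compare raw bytes so that a header full of \000 bytes does not break the check.
  header   <- if (readable) readBin(path, what = "raw", n = 4) else raw(0)
  list(exists = exists,
       readable = readable,
       looks_like_sav = identical(header, charToRaw("$FL2")))
}

# Demo with two made-up files: one with the expected signature, one
# starting with zero bytes.
good <- tempfile(fileext = ".sav")
writeBin(charToRaw("$FL2@(#) SPSS DATA FILE"), good)
bad <- tempfile(fileext = ".sav")
writeBin(as.raw(c(0, 0, 0, 0)), bad)

good_check <- check_sav(good)
bad_check  <- check_sav(bad)
```

A file that fails the signature check is probably not a plain .sav file at all (for example a portable .por file or a damaged download), which would be consistent with the "error reading system-file header" message.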

It looks like the R read.spss implementation is incomplete or broken. R2.10.1 does better than R2.8.1, however. It appears that R gets upset about custom attributes in a sav file even with 2.10.1 (The latest I have). R also may not understand the character encoding field in the file, and in particular it probably does not work with SPSS Unicode files.
You might try opening the file in SPSS, deleting any custom attributes, and resaving the file.
You can see whether there are custom attributes with the SPSS command
display attributes.
If so, delete them (see VARIABLE ATTRIBUTE and DATAFILE ATTRIBUTE commands), and try again.
HTH,
Jon Peck

If you have access to SPSS, save the file as .csv, then import it with read.csv or read.table. I can't recall any problems importing .sav files; so far it has worked like a charm with both read.spss and spss.get. I reckon that spss.get will not give different results, since it depends on foreign::read.spss.
Can you provide some info on your SPSS/R/Hmisc/foreign versions?

Another solution not mentioned here is to read SPSS data in R via ODBC. You need:
IBM SPSS Statistics Data File Driver. Standalone driver is enough.
Import SPSS data using RODBC package in R.
See the example here. However, I have to admit that there could be problems with very big data files.

For me it works well using memisc!
install.packages("memisc")
library(memisc)
Daten.Februar <-as.data.set(spss.system.file("NPS_Februar_15_Daten.sav"))
names(Daten.Februar)

I agree with @SDahm that the haven package would be the way to go. I struggled a bit with string values myself when I started using it, so I thought I'd share my approach here, too.
The "semantics" vignette has some useful information on this topic.
library(tidyverse)
library(haven)
# Some interesting information in here
vignette('semantics')
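# Get data from spss file (path_to_file is the path to your .sav file)
spss_file <- read_sav(path_to_file)
# get value labels: convert labelled columns to factors
df <- map_df(.x = spss_file, .f = function(x) {
  if (is.labelled(x)) as_factor(x)
  else x})
# get column names from the variable labels (assumes every variable has a label)
colnames(df) <- map(.x = spss_file, .f = function(x) {attr(x, 'label')})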

There is no such problem with the packages you are using. The only requirement for reading an SPSS file is to save it in PORTABLE format. SPSS files have the *.sav extension; you need to convert your file into a portable file, which uses the *.por extension.
There is more info at http://www.statmethods.net/input/importingdata.html

In my case this warning was combined with the appearance of a new variable before the first column of my data with values -100, 2, 2, 2, ..., a shift in the correspondence between labels and values, and the deletion of the last variable. A solution that worked was (using SPSS) to create a new dummy variable in the last column of the file, fill it with random values, and execute the following code:
(filename is the path to the sav file; in my case the original SPSS file had 62 columns, thus 63 with the additional dummy variable)
library(memisc)
data <- as.data.set(spss.system.file(filename))
copyofdata = data
for(i in 2:63){
names(data)[i] <- names(copyofdata)[i-1]
}
data[[1]] <- NULL
newcopyofdata = data
for(i in 2:62){
labels(data[[i]]) <- labels(newcopyofdata[[i-1]])
}
labels(data[[1]]) <- NULL
Hope the above code will help someone else.

Turn Unicode off in SPSS.
Open SPSS without any data open and run the code below in your syntax editor:
SET UNICODE OFF.
Open the data set and resave it to remove the Unicode encoding.
read.spss('yourdata.sav', to.data.frame=T) then works correctly.

I just came across an SPSS file that I couldn't open using haven, foreign, or memisc, but readspss::read.por did the trick for me:
download.file("http://www.tcd.ie/Political_Science/elections/IMSgeneral92.zip",
"IMSgeneral92.zip")
unzip("IMSgeneral92.zip", exdir = "IMSgeneral92")
# rio, haven, foreign, memisc pkgs don't work on this file! But readspss does:
if(!require(readspss)) remotes::install_git("https://github.com/JanMarvin/readspss.git")
ims92 <- readspss::read.por("IMSgeneral92/IMS_Nov7 92.por", convert.factors = FALSE)
Nice! Thanks, @JanMarvin!

1)
I've found the program Stat/Transfer useful for importing SPSS and Stata files into R.
It resolves the issue you mention by converting the SPSS file to an R dataset. It is also very useful for splitting very large datasets into smaller portions that R can handle. It is not free, but it is a very useful tool for working with datasets from different programs, especially if you don't have access to those programs.
2)
The memisc package also has an SPSS import function worth trying.
