Split data frame elements with semicolon in R [duplicate] - r

This question already has answers here:
Split delimited strings in a column and insert as new rows [duplicate]
(6 answers)
Split comma-separated strings in a column into separate rows
(6 answers)
Closed 6 years ago.
I've tried to write a function, using only base R, that replaces semicolon-containing elements in a data frame column with the split entries, which are placed at the bottom of the column. The idea is to use this function with apply so that the new entries are added whenever an element containing a semicolon is detected.
The main problem with my code is that it returns the exact same data frame without any additional values.
> df
rs2480711
rs74832092
rs4648658
rs4648659
rs61763535
rs28733941;rs67677371
>x
"rs28733941;rs67677371"
function(x){
  semiCols = length(unlist(strsplit(x, ";")))
  elementsRs = unlist(strsplit(x, ";"))
  if(semiCols>1){
    for(i in 1:semiCols){
      df = rbind(df, elementsRs[i])
    }
  }
}
I would also like to know how I can extend the code to split rows based on one value while leaving all the others unchanged. For example, this
>df
0 rs61763535 T1
1 rs28733941;rs67677371 T2
will look like this
>df2
0 rs61763535 T1
1 rs28733941 T2
1 rs67677371 T2

If I understood the question correctly, this should work:
unlist(strsplit(as.character(df$V1),split = ";"))
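As a minimal sketch of why this works where the function in the question does not: assigning to df inside a function only changes a local copy, and the function returns nothing useful, so the data frame outside is never modified. Splitting the whole column at once and building a new data frame avoids the loop entirely. The column name V1 follows the line above; the two-row example data and the name df_long are assumptions made for illustration.

```r
# Example data; the column name V1 is taken from the answer, the rows are a subset of the question's.
df <- data.frame(
  V1 = c("rs61763535", "rs28733941;rs67677371"),
  stringsAsFactors = FALSE
)

# Split every element on ";" and flatten the result into one character vector.
split_ids <- unlist(strsplit(as.character(df$V1), split = ";", fixed = TRUE))

# Build a new data frame instead of modifying df inside a function.
df_long <- data.frame(V1 = split_ids, stringsAsFactors = FALSE)
print(df_long)
```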
If that is not what you meant, perhaps you are looking for this, which applies the same split to every column:
apply(df,2,function(t) unlist(strsplit(as.character(t),split = ";")))
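For the second part of the question (splitting one column into several rows while keeping the other columns unchanged), one possible base R sketch is to repeat each row once per split piece. The column names id, rs and type are assumptions, since the question does not name them.

```r
# Example data from the question; the column names are hypothetical.
df <- data.frame(
  id   = c(0, 1),
  rs   = c("rs61763535", "rs28733941;rs67677371"),
  type = c("T1", "T2"),
  stringsAsFactors = FALSE
)

# One character vector of pieces per row.
parts <- strsplit(as.character(df$rs), ";", fixed = TRUE)

# Repeat each row index once per piece, then overwrite the split column.
df2 <- df[rep(seq_len(nrow(df)), lengths(parts)), ]
df2$rs <- unlist(parts)
rownames(df2) <- NULL
print(df2)
```

Note that a row whose entry is an empty string yields zero pieces and would therefore be dropped by this approach.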

Related

Trouble converting Values in Column into Row Names of Data Frame in R [duplicate]

This question already has answers here:
Why am I getting X. in my column names when reading a data frame?
(5 answers)
data.frame without ruining column names
(2 answers)
Closed last month.
I am trying to turn the first column of a data frame into its row names.
This works, but the column names of the data frame change in the process:
for example, 100-21-0 becomes X100.21.0.
The first column holds character values: Code, CBT, DQY, DQX, etc.
The column names (or the first row?) of the data frame, whose other columns are doubles, look like: Code, 100-21-0, 1002-84-2, 100-47-0, etc.
Code  100-21-0
CBT   0
DQY   1
I am using code similar to:
newdataframe <- data.frame(dataframe, row.names = 1)
The row names are set correctly, but the column names change from 100-21-0, 1002-84-2, 100-47-0 to X100.21.0, X1002.84.2, X100.47.0.
Why does this happen? Can anyone help?
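A likely explanation is that data.frame() runs the column names through make.names() by default (check.names = TRUE), which replaces the hyphens with dots and prefixes an X because a syntactic R name cannot start with a digit. A minimal sketch of two ways around this follows, assuming the input already has the original names; the values in the second column and the object names are made up for illustration.

```r
# Hypothetical input with non-syntactic column names preserved.
dataframe <- data.frame(
  Code        = c("CBT", "DQY"),
  `100-21-0`  = c(0, 1),
  `1002-84-2` = c(1, 0),   # made-up values
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Option 1: disable the name check when rebuilding the data frame.
new1 <- data.frame(dataframe, row.names = 1, check.names = FALSE)

# Option 2: set the row names directly and drop the first column.
new2 <- dataframe
rownames(new2) <- new2[[1]]
new2 <- new2[-1]

print(names(new1))
print(names(new2))
```

If the names are already mangled when the file is read, the same check.names = FALSE argument can be passed to read.csv() or read.table().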

Complex string vector creation from data frame values [duplicate]

This question already has an answer here:
Convert R vector to string vector of 1 element [duplicate]
(1 answer)
Closed 2 years ago.
I have the following dataframe which has only one column called Values and a list of string values:
Values
AA
AB
CG
DS
KI
Is there a simple way to create a string vector with each value separated by |?
The desired resulting output should look something like this:
Vector = "AA|AB|CG|DS|KI"
Cheers!
Sure, you can simply use (assuming your data frame is called df):
paste0(df$Values, collapse = "|")
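As a small self-contained sketch of the line above, using the values from the question (the name Vector is taken from the desired output). With a single input vector, paste() and paste0() give the same result, because only the collapse argument matters.

```r
# Example data from the question.
df <- data.frame(
  Values = c("AA", "AB", "CG", "DS", "KI"),
  stringsAsFactors = FALSE
)

# Join all values into one string, separated by "|".
Vector <- paste0(df$Values, collapse = "|")
print(Vector)  # [1] "AA|AB|CG|DS|KI"
```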

Render rows that do not have zero in any columns in R [duplicate]

This question already has answers here:
How to remove rows with 0 values using R
(2 answers)
Closed 2 years ago.
Before posting, I searched the questions that Stack Overflow suggested, but I couldn't find what I was looking for, so I decided to ask here. I have this data file:
https://github.com/learnseq/learning/blob/main/GSE133399_Fig2_FPKM.csv
The file has 9 columns: the first column holds names and the other 8 columns hold values. I want to store in an object all rows that do not contain zeros and save them in CSV format.
I had a look at your data set: it contains some rows in which every value except the identifier is zero. I assume you want to omit the rows that consist entirely of zeros. This code does the job:
data1 = read.csv("GSE133399_Fig2_FPKM.csv")
## Apply <all> on each row.
allZero = apply(data1[, -1] == 0, 1, all)
data2 = data1[!allZero, ]
Now, data2 is the same as data1, but without the rows having only zeros.
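The question can also be read as asking to drop every row that contains at least one zero, and it asks for a CSV file as output. A minimal sketch covering both readings follows, using a small made-up data frame in place of the real file; the column names, values and output file name are assumptions.

```r
# Hypothetical stand-in for the FPKM file: one name column plus value columns.
data1 <- data.frame(
  gene = c("g1", "g2", "g3"),
  s1   = c(0, 1.5, 0),
  s2   = c(0, 2.0, 3.0),
  stringsAsFactors = FALSE
)
values <- data1[, -1]

# Reading 1: drop rows in which every value is zero.
all_zero <- rowSums(values != 0) == 0
no_all_zero <- data1[!all_zero, ]

# Reading 2: drop rows in which any value is zero.
any_zero <- rowSums(values == 0) > 0
no_any_zero <- data1[!any_zero, ]

# Save the result; a temporary directory is used here to keep the sketch harmless.
out_file <- file.path(tempdir(), "filtered.csv")
write.csv(no_any_zero, out_file, row.names = FALSE)
```

Missing values are not handled here; if the data contain NA, the comparisons would need na.rm = TRUE or an explicit rule for such rows.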

Split a column into two values in R [duplicate]

This question already has answers here:
Split data frame string column into multiple columns
(16 answers)
How to split column into two in R using separate [duplicate]
(3 answers)
Closed 4 years ago.
I want to split a data frame containing only one column, with the values below, into two columns containing only the numeric values:
1: [0.426321243245,0.573678756755]
2: [0.413189679382,0.586810320618]
I have tried different approaches in R using dplyr and tidyr (starts_with, separate, etc.) but couldn't split the column into a data frame containing two separate columns.
Can someone please help me with this?
Thanks,
I hope this will help you
newdf <- read.table(text = "column1
0.426321243245,0.573678756755
0.413189679382,0.586810320618
", header = T)
library(splitstackshape)
final <- cSplit(newdf, 'column1', sep=",", type.convert=FALSE)
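The example above leaves out the square brackets shown in the question and keeps the pieces as text. A base R sketch that strips the brackets and converts both halves to numeric could look like this; the column names column1, first and second are assumptions.

```r
# Example data with the brackets from the question.
df <- data.frame(
  column1 = c("[0.426321243245,0.573678756755]",
              "[0.413189679382,0.586810320618]"),
  stringsAsFactors = FALSE
)

# Remove the square brackets, then split on the comma.
clean <- gsub("\\[|\\]", "", df$column1)
parts <- strsplit(clean, ",", fixed = TRUE)

# Take the first and second piece of every row and convert them to numbers.
out <- data.frame(
  first  = as.numeric(vapply(parts, `[`, character(1), 1)),
  second = as.numeric(vapply(parts, `[`, character(1), 2))
)
print(out)
```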

substracting blank spaces into a separate column [duplicate]

This question already has answers here:
Split data frame string column into multiple columns
(16 answers)
Split different lengths values and bind to columns
(2 answers)
Closed 5 years ago.
I have a column of alphanumeric data containing data that looks like this K*01:01+K*01:02:01:01, K*77:01:08+K*01:02:01:22, K*10:01:77. I want to separate the data based on the strings before and after the + sign, new data should be displayed in 2 separate columns. I want my output to look like this:
column1 = K*01:01, K*77:01:08, K*10:01:77
column2 = K*01:02, K*01:02:01:02
I tried mydata$column1 <- sub("(.*?)\\+.*", "\\1", mydata$merged), which works fine. But when I used mydata$column2 <- sub(".*\\+(.*?)", "\\1", mydata$merged), the value for ID 3, K*10:01:77, is extracted into both column 1 and column 2. I want column 2 to show a blank/empty cell for strings that do not contain the + delimiter. I also want the new columns to appear in the current data frame, adjacent to the original merged column, so packages like stringr do not work for me.
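The behaviour described comes from sub() returning its input unchanged when the pattern does not match, so a string without + passes through as is. One possible base R sketch checks for the delimiter first and fills in an empty string otherwise. The column names merged, column1 and column2 follow the question; has_plus is a helper name introduced here.

```r
# Example data from the question.
mydata <- data.frame(
  merged = c("K*01:01+K*01:02:01:01",
             "K*77:01:08+K*01:02:01:22",
             "K*10:01:77"),
  stringsAsFactors = FALSE
)

# TRUE where the string contains a literal "+".
has_plus <- grepl("+", mydata$merged, fixed = TRUE)

# Everything before the first "+"; strings without "+" are kept whole.
mydata$column1 <- sub("\\+.*$", "", mydata$merged)

# Everything after the first "+", or an empty string when there is none.
mydata$column2 <- ifelse(has_plus, sub("^[^+]*\\+", "", mydata$merged), "")
print(mydata)
```

New columns assigned with $ are appended on the right, so they sit next to merged only if merged is the last column; otherwise the columns can be reordered by name afterwards.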
