Exporting a data.frame in R into a .txt file

I have a data.frame that has 15 columns and looks like the following:
Word Syllable TimeStart TimeEnd Duration PitchMin PitchMax TimePitchMin
Einen "aI 0.00 0.11 0.11 98.173 106.158 0.053
Einen n#n 0.11 0.24 0.13 106.158 123.176 0.110
TimePitchMax PitchSlope IntenMax IntenMin TimeIntenMax TimeIntenMin PitchAccent
0.110 140.443 83.794 82.583 0.095 0.051 no
0.210 169.359 83.875 80.458 0.210 0.234 no
I want to save the data into a .txt file, but when I use the standard write.table(table, "outfile.txt") call, the result looks like a mess.
What appropriate arguments can be used to solve this problem?
EDIT: a screenshot of the garbled output was attached here (image not available).

What happens if you use write.table(table, "outfile.txt", sep="\t", row.names=FALSE)? That should help you create a tab-delimited text file.
If the output still looks like a mess, you can export your file as a csv with write.csv(table, "outfile.txt", row.names=FALSE).
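A minimal sketch of the tab-delimited export suggested above, using a small stand-in for the question's data. The object name dat, the reduced set of columns, and the temporary output path are assumptions for illustration only; quote=FALSE is an extra argument not mentioned in the answer, added here so that strings are written without surrounding quotes.

```r
# Stand-in for the question's data.frame (only a few of the 15 columns).
dat <- data.frame(
  Word      = c("Einen", "Einen"),
  Syllable  = c("\"aI", "n#n"),
  TimeStart = c(0.00, 0.11),
  TimeEnd   = c(0.11, 0.24),
  stringsAsFactors = FALSE
)

# Write to a temporary location so the example does not touch the working directory.
out <- file.path(tempdir(), "outfile.txt")

# Tab-separated, no row names, no quotes around strings.
write.table(dat, out, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back to check the round trip; quote = "" keeps the stray '"' in "aI literal.
back <- read.delim(out, quote = "", comment.char = "", stringsAsFactors = FALSE)
print(back)
```

Opening the resulting file in a spreadsheet or a text editor with tab stops should show one column per variable.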

Did you check the structure of your table with str(table) before you export? It looks like the table may contain some corrupt variable names and/or variables, which may in turn cause export problems. Ideally, str(table) should show that the table object is a data.frame (or tibble) with proper variable names and values. If you see variable names like """ or c(9,11,11, ...) etc., that's a signal that your problem is with how you create table, not how you export it.
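As a rough illustration of that check, the sketch below inspects a small, made-up data.frame (dat and its columns are assumptions, not the asker's data) and flags list columns, which are one common sign that an object was built incorrectly.

```r
# Hypothetical, well-formed data.frame for comparison.
dat <- data.frame(
  Word     = c("Einen", "Einen"),
  Duration = c(0.11, 0.13),
  stringsAsFactors = FALSE
)

# Compact overview: object class, dimensions, column names and types.
str(dat)

# Programmatic versions of the same checks.
col_classes <- vapply(dat, function(col) class(col)[1], character(1))
has_list_cols <- any(vapply(dat, is.list, logical(1)))

print(col_classes)
print(has_list_cols)
```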

Related

How can I see the proportion for each answer while making it a tibble?

I'm trying to create a tibble that will allow me to see the proportion for each answer in one of my variables.
Currently, my code looks like this
drinkchoice <- tibble(prop.table(table(surveyq$drink_choice)))
When I run the code, it returns the proportion of each answer in the variable but does not list the answers that go with them. For example, it returns a table looking like:
0.007
0.04
0.29
0.13
0.09
But when I remove tibble() from the original line of code, it returns
pepsi 0.007
fanta 0.04
sprite 0.29
brisk 0.13
coke 0.09
Is there any way to write this so that, even when using the tibble() function, the output shows each answer together with its proportion?
Edit: I added the line ~ tibble::rownames_to_column("Drink") and I was wondering whether it is possible to rewrite the numbers under the new column, as that would solve my problem.
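One possible approach, sketched under assumptions: convert the proportion table to a data.frame first, so the answer labels become a regular column, and only then convert to a tibble. The vector drink_choice below is made-up stand-in data for surveyq$drink_choice, and the column names Drink and Proportion are chosen for illustration. The tibble conversion is guarded so the example still runs if the tibble package is not installed.

```r
# Made-up stand-in for surveyq$drink_choice.
drink_choice <- c("sprite", "coke", "sprite", "brisk", "fanta", "sprite")

# Naming the table dimension (Drink = ...) gives the label column a readable name;
# responseName sets the name of the proportion column.
drinkchoice <- as.data.frame(
  prop.table(table(Drink = drink_choice)),
  responseName = "Proportion",
  stringsAsFactors = FALSE
)

# Convert to a tibble only if the package is available.
if (requireNamespace("tibble", quietly = TRUE)) {
  drinkchoice <- tibble::as_tibble(drinkchoice)
}

print(drinkchoice)
```

The labels are kept here because as.data.frame() on a table turns the table's names into a column before the tibble conversion happens.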

Error in correlation matrix of a data frame

R version 3.5.1.
I want to create a correlation matrix with a Data frame that goes like this:
BGI WINDSPEED SLOPE
4277.2 4.23 7.54
4139.8 5.25 8.63
4652.9 3.59 6.54
3942.6 4.42 10.05
I put that as an example but I have more than 20 columns and over 40,000 entries.
I tried using the following codes
corrplot(cor(Site.2),method = "color",outline="white")
Matrix_Site=cor(Site.2,method=c("spearman"))
but every time, the same error appears:
Error in cor(Site.2) : 'x' must be numeric
I would like to correlate every variable of the data frame with each other and create a graph and a table with it, similar to this.
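The error message says that cor() received at least one non-numeric column. A minimal sketch of one way to find such columns and correlate only the numeric ones follows. The SITE column is a hypothetical non-numeric column added to reproduce the error condition; the real data may instead contain numbers that were read in as text or factors, in which case the import step would need fixing.

```r
# The question's sample rows plus a hypothetical non-numeric column (SITE).
Site.2 <- data.frame(
  BGI       = c(4277.2, 4139.8, 4652.9, 3942.6),
  WINDSPEED = c(4.23, 5.25, 3.59, 4.42),
  SLOPE     = c(7.54, 8.63, 6.54, 10.05),
  SITE      = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Identify which columns are numeric and report the ones that are not.
is_num <- vapply(Site.2, is.numeric, logical(1))
non_numeric <- names(Site.2)[!is_num]
print(non_numeric)

# Correlate only the numeric columns; pairwise deletion tolerates missing values.
Matrix_Site <- cor(Site.2[, is_num, drop = FALSE],
                   method = "spearman",
                   use = "pairwise.complete.obs")
print(round(Matrix_Site, 2))
```

The resulting matrix can then be passed to a plotting function such as the corrplot() call from the question.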

R Printing specific columns

I have this file test.csv. I have used -
test <- read.csv ("test.csv", check.names=FALSE)
to get it into R. I have used check.names=FALSE because the column headers contain brackets, and if I don't use it they are turned into periods, which causes me problems when coding.
I have then done this-
sink(file='interest.txt')
print((test["test$log(I)">=1 & test$number >= 6 , "Name"]),)
My aim is to create a sink file that receives the print output. I want to print the value in the Name column when the values in two other columns (log(I) and Number) meet certain thresholds.
log(I) Number Name
1.00 6 LAMP1
3.50 6 MND1
1.20 2 GGD3
0.98 7 KLP1
So in this example, the code would output just LAMP1 and MND1 to the sink file I created.
My issue is that I don't think R is recognising that log(I) is the header title as it seems to give me the same result with or without this part included.
If I don't use
check.names=FALSE
Then the column is turned to log.I. instead. How can I get around this issue?
Thanks
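A sketch of one way around this, assuming the file has exactly the columns shown above: a non-syntactic column name such as log(I) can be referenced with backticks (or with test[["log(I)"]]). In the original call, "test$log(I)" >= 1 compares a quoted string with a number rather than a column, which would explain why that part seems to have no effect. Note also that the sample header is Number, while the code refers to test$number; R is case sensitive. The inline data and the temporary output path below are stand-ins for test.csv and interest.txt.

```r
# Inline stand-in for test.csv, keeping the original header names.
test <- read.csv(text = "log(I),Number,Name
1.00,6,LAMP1
3.50,6,MND1
1.20,2,GGD3
0.98,7,KLP1", check.names = FALSE, stringsAsFactors = FALSE)

# Backticks let $ address a column whose name is not a valid R identifier.
hits <- test[test$`log(I)` >= 1 & test$Number >= 6, "Name"]

# Redirect printed output to a file, then restore normal output with sink().
out <- file.path(tempdir(), "interest.txt")
sink(file = out)
print(hits)
sink()
```

Calling sink() with no arguments afterwards matters; otherwise all later console output keeps going to the file.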

How to save variables & stop variables from being overwritten in R?

So I have a bunch of functions that save the column numbers of my data. So for example, my data looks something like:
>MergedData
[[1]]
Date EUR.HIGH EUR.LOW EUR.CLOSE EUR.OPEN EUR.LAST
01/01/16 1.00 1.00 1.25 1.30 1.24
[[2]]
Date AUD.HIGH AUD.LOW AUD.CLOSE AUD.OPEN AUD.LAST
01/01/16 1.00 1.00 1.25 1.30 1.24
I have 29 of the above currencies. So in this case, MergedData[[1]] will return all of my Euro prices, and so on for 29 currencies.
I also have a function in R that calculates the variables and saves the numbers 1 to 29 that correspond with the currencies. This code calculates values in the first row of my data, i.e:
trending <- intersect(which(!ma.sig[1,]==0), which(!pricebreak[1,]==0))
which returns something like:
>sig.nt
[1] 1 2 5...
And so I can use this to pull up 'trending' currencies via a for() function:
for (i in sig.nt) {
MergedData[[i]]
...misc. code for calculations of trending currencies...
}
I want to be able to 'save' my trending currencies for future reference and calculations. The problem is that the sig.nt variable changes with every new row. I was thinking of using the lockBinding command:
sig.exist <- sig.nt #saves existing trend
lockBinding('curexist', .GlobalEnv)
But wouldn't this still get overwritten every time I run my script? Help would be much appreciated!
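One possible alternative to locking a binding, sketched under assumptions: store each row's result in its own list element instead of reusing a single variable, and write the list to disk so it survives a re-run of the script. The toy matrices below stand in for ma.sig and pricebreak; the names trend_history and the temporary .rds path are hypothetical.

```r
# Toy stand-ins: rows are dates, columns are currencies (3 instead of 29).
ma.sig     <- matrix(c(1, 0, 1,
                       0, 1, 1), nrow = 2, byrow = TRUE)
pricebreak <- matrix(c(1, 1, 0,
                       0, 1, 1), nrow = 2, byrow = TRUE)

# One list element per row, so earlier results are never overwritten.
trend_history <- vector("list", nrow(ma.sig))
for (r in seq_len(nrow(ma.sig))) {
  trend_history[[r]] <- intersect(which(ma.sig[r, ] != 0),
                                  which(pricebreak[r, ] != 0))
}

# Persist the history so a later session can reload it.
history_file <- file.path(tempdir(), "trend_history.rds")
saveRDS(trend_history, history_file)
restored <- readRDS(history_file)
print(restored)
```

In a real script the file would live somewhere permanent rather than in tempdir(), and the script could load it at start-up when it already exists.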

plotting histogram by a data frame

I am a new user of R and have been using it for the last 7 days, with the mixdist package, for the modal analysis of finite mixture distributions. I work on nanoparticles, so I use R to analyse particle size distributions recorded by the particle analyser I use in my experiments.
My problem is illustrated below:
Firstly I am collecting my data from excel (raw data)
Diameter dN/dlog(dp) frequencies
4.87 1825.078136 0.001541926
5.62 2363.940947 0.001997187
6.49 2022.259831 0.001708516
7.5 1136.653264 0.000960307
8.66 363.4570006 0.000307068
10 255.6702845 0.000216004
11.55 241.6525906 0.000204161
13.34 410.3425535 0.00034668
15.4 886.929307 0.000749327
17.78 936.4632499 0.000791176
20.54 579.7940281 0.000489842
23.71 11.915522 0.00001
27.38 0 0
31.62 0 0
36.52 5172.088 0.004369665
42.17 19455.13684 0.01643677
48.7 42857.20502 0.036208126
56.23 68085.64903 0.057522504
64.94 87135.1959 0.07361661
74.99 96708.55662 0.081704712
86.6 97982.18946 0.082780747
100 95617.46266 0.080782896
115.48 93732.08861 0.079190028
133.35 93718.2981 0.079178377
153.99 92982.3002 0.078556565
177.83 88545.18227 0.074807844
205.35 78231.4116 0.066094203
237.14 63261.43349 0.053446741
273.84 46759.77702 0.039505233
316.23 32196.42834 0.027201315
365.17 21586.84472 0.018237755
421.7 14703.9162 0.012422678
486.97 10539.84662 0.008904643
562.34 7986.233881 0.00674721
649.38 6133.971913 0.005182317
749.89 4500.351801 0.003802145
865.96 2960.469207 0.002501167
1000 1649.858041 0.001393891
Inf 0 0
using the function
pikraw<-read.table(file="clipboard", sep="\t", header=T)
After importing the data in R I am choosing the 1st and the 3rd column of the above table :
diameter<- pikraw$Diameter
frequencies<-pikraw[3]
Then I am grouping my data using the functions
pikgrp <- data.frame(length =diameter, freq =frequencies)
class(pikgrp) <- c("mixdata", "data.frame")
Doing all these I am going to plot the histogram of this data
plot(pikgrp,log="x")
and then something strange happens: the horizontal axis and its values look fine, but on the y axis the low frequency values appear as they are, while the high values appear with a truncated decimal, which lowers the plot.
Do you have any explanation for what is happening? The answer is probably very simple, but after exhausting myself and losing a whole weekend I think I am entitled to ask.
It looks to me like you are reading your data wrong. Try this:
pikraw <- read.table(file="clipboard", sep="", header=T)
That is, change the sep argument to sep="". Everything worked fine from there.
Also, note that using the clipboard as the file argument only works if you have your data on the clipboard. I recommend creating a .txt (or .csv) file with your data. That way you don't have to have your data on the clipboard every time you want to read it.
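To illustrate the effect of the sep argument, the sketch below parses the first few rows of the data as whitespace-separated text in both ways. Whether the asker's clipboard actually held spaces rather than tabs is an assumption; the inline string raw_text is a stand-in for the clipboard or a .txt file.

```r
# First rows of the question's data, separated by single spaces (assumed).
raw_text <- "Diameter dN/dlog(dp) frequencies
4.87 1825.078136 0.001541926
5.62 2363.940947 0.001997187
6.49 2022.259831 0.001708516"

# sep = "\t": no tabs are present, so each line is read as a single field.
wrong <- read.table(text = raw_text, sep = "\t", header = TRUE)

# sep = "": any run of whitespace separates fields, giving three numeric columns.
pikraw <- read.table(text = raw_text, sep = "", header = TRUE)

print(ncol(wrong))
print(str(pikraw))
```

With a file on disk, the same call would take file = "yourdata.txt" (a hypothetical file name) in place of text = raw_text.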
