Generating frequencies from CSV file - r

I want to generate a frequency table of values, but so far I have only found how to do that by grouping the values into classes; I want the ungrouped values. Let's say I have:
values <- c(1,2,5,6,3,4,3,2,6,7)
How do I generate a frequency table from that?

Take a look at table():
> tab <- table(values)
> tab
values
1 2 3 4 5 6 7
1 2 2 1 1 2 1
If you prefer a data.frame:
> as.data.frame(tab)
values Freq
1 1 1
2 2 2
3 3 2
4 4 1
5 5 1
6 6 2
7 7 1
Plotting:
hist(values) # histogram of `values`
plot(tab) # plot of `tab`, table of frequencies
barplot(tab) # plot of `tab`, table of frequencies
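One detail worth knowing about the data.frame route: as.data.frame() on a table stores the tabulated values as a factor by default, not as numbers. A minimal sketch of getting a numeric column back and adding relative frequencies, assuming the same `values` vector as above (the names `freq` and `RelFreq` are illustrative, not from the original answer):

```r
values <- c(1, 2, 5, 6, 3, 4, 3, 2, 6, 7)

# stringsAsFactors = FALSE keeps the tabulated values as character strings
freq <- as.data.frame(table(values), stringsAsFactors = FALSE)

# Convert the character labels back to numbers
freq$values <- as.numeric(freq$values)

# Relative frequency of each distinct value
freq$RelFreq <- freq$Freq / sum(freq$Freq)

freq
```

If the values were stored as a factor instead, as.numeric() alone would return the internal level codes, so converting through as.character() first would be the safer habit.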

table(values)
values
1 2 3 4 5 6 7
1 2 2 1 1 2 1

Related

Convert the frequencies of a list elements (table) to data frame in R

I have a vector like this:
x = c(0,0,1,1,2,3,1,0,4,5,6,4,3,2,1,1,0,2,3)
and I need to create a data frame of frequencies, where the column names are the unique elements of the vector and the single row contains the frequencies.
If I call
table(x)
I get what I want, but it's not a data frame:
x
0 1 2 3 4 5 6
4 5 3 3 2 1 1
I would like to have a data frame like this:
> mydf
0 1 2 3 4 5 6
1 4 5 3 3 2 1 1
A bit excessive but
mydf <- as.data.frame(t(as.matrix(table(x))))
gives
> mydf
0 1 2 3 4 5 6
1 4 5 3 3 2 1 1
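An alternative sketch that avoids the matrix round trip: pull the counts out as a plain vector, turn them into a one-row data frame, and reuse the table's names as column names. The variable `counts` is an illustrative name, and this is only one of several ways to do it:

```r
x <- c(0, 0, 1, 1, 2, 3, 1, 0, 4, 5, 6, 4, 3, 2, 1, 1, 0, 2, 3)

counts <- table(x)

# t() turns the plain count vector into a 1-row matrix;
# setNames() restores the distinct values as column names
mydf <- setNames(as.data.frame(t(as.vector(counts))), names(counts))

mydf
```

Because the column names are not syntactic R names, they need backticks or string indexing when accessed, e.g. mydf[["0"]].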

Find minimal value for a multiple same keys in table [duplicate]

I have a table that contains multiple rows of data per key, where the key is made up of several columns.
The table looks like this:
A B C
1 1 1 2
2 1 1 3
3 2 1 4
4 1 2 4
5 2 2 3
6 2 3 1
7 2 3 2
8 2 3 2
I have already worked out how to remove fully duplicated rows using the unique command on multiple columns, so data duplication is not a problem.
For every key in the table (columns A and B in the example), I would like to keep only the row with the minimum value in the third column (column C).
In the end, the table should look like this:
A B C
1 1 1 2
3 2 1 4
4 1 2 4
5 2 2 3
6 2 3 1
Thanks for any help; it is really appreciated.
If anything is unclear, feel free to ask.
con <- textConnection(" A B C
1 1 1 2
2 1 1 3
3 2 1 4
4 1 2 4
5 2 2 3
6 2 3 1
7 2 3 2
8 2 3 2")
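df <- read.table(con, header = TRUE)
# Sort so that the smallest C comes first within each (A, B) key
df <- df[with(df, order(A, B, C)), ]
# Keep the first row of each (A, B) key, i.e. the one with the minimum C
df[!duplicated(df[1:2]), ]
# A B C
# 1 1 1 2
# 4 1 2 4
# 3 2 1 4
# 5 2 2 3
# 6 2 3 1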
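The same per-key minimum can also be computed without sorting, using base R's aggregate() with a formula. This is a sketch of an alternative, not the approach from the answer above; `min_by_key` is an illustrative name:

```r
df <- read.table(text = "A B C
1 1 2
1 1 3
2 1 4
1 2 4
2 2 3
2 3 1
2 3 2
2 3 2", header = TRUE)

# For every combination of A and B, take the minimum of C
min_by_key <- aggregate(C ~ A + B, data = df, FUN = min)

min_by_key
```

Note that aggregate() returns only the grouping columns and the summarised column, so if the table has further columns that must be carried along, the order/duplicated approach is the better fit.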

Possible to arrange observations in groups of N that reflect data set proportions using R?

Are there any functions in R that arrange observations in groups of N that reflect, as closely as possible, the data set proportions of certain variables?
For example, if I have a data set with 8 observations and two variables each with two levels with data set proportions as follows:
Var1 Var2
1 0.5 0.5
2 0.5 0.5
Are there any functions that would enable me to optimally sample from the data set to say create groups of 2 observations that reflect the above data set proportions?
Example data:
Data <- read.table(text=" Obs Var1 Var2
1 1 1
2 1 2
3 2 1
4 2 2
5 1 1
6 1 2
7 2 1
8 2 2 ", header=T)
Desired Result:
Result <- read.table(text=" Obs Var1 Var2 Group_ID
1 1 1 1
4 2 2 1
2 1 2 2
3 2 1 2
5 1 1 3
8 2 2 3
6 1 2 4
7 2 1 4 ", header=T)
Note that all groups have proportions of .5 for each level of each variable.
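Whether a candidate grouping actually matches the data set proportions can be checked by tabulating each variable's levels within each group. A minimal sketch, using a hypothetical grouping of the eight example observations (the `Candidate` data frame and the `level_props` helper are illustrative and not part of the original question):

```r
# Hypothetical grouping: each pair contains both levels of both variables
Candidate <- data.frame(
  Obs      = c(1, 4, 2, 3, 5, 8, 6, 7),
  Var1     = c(1, 2, 1, 2, 1, 2, 1, 2),
  Var2     = c(1, 2, 2, 1, 1, 2, 2, 1),
  Group_ID = rep(1:4, each = 2)
)

# Proportion of each level of `var` within each group (rows sum to 1)
level_props <- function(var, group) {
  prop.table(table(group, var), margin = 1)
}

props_var1 <- level_props(Candidate$Var1, Candidate$Group_ID)
props_var2 <- level_props(Candidate$Var2, Candidate$Group_ID)

# TRUE only if every group reproduces the 0.5 / 0.5 split
all(props_var1 == 0.5) && all(props_var2 == 0.5)
```

This only verifies a grouping; it does not construct one.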

Pie chart of frequency counts

I've imported a 1-column excel file using gdata, the data is as follows
3 4 3 3 1 4 1 3 2 3 1 1 4 2 3 3 2 6 1 1 3 3 2 2 2 2 1 3 2 1 6 1 3 2 2 1 2 2 4 2
I'm using the pie(md[, 1]) command to create a pie chart for the data, but the chart is not what I want.
It treats the data as 40 separate observations and sizes each slice by the observed value, rather than drawing 5 segments (1, 2, 3, 4, 6) whose widths reflect how many times each result appears, i.e. the frequency counts of the unique elements in the vector. How can I achieve that?
Use the table function (see ?table) to compute the frequencies before calling pie. With the data read into a vector x:
x <- scan(text = "3 4 3 3 1 4 1 3 2 3 1 1 4 2 3 3 2 6 1 1 3 3 2 2 2 2 1 3 2 1 6 1 3 2 2 1 2 2 4 2")
table(x)
#x
# 1 2 3 4 6
#10 13 11 4 2
Then, to produce the pie chart of frequencies:
pie(table(x))
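If the slices should also show how large each share is, the counts can be turned into labels before plotting. A small sketch along those lines; the names `counts`, `pct` and `slice_labels`, and the chart title, are illustrative choices:

```r
x <- scan(text = "3 4 3 3 1 4 1 3 2 3 1 1 4 2 3 3 2 6 1 1 3 3 2 2 2 2 1 3 2 1 6 1 3 2 2 1 2 2 4 2")

counts <- table(x)

# Share of each distinct value, as a rounded percentage
pct <- round(100 * counts / sum(counts))

# One label per slice: the value followed by its percentage
slice_labels <- paste0(names(counts), " (", pct, "%)")

pie(counts, labels = slice_labels, main = "Frequency of each value")
```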

changing values in dataframe in R based on criteria

I have a data frame that looks like
> mydata
ID Observation X
1 1 3
1 2 3
1 3 3
1 4 3
2 1 4
2 2 4
3 1 8
3 2 8
3 3 8
I have some code that counts the number of observations per ID, determines which IDs have a number of observations that meets a certain criterion (in this case, >=3 observations), and returns a vector with these IDs:
> vals
[1] 1 3
Now I want to manipulate the X values associated with these IDs, e.g. by adding 1 to each value, giving a data frame like this:
> mydata
ID Observation X
1 1 4
1 2 4
1 3 4
1 4 4
2 1 4
2 2 4
3 1 9
3 2 9
3 3 9
I'm pretty new to R and am uncertain how I might do this. It might help to know that X is constant for each ID.
The call mydata$ID %in% vals returns TRUE or FALSE for each row, indicating whether that row's ID is in the vals vector. When you add this to mydata$X, TRUE and FALSE are converted to 1 and 0, respectively, yielding the desired result (%in% binds more tightly than +, so the parentheses are optional):
mydata$X <- mydata$X + (mydata$ID %in% vals)
# mydata
# ID Observation X
# 1 1 1 4
# 2 1 2 4
# 3 1 3 4
# 4 1 4 4
# 5 2 1 4
# 6 2 2 4
# 7 3 1 9
# 8 3 2 9
# 9 3 3 9
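Adding the logical vector works because the increment happens to be 1. For other changes (adding 5, multiplying, and so on), the same logical vector can be used as a row index instead. A sketch under that assumption, which also shows one possible way of deriving `vals` from the data; the names `n_obs` and `sel` are illustrative:

```r
mydata <- data.frame(
  ID          = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  Observation = c(1, 2, 3, 4, 1, 2, 1, 2, 3),
  X           = c(3, 3, 3, 3, 4, 4, 8, 8, 8)
)

# Number of observations per ID, then the IDs with at least 3 of them
n_obs <- table(mydata$ID)
vals <- as.numeric(names(n_obs)[n_obs >= 3])

# Logical index of the rows whose ID is in vals
sel <- mydata$ID %in% vals

# Modify X only on the selected rows; any expression could replace "+ 1"
mydata$X[sel] <- mydata$X[sel] + 1

mydata
```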

Resources