Use group_by to find the percentage within a subgroup and summarise

I am new to dplyr and have tried the following transformation without luck. I searched online and found an example that does the same with ddply, but I want to use dplyr.

I have the following data:

month type count
1 Feb-14 bbb 341
2 Feb-14 ccc 527
3 Feb-14 aaa 2674
4 Mar-14 bbb 811
5 Mar-14 ccc 1045
6 Mar-14 aaa 4417
7 Apr-14 bbb 1178
8 Apr-14 ccc 1192
9 Apr-14 aaa 4793
10 May-14 bbb 916
.. ... ... ...

I want to use dplyr to calculate the percentage of each type (aaa, bbb, ccc) at a month level, i.e.

month type count per
1 Feb-14 bbb 341 9.6%
2 Feb-14 ccc 527 14.87%
3 Feb-14 aaa 2674 ..
.. ... ... ...

I tried this:

data %>%
  group_by(month, type) %>%
  summarise(count / sum(count))

This gives 1 for every value. How do I calculate sum(count) over all the types within a month?

Try

library(dplyr)
data %>%
  group_by(month) %>%
  mutate(countT = sum(count)) %>%            # total count per month
  group_by(type, .add = TRUE) %>%            # use add = TRUE on dplyr < 1.0.0
  mutate(per = paste0(round(100 * count / countT, 2), '%'))
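
The original attempt returns 1 because grouping by both month and type leaves a single row in each group, so count / sum(count) divides each count by itself. As a minimal sketch, assuming the data frame holds only the first nine rows shown in the question (rebuilt by hand here, with month and type as character columns), grouping by month alone should be enough:

```r
library(dplyr)

# First nine rows from the question, rebuilt by hand (an assumption:
# the real data has more rows and may store month as a factor)
data <- data.frame(
  month = rep(c("Feb-14", "Mar-14", "Apr-14"), each = 3),
  type  = rep(c("bbb", "ccc", "aaa"), times = 3),
  count = c(341, 527, 2674, 811, 1045, 4417, 1178, 1192, 4793),
  stringsAsFactors = FALSE
)

result <- data %>%
  group_by(month) %>%                      # one group per month
  mutate(per = paste0(round(100 * count / sum(count), 2), "%")) %>%
  ungroup()                                # drop grouping for later steps

print(result)
```

Here sum(count) is evaluated within each month, so every row is divided by its own month's total. The trailing ungroup() is optional and only keeps later operations from being grouped by accident.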

We can also use left_join after computing sum(count) by month.
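
A minimal sketch of that join-based approach, again assuming only the first nine rows from the question (rebuilt by hand) and using the illustrative names totals and result_join, which are not from the original:

```r
library(dplyr)

# First nine rows from the question, rebuilt by hand (an assumption)
data <- data.frame(
  month = rep(c("Feb-14", "Mar-14", "Apr-14"), each = 3),
  type  = rep(c("bbb", "ccc", "aaa"), times = 3),
  count = c(341, 527, 2674, 811, 1045, 4417, 1178, 1192, 4793),
  stringsAsFactors = FALSE
)

# One row per month holding that month's total count
totals <- data %>%
  group_by(month) %>%
  summarise(countT = sum(count))

# Attach the monthly total to every original row, then compute the share
result_join <- data %>%
  left_join(totals, by = "month") %>%
  mutate(per = paste0(round(100 * count / countT, 2), "%"))

print(result_join)
```

This keeps the monthly totals as a separate table, which can be handy if they are needed elsewhere; for the percentage alone, the grouped mutate is shorter.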

Or use the data.table option.

library(data.table)
setkey(setDT(data), month)[data[, list(count = sum(count)), month],
  per := paste0(round(100 * count / i.count, 2), '%')][]
