Welcome to MLink Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Add new column and aggregate data

I am new to R programming and am stuck on a problem.

Here's a sample dataset:

df <- data.frame(
  area_id = c(31, 34, 36, 33, 28, 35, 31, 34, 36, 33, 28, 35),
  description = c('paramount', 'sony', 'star', 'miramax', 'pixar', 'zee',
                  'paramount', 'sony', 'star', 'miramax', 'pixar', 'zee'),
  footfall = c(200, 354, 543, 123, 456, 634, 356, 765, 345, 235, 657, 524),
  income = c(21000, 19000, 35000, 18000, 12000, 190000,
             21000, 19000, 35000, 18000, 12000, 190000),
  year = c(2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020)
)

Now, I have two requirements:

  1. Add a column named "region" whose values depend on "area_id": areas with "area_id" 28, 34, or 36 should get the value "West", and areas with "area_id" 31, 33, or 35 should get the value "East".

  2. Produce a summary table that aggregates the data by region and breaks it out by year. The final table should look like the one below:

[image: expected summary table, not preserved]

Can anyone please help me out?



1 Answer


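One way to do this with the tidyverse: label each row's region from its area_id, reshape footfall and income into long format, sum by region, measure, and year, then spread the years back into columns and add a total.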

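library(tidyverse)

# area_ids that belong to the West region; every other area_id is treated as East
west <- c(28, 34, 36)

df2 <- df %>%
  # requirement 1: derive region from area_id
  mutate(region = case_when(area_id %in% west ~ "West",
                            TRUE ~ "East")) %>%
  # stack footfall and income so both are summed in one pass
  pivot_longer(cols = c(footfall, income), names_to = "Header", values_to = "val") %>%
  group_by(region, Header, year) %>%
  # "drop_last" is the default behaviour; stating it avoids the grouping message
  summarise(val = sum(val), .groups = "drop_last") %>%
  # requirement 2: one column per year, plus a total per row
  pivot_wider(id_cols = c(region, Header), names_from = year, values_from = val) %>%
  mutate(Total = `2019` + `2020`)

df2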

# A tibble: 4 x 5
# Groups:   region, Header [4]
  region Header   `2019` `2020`  Total
  <chr>  <chr>     <dbl>  <dbl>  <dbl>
1 East   footfall    957   1115   2072
2 East   income   229000 229000 458000
3 West   footfall   1353   1767   3120
4 West   income    66000  66000 132000
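
For comparison, the same totals can be reached without any packages. The following is only a rough base R sketch, not part of the original answer: it uses ifelse() for the region column and tapply() for the sums, omits the unused description column, and the helper name sum_by_region_year is made up for illustration.

```r
# Sample data from the question (description column omitted, it is not needed here)
df <- data.frame(
  area_id  = c(31, 34, 36, 33, 28, 35, 31, 34, 36, 33, 28, 35),
  footfall = c(200, 354, 543, 123, 456, 634, 356, 765, 345, 235, 657, 524),
  income   = c(21000, 19000, 35000, 18000, 12000, 190000,
               21000, 19000, 35000, 18000, 12000, 190000),
  year     = rep(c(2019, 2020), each = 6)
)

# Requirement 1: West for the listed area_ids, East for everything else
west <- c(28, 34, 36)
df$region <- ifelse(df$area_id %in% west, "West", "East")

# Hypothetical helper: sum one measure by region and year, then add a row total
sum_by_region_year <- function(values) {
  totals <- tapply(values, list(region = df$region, year = df$year), sum)
  cbind(totals, Total = rowSums(totals))
}

# Requirement 2: one region-by-year matrix per measure
footfall_tab <- sum_by_region_year(df$footfall)
income_tab   <- sum_by_region_year(df$income)

footfall_tab
income_tab
```

This yields one matrix per measure rather than a single combined table, so the tidyverse pipeline is likely the better fit when the layout from the question is required.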

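The result is stored in df2. Because the printed output still lists groups (region, Header), df2 is a grouped tibble, which its class confirms:

class(df2)

[1] "grouped_df" "tbl_df"     "tbl"        "data.frame"

This is not identical to class(df), which is just "data.frame", but df2 still inherits from data.frame and can be used wherever a data frame is expected.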
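
If a plain, ungrouped tibble is preferred, one option is to drop the grouping inside summarise(). This is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument) and the sample data from the question; the name df2_flat is introduced here purely for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (description column omitted, it is not needed here)
df <- data.frame(
  area_id  = c(31, 34, 36, 33, 28, 35, 31, 34, 36, 33, 28, 35),
  footfall = c(200, 354, 543, 123, 456, 634, 356, 765, 345, 235, 657, 524),
  income   = c(21000, 19000, 35000, 18000, 12000, 190000,
               21000, 19000, 35000, 18000, 12000, 190000),
  year     = rep(c(2019, 2020), each = 6)
)

west <- c(28, 34, 36)

df2_flat <- df %>%
  mutate(region = if_else(area_id %in% west, "West", "East")) %>%
  pivot_longer(c(footfall, income), names_to = "Header", values_to = "val") %>%
  group_by(region, Header, year) %>%
  # .groups = "drop" returns an ungrouped tibble after summarising
  summarise(val = sum(val), .groups = "drop") %>%
  pivot_wider(names_from = year, values_from = val) %>%
  mutate(Total = `2019` + `2020`)

class(df2_flat)          # "tbl_df" "tbl" "data.frame"
is.data.frame(df2_flat)  # TRUE
```

Calling ungroup() at the end of the original pipeline should have the same effect; as.data.frame(df2_flat) converts the result to a base data frame if the tibble classes are not wanted.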

