Welcome to MLink Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Base R line plot: visualize all countries separately with mfrow

Hello Stack Overflow community,

I want to create a line chart for every country in the data set (x = year, y = BMI), using only base R. The problem is that R generates a separate plot for each country. I want a single figure that contains all countries, with a separate panel (and its own margins) for each country.

Thank you for helping.

Dataset: https://github.com/tanaytuncer/LifeExpectancy_BMI

Code:

path2 <- "/Users/tanaytuncer/Desktop/Quantitative Datenanalyse/BMI.csv"
data <- read.csv(path2, check.names = FALSE)
data <- data[-1:-3, ]
names(data)[1] <- "country"

data <-  data %>%
  mutate(across(-country, parse_number)) %>%
  gather("year", "BMI", 2:17)


df_BMI4 <- data %>%
  select(country, BMI, year)
View(df_BMI4)

par(mfrow=c(50,4), mar(4, 3, 3, 1))
for (i in df_BMI4$country) {
  country <- subset(df_BMI4, country == i)
  plot(country$year, country$BMI, type="l", main = i, add = TRUE)
} 


1 Answer


Your data are stored as character strings. To get the average value and the confidence bounds, you can split the strings in X at appropriate patterns and convert the pieces to numeric. Note, however, that you have 195 countries, which would make a single plot unreadable, so I'll show the approach on a subset.

First we reshape your data into the long format dl (I use reshape here where you used tidyr::gather). The result contains some "No data" values, which we mark as NA.

dl <- `rownames<-`(reshape(d, idvar="country", varying=2:17, direction="long", sep="", 
              timevar="year"), NULL)

dl$X <- ifelse(dl$X == "No data", NA, dl$X)
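
To see what the reshape call does, here is a minimal sketch on a made-up two-country table. The names d_toy and dl_toy, the columns X2015 and X2016, and all values are assumptions for illustration; they only mimic the X prefix that read.csv adds to column names starting with a digit.

```r
# Toy wide table; names and values are invented for illustration
d_toy <- data.frame(country = c("A", "B"),
                    X2015 = c("21.5 [20.1-22.9]", "No data"),
                    X2016 = c("21.7 [20.3-23.1]", "24.0 [23.1-24.9]"),
                    stringsAsFactors = FALSE)

# sep = "" splits "X2015" into the value name "X" and the time 2015
dl_toy <- reshape(d_toy, idvar = "country", varying = 2:3,
                  direction = "long", sep = "", timevar = "year")
rownames(dl_toy) <- NULL

# Mark missing entries as NA, as in the step above
dl_toy$X <- ifelse(dl_toy$X == "No data", NA, dl_toy$X)
dl_toy
```

Each country now has one row per year, with the year in its own column and the raw string in X.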

Then we split the strings on " [" or "]" or "-" using the regular expression " \\[|\\]|-" in strsplit (the backslashes must be doubled inside an R string). This gives a list in which each element holds three values, which we rbind and type.convert from "character" to "numeric"; we also set proper names using setNames. We then cbind the result to the first two columns of our long data set.

num <- setNames(type.convert(do.call(rbind.data.frame, strsplit(dl$X, " \\[|\\]|-")),
                             as.is = TRUE),  # as.is avoids a warning in R >= 4.0
                c("bmi", "lo", "up"))
dl <- cbind(dl[1:2], num)[order(dl$country, dl$year), ]
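
As a quick check of the splitting step, the following sketch runs the same regular expression on a few sample strings. The "value [lower-upper]" layout and the numbers are assumed from the pattern above, not copied from the data set, and vapply is used here instead of rbind.data.frame only to keep the toy example short.

```r
# Sample strings in the assumed "value [lower-upper]" layout (invented values)
x <- c("21.5 [20.1-22.9]", "No data", "25.0 [24.2-25.8]")
x[x == "No data"] <- NA

# Split on " [", "]" or "-"; an NA input stays a single NA
parts <- strsplit(x, " \\[|\\]|-")

# Take three pieces per string and convert them to numeric;
# vapply returns a 3 x n matrix, so transpose it to n x 3
num_toy <- t(vapply(parts, function(p) as.numeric(p[1:3]), numeric(3)))
colnames(num_toy) <- c("bmi", "lo", "up")
num_toy
```

The row for the missing entry comes out as three NA values, so it simply leaves a gap in the plotted line.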

Now we extract some values we need, unique countries, years and the range.

cy <- unique(dl$country)
yr <- unique(dl$year)
rg <- range(dl[3:5], na.rm = TRUE)  ## common y-axis range across all panels

This subsets the countries from 195 to 35 for demonstration purposes:

cy <- cy[1:(7*5)]

Finally, we call matplot once per country inside sapply; par(mfrow) arranges the panels in a 7 x 5 grid.

x11()  ## opens a window (dev.new() is a platform-independent alternative)
op <- par(mfrow=c(7, 5), mar=c(4, 4, 3, 1))
sapply(cy, function(x) {
  matplot(dl[dl$country %in% x, 3:5], type="l", lty=c(1, 2, 2), col=4, lwd=2,
          main=x, xlab="year", ylab="BMI", xaxt="n", ylim=rg)
  axis(1, at=axTicks(1), labels=yr[axTicks(1)])
})
par(op)

You may want to put this into a png or pdf as shown in this answer.
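
A minimal sketch of the pdf route, under stated assumptions: the data frame toy, its values, and the output path are invented stand-ins, and the plotting is reduced to a plain line per country. With a pdf device open, base R starts a new page whenever the mfrow grid is full, so the same loop could in principle cover all 195 countries across several pages.

```r
# Toy long-format data standing in for dl (values are made up)
set.seed(1)
toy <- expand.grid(year = 2001:2016, country = paste("Country", 1:12),
                   stringsAsFactors = FALSE)
toy$bmi <- 25 + rnorm(nrow(toy), sd = 0.3)

out <- file.path(tempdir(), "bmi_panels.pdf")  # hypothetical output path
pdf(out, width = 10, height = 8)
op <- par(mfrow = c(2, 3), mar = c(4, 4, 3, 1))

# 12 countries on a 2 x 3 grid: the device fills two pages
for (cn in unique(toy$country)) {
  sub <- toy[toy$country == cn, ]
  plot(sub$year, sub$bmi, type = "l", main = cn,
       xlab = "year", ylab = "BMI", ylim = range(toy$bmi))
}

par(op)
dev.off()  # close the device so the file is written
```

The png device works the same way for a single page; for several pages it needs a file name pattern such as "bmi_%02d.png".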

Result: a 7 x 5 grid of panels, one per country (image not included).


Data:

d <- read.csv("https://raw.githubusercontent.com/tanaytuncer/LifeExpectancy_BMI/main/BMI.csv")[-(1:3), ]
names(d)[1] <- "country"
