I would like to use R tibbles in VS Code, but I am seeing odd character formatting in the tibble output.
Take the penguins dataset from the palmerpenguins package. The raw .csv looks like this:
"","species","island","bill_length_mm","bill_depth_mm","flipper_length_mm","body_mass_g","sex","year"
"1","Adelie","Torgersen",39.1,18.7,181,3750,"male",2007
"2","Adelie","Torgersen",39.5,17.4,186,3800,"female",2007
"3","Adelie","Torgersen",40.3,18,195,3250,"female",2007
"4","Adelie","Torgersen",NA,NA,NA,NA,NA,2007
"5","Adelie","Torgersen",36.7,19.3,193,3450,"female",2007
"6","Adelie","Torgersen",39.3,20.6,190,3650,"male",2007
When using R with VS Code, the output looks like this:
library(palmerpenguins)
head(penguins)
# A tibble: 6 × 8
species island bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex year
<fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
4 Adelie Torgersen NA NA NA NA NA 2007
5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
# … with abbreviated variable names ¹​flipper_length_mm, ²​body_mass_g
This issue is only present on my work computer. My personal computer prints the tibble in VS Code with the correct formatting.
I suspect the issue revolves around character encoding, but I'm not sure which setting needs to be changed. My encoding settings in VS Code are shown below. Any guidance on which settings would need to be changed is much appreciated.
I want the text chunk output of my R Markdown document, when converted to .md, to be wrapped in code ticks (``` * ```).
For example as it is now, an Rmarkdown document like so:
---
title: 'This is a test title'
date: '`r Sys.Date()`'
output:
md_document:
variant: commonmark #or gfm
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE
)
```
```{r echo=FALSE}
library(palmerpenguins)
```
```{r echo=TRUE}
penguins
```
This is some text.
Rendered with rmarkdown::render("path/to/test.Rmd") to:
```
penguins
```
## # A tibble: 344 × 8
## species island bill_le…¹ bill_…² flipp…³ body_…⁴ sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
## 2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
## 3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
## 4 Adelie Torgersen NA NA NA NA <NA> 2007
## 5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
## 7 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
## 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007
## 9 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007
## 10 Adelie Torgersen 42 20.2 190 4250 <NA> 2007
## # … with 334 more rows, and abbreviated variable names
## # ¹bill_length_mm, ²bill_depth_mm, ³flipper_length_mm,
## # ⁴body_mass_g
This is some text.
How do you get the table (penguins) that is output in the .md document to be wrapped in code ticks (```)?
At the moment, I can get it to work if I use:
---
output:
html_document:
keep_md: TRUE
---
In this example the .md that is generated and kept has all text output surrounded by code ticks. How do I get this without writing an html document?
I've tried updating the S3 method knit_print() but can't figure out how to get it to work. I've also tried various flavors of markdown and looked at pandoc add-ons, without success. I've been googling for hours; please help.
A simple approach is to pass a class to the class.output chunk option; the chunk output is then wrapped in code ticks (triple backticks) automatically.
To get this behavior for all output, add class.output="output" to knitr::opts_chunk$set.
---
title: 'This is a test title'
date: '`r Sys.Date()`'
output:
md_document:
variant: commonmark #or gfm
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE,
class.output="output"
)
```
```{r echo=FALSE}
library(palmerpenguins)
```
```{r echo=TRUE}
penguins
```
```{r}
1 + 1
```
This is some text.
md output
``` r
penguins
```
``` output
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_…¹ body_…² sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
## 2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
## 3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
## 4 Adelie Torgersen NA NA NA NA <NA> 2007
## 5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
## 7 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
## 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007
## 9 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007
## 10 Adelie Torgersen 42 20.2 190 4250 <NA> 2007
## # … with 334 more rows, and abbreviated variable names ¹flipper_length_mm,
## # ²body_mass_g
```
``` r
1 + 1
```
``` output
## [1] 2
```
This is some text.
It feels a bit hacky, but you can cat the backticks as output from extra code chunks placed around the real one, dropping the comment symbol (##) with comment=NA. (I'm using four backticks for the code blocks below to get the syntax highlighting correct.)
````{r echo=FALSE, comment=NA}
cat('```')
````
````{r echo=FALSE}
penguins
````
````{r echo=FALSE, comment=NA}
cat('```')
````
I am having trouble writing a formula in R that outputs only the rows that contain "N/A". I assume filter_all would be involved, since this would be applied to all of the columns in the dataset, but please let me know!
filter_all is superseded. We can use filter with if_all:
library(dplyr)
df1 %>%
filter(if_all(everything(), is.na))
If we are using the penguins dataset, not all columns have NAs
library(palmerpenguins)
data(penguins)
> colSums(is.na(penguins))
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex
0 0 2 2 2 2 11
year
0
i.e. 'species', 'island' and 'year' have 0 NAs, so the above code with if_all returns 0 rows, because no single row is NA in all of the columns. We may need if_any instead:
penguins %>%
filter(if_any(everything(), is.na))
# A tibble: 11 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
<fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
1 Adelie Torgersen NA NA NA NA <NA> 2007
2 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007
3 Adelie Torgersen 42 20.2 190 4250 <NA> 2007
4 Adelie Torgersen 37.8 17.1 186 3300 <NA> 2007
5 Adelie Torgersen 37.8 17.3 180 3700 <NA> 2007
6 Adelie Dream 37.5 18.9 179 2975 <NA> 2007
7 Gentoo Biscoe 44.5 14.3 216 4100 <NA> 2007
8 Gentoo Biscoe 46.2 14.4 214 4650 <NA> 2008
9 Gentoo Biscoe 47.3 13.8 216 4725 <NA> 2009
10 Gentoo Biscoe 44.5 15.7 217 4875 <NA> 2009
11 Gentoo Biscoe NA NA NA NA <NA> 2009
Or, if we want to consider only the columns that contain at least one NA and return the rows where all of those columns are NA:
penguins %>%
filter(if_all(where(~ any(is.na(.x))), is.na))
# A tibble: 2 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
<fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
1 Adelie Torgersen NA NA NA NA <NA> 2007
2 Gentoo Biscoe NA NA NA NA <NA> 2009
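For comparison, the same any/all distinction can be sketched in base R by counting the NAs per row. This is only a minimal sketch on a made-up data frame (df, x and y are hypothetical names, not from the question):

```r
# Toy data frame (hypothetical): row 1 is complete, row 2 has one NA,
# row 3 is NA in every column
df <- data.frame(
  x = c(1, NA, NA),
  y = c("a", "b", NA)
)

# Number of NA cells in each row
na_count <- rowSums(is.na(df))

any_na <- df[na_count > 0, ]          # rows with at least one NA (like if_any)
all_na <- df[na_count == ncol(df), ]  # rows where every column is NA (like if_all)
no_na  <- df[complete.cases(df), ]    # rows without any NA
```

Note that these checks, like is.na in the dplyr versions, only see real NA values. If the file stores missing values as the literal text "N/A", one option is to convert them while reading, e.g. with the na.strings argument of read.csv.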
I have the following data set containing the quantity of phytosanitary products purchased per zip code in France between 2015 and 2019, along with its classification (other, toxic, mineral, organic).
The data frame looks like this: for each zip_code, year and classification you can see the quantity that was purchased.
zip_code  year  classification  total_quantity
01000     2015  other           44.305436
01000     2015  toxic           212.783330
01000     2015  mineral         value
01000     2015  organic         value
01000     2016  other           value
01000     2016  toxic           value
01000     2016  mineral         value
it follows the same pattern .....
zip_code  year  classification  total_quantity
01000     2019  organic         value
01090     2015  other           value
But I would like something with only one entry per zip code, like this (of course going up to 2019 and not stopping at 2016 as I did in my example):
zip_code  other_total-quantity-2015  Toxic_total-quantity-2015  Mineral_total-quantity-2015  organic_total-quantity-2015  other_total-quantity-2016  Toxic_total-quantity-2016
01000     value                      value                      value                        value                        value                      value
01090     value                      value                      value                        value                        value
I tried to do this using the reshape function, but the closest I got to what I want is a table where the zip_code is repeated 4 times (once for every classification).
Thank you
The following uses pivot_wider from the tidyr package to do the reshape. That's a personal preference, but it may be helpful.
library(tidyr)
library(dplyr)
## or install and load these and related packages
## in bulk through the `tidyverse` package
df %>%
pivot_wider(
id_cols = zip_code,
names_from = c(year, classification),
values_from = total_quantity,
names_prefix = 'total-quantity' ## redundant, actually
)
I created a sample dataset that looks like this:
# A tibble: 40 × 4
zip_code year classification total_quantity
<dbl> <dbl> <chr> <dbl>
1 1000 2015 other 61.1
2 1000 2015 toxic 32.8
3 1000 2015 mineral 11.4
4 1000 2015 organic 38.9
5 1000 2016 other 18.8
6 1000 2016 toxic 65.0
7 1000 2016 mineral 0.382
8 1000 2016 organic 18.8
9 1000 2017 other 96.0
10 1000 2017 toxic 60.4
# … with 30 more rows
If you run the following code you will get your requested table:
library(tidyr)
library(dplyr)
df %>%
pivot_wider(
id_cols = zip_code,
names_from = c(year, classification),
values_from = total_quantity,
names_glue = "{classification}_total-quantity-{year}"
)
Output:
# A tibble: 2 × 21
zip_code `other_total-quantity…` `toxic_total-q…` `mineral_total…` `organic_total…` `other_total-q…` `toxic_total-q…` `mineral_total…` `organic_total…` `other_total-q…` `toxic_total-q…` `mineral_total…` `organic_total…` `other_total-q…` `toxic_total-q…` `mineral_total…` `organic_total…` `other_total-q…` `toxic_total-q…` `mineral_total…` `organic_total…`
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1000 61.1 32.8 11.4 38.9 18.8 65.0 0.382 18.8 96.0 60.4 80.4 81.2 47.4 87.4 52.2 9.65 19.7 11.3 45.7 12.8
2 1090 75.2 40.1 47.9 10.3 86.2 97.9 11.2 93.3 55.0 88.5 63.5 46.6 5.30 13.1 20.4 83.9 58.6 61.3 6.56 46.7
As you can see, the year and classification are added to the column names using names_glue in the pivot_wider function.
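To make the names_glue approach easy to try, here is a minimal self-contained sketch. The data frame is a made-up sample (only the first two quantities are rounded from the question; the rest are placeholders), and it stores zip_code as character, which is an assumption on my part so that the leading zero of codes like 01000 is kept:

```r
library(tidyr)

# Hypothetical sample: two zip codes, two years, two classifications.
# zip_code is character so that "01000" does not become 1000.
df <- data.frame(
  zip_code = rep(c("01000", "01090"), each = 4),
  year = rep(c(2015, 2015, 2016, 2016), times = 2),
  classification = rep(c("other", "toxic"), times = 4),
  total_quantity = c(44.3, 212.8, 10, 20, 30, 40, 50, 60)
)

# One row per zip code; one column per classification/year combination
wide <- pivot_wider(
  df,
  id_cols = zip_code,
  names_from = c(year, classification),
  values_from = total_quantity,
  names_glue = "{classification}_total-quantity-{year}"
)

print(wide)
```

Because the generated names contain hyphens, they are not syntactic R names, so they need backticks when referenced, e.g. wide$`other_total-quantity-2015`.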
I can't upload the file to Stack Overflow, but I have a PDF containing a table spanning 3 pages. After loading library(pdftools) and calling pdf_text(), I get a 3-element character vector in which each element is one long string containing all the text from one page.
library(pdftools)
df <- pdf_text("file.pdf")
The data I need is on the 2nd page. I get the output:
df[2]
All Households 19,015 10,030 8,985 3,635 585 3,055 19.1 5.8 34.0\n\nHousing above standards 12,365 8,225 4,145 0 0 0 0.0 0.0 0.0\n\nBelow one or more housing standards 6,650 1,805 4,845 3,640 585 3,055 54.7 32.4 63.1\n\nBelow affordability standard12 4,885 1,230 3,660 3,125 535 2,590 64.0 43.5 70.8\n\nBelow adequacy standard13 1,360 555 810 425 75 350 31.2 13.5 43.2\n\n\n\n\n
I want to isolate the row "Below one or more housing standards" and the 8th column which contains the value "54.7".
I believe the next steps are to split the long string into lines by the line break character "\n", identify the applicable line, split the line into words, and select the 8th word.
I've tried splitting into lines using:
library(stringr)
lines <- df[2] %>% str_split("\n")
It returns a "List of 1" and I'm not sure how to work with it. Any suggestions on the syntax?
It's a bit convoluted to get to the original file.
https://www03.cmhc-schl.gc.ca/hmip-pimh/en/#Profile/126504/5/Alta%20Vista
Core Housing Need -> Full Report -> Export.
Oddly there isn't a way to just download a CSV.
Use readLines (which doesn't use the scan(text= ...) pathway and therefore needs textConnection).
library(pdftools)
#Using poppler version 0.62.0
df <- pdf_text("Downloads/TableExport.pdf")
str(df)
# chr [1:3] "Core Housing Need (2016 Statistics Canada's Census) — Alta Vista\n H "| __truncated__ ...
# for each page read in with readLines to make character vectors
# separated by \n
lines <- lapply(df, function(t) readLines( textConnection(t)) )
Then search for the line with the target:
lines[[2]][grep("Below one or more housing standards", lines[[2]])]
[1] "Below one or more housing standards 6,650 1,805 4,845 3,640 585 3,055 54.7 32.4 63.1"
If you assigned that value to the name target you could get the 8th column with this rather baroque regex:
sub("(Below one or more housing standards)([ ]*\\d*[,]*\\d*){6}[ ]*(\\d*[.]*\\d*)(.*)", "\\3", target)
#[1] "54.7"
Notice the need to allow commas and decimal points in the numeric specifications. As written it may not be totally general since the first six of the numeric columns are only allowed to have commas and not decimals. I guess you could allow a character class like "[.,]" to be more general. Or even: "([ ]*\\d*[,]*\\d+[.]*\\d*){6}" (lightly tested). I suspect there are packages that will handle tabular pdf formatting in a more principled manner.
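The steps proposed in the question (split into lines, find the line, split into words, pick one) can also be sketched in base R without a regex. This is only a sketch: page is a shortened copy of the string shown in the question, and it assumes the nine numeric fields are always the last nine whitespace-separated tokens on the line. It also shows how to deal with the "List of 1": strsplit (like str_split) returns a list with one element per input string, so [[1]] pulls out the character vector of lines.

```r
# Shortened copy of the page text from the question
page <- paste0(
  "All Households 19,015 10,030 8,985 3,635 585 3,055 19.1 5.8 34.0\n\n",
  "Housing above standards 12,365 8,225 4,145 0 0 0 0.0 0.0 0.0\n\n",
  "Below one or more housing standards 6,650 1,805 4,845 3,640 585 3,055 54.7 32.4 63.1\n"
)

# strsplit returns a list with one element per input string; take the first
lines <- strsplit(page, "\n", fixed = TRUE)[[1]]

# Keep the line that starts with the row label
target <- lines[startsWith(lines, "Below one or more housing standards")]

# Split on runs of whitespace; the label has a variable number of words,
# so count the numeric fields from the end of the line instead
tokens <- strsplit(trimws(target), "\\s+")[[1]]
values <- tail(tokens, 9)

# 7th numeric field (8th column when the label counts as the first column)
pct_total <- as.numeric(gsub(",", "", values[7]))
print(pct_total)
```

Counting from the end sidesteps the problem that the row label itself contains spaces, which is why simply taking the 8th word of the line would not work here.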
This does not use pdftools, but I hope it is helpful to you. First, use the rvest package to read the URL of this table, then use html_table to extract it into a table. After that, some manual manipulation is needed:
library(tidyverse)
library(rvest)
url = "https://www03.cmhc-schl.gc.ca/hmip-pimh/en/Profile/DetailsCoreHousingNeed?geographyId=126504&t=5"
# Read the url
doc = rvest::read_html(url)
# Extract the table, and provide anonymous V<x> names
table = rvest::html_table(doc)[[1]]
names(table) = paste0("V",1:ncol(table))
# drop the first two rows
table <- table %>% filter(row_number()>2)
# Manually, identify the split rows (i.e. subheadings)
split_rows = c(1,9,24,32,36,40,44,48,55,62)
# Extract the subheadings
sub_table_names = table %>% filter(row_number() %in% split_rows) %>% pull(V1)
# Now, use lapply to filter the rows that are between the splits, and use as.numeric and str_remove_all to convert to numeric values
tables = lapply(seq_along(split_rows), function(x) {
table %>%
filter(between(row_number(), split_rows[x]+1, split_rows[x+1]-1 )) %>%
mutate(across(V2:V10, ~as.numeric(str_remove_all(.x,","))))
})
# Name the list of tables
names(tables) <- sub_table_names
Output:
$`Age of primary household maintainer3`
# A tibble: 7 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 15 to 24 years 1030 45 980 220 0 220 21.4 0 22.4
3 25 to 34 years 2700 715 1990 555 40 515 20.6 5.6 25.9
4 35 to 44 years 2795 1360 1440 545 25 520 19.5 1.8 36.1
5 45 to 54 years 3565 2005 1565 740 135 610 20.8 6.7 39
6 55 to 64 years 3535 2225 1315 615 155 455 17.4 7 34.6
7 65 years and over 5380 3685 1700 960 220 735 17.8 6 43.2
$`Household Type4`
# A tibble: 14 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Couple with children 4360 3145 1220 585 100 485 13.4 3.2 39.8
3 Couple without children 4755 3195 1555 390 70 315 8.2 2.2 20.3
4 Senior-led (65+) couple without children 2030 1695 335 140 50 90 6.9 2.9 26.9
5 Lone-parent household 2220 810 1405 845 135 710 38.1 16.7 50.5
6 Female lone-parent household 1845 660 1190 730 105 625 39.6 15.9 52.5
7 Male lone-parent household 370 155 220 115 30 85 31.1 19.4 38.6
8 Multiple-family household 265 165 100 70 20 45 26.4 12.1 45
9 One-person household 6075 2385 3685 1525 235 1290 25.1 9.9 35
10 Female one-person households 3615 1590 2025 920 135 795 25.4 8.5 39.3
11 Senior (65+) female living alone 1810 980 830 525 90 435 29 9.2 52.4
12 Male one-person household 2455 800 1660 605 105 500 24.6 13.1 30.1
13 Senior (65+) male living alone 600 350 250 170 50 120 28.3 14.3 48
14 Other non-family household 1345 330 1015 230 25 205 17.1 7.6 20.2
$`Immigrant households5`
# A tibble: 7 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Non-immigrant 12500 7115 5395 1665 230 1440 13.3 3.2 26.7
3 Non-permanent resident6 430 25 400 140 10 130 32.6 40 32.5
4 Immigrant 6085 2890 3190 1825 345 1485 30 11.9 46.6
5 Landed before 2001 4105 2480 1620 1065 275 790 25.9 11.1 48.8
6 Landed 2001 to 2010 1340 340 1000 460 55 400 34.3 16.2 40
7 Recent immigrants (landed 2011-2016)7 640 70 575 310 10 295 48.4 14.3 51.3
$`Households with seniors`
# A tibble: 3 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Household has at least one senior (65 or older) 5910 4085 1825 1015 245 770 17.2 6 42.2
3 Other household type 13105 5945 7155 2625 340 2285 20 5.7 31.9
$`Households with children under 18`
# A tibble: 3 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Household has at least one child less than 18 years old 4465 2455 2005 1140 170 975 25.5 6.9 48.6
3 Other household type 14550 7575 6980 2500 420 2080 17.2 5.5 29.8
$`Activity limitations8`
# A tibble: 3 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Household has at least one person with activity limitations 10955 5830 5120 2285 385 1895 20.9 6.6 37
3 All other households 8060 4195 3865 1360 200 1160 16.9 4.8 30
$`Aboriginal households9`
# A tibble: 3 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Aboriginal households 655 215 440 120 20 105 18.3 9.3 23.9
3 Non-Aboriginal households 18355 9815 8540 3515 565 2955 19.2 5.8 34.6
$`Incomes, shelter costs10, and STIRs11`
# A tibble: 6 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Average household income before taxes ($) 96464 134172 54357 29101 31212 28696 NA NA NA
2 Average monthly shelter costs ($) 1256 1408 1085 1039 1243 1000 NA NA NA
3 Average STIR before taxes (%) 24 17.2 31.5 46.8 49.7 46.2 NA NA NA
4 Median household income before taxes ($) 72502 107762 44596 27711 28437 27568 NA NA NA
5 Median monthly shelter costs ($) 1097 1193 1076 1013 1115 1006 NA NA NA
6 Median STIR before taxes (%) 19.3 14 26 43.8 45.8 43.3 NA NA NA
$`Housing standards`
# A tibble: 6 x 10
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 All Households 19015 10030 8985 3635 585 3055 19.1 5.8 34
2 Housing above standards 12365 8225 4145 0 0 0 0 0 0
3 Below one or more housing standards 6650 1805 4845 3640 585 3055 54.7 32.4 63.1
4 Below affordability standard12 4885 1230 3660 3125 535 2590 64 43.5 70.8
5 Below adequacy standard13 1360 555 810 425 75 350 31.2 13.5 43.2
6 Below suitability standard14 1480 210 1270 800 55 745 54.1 26.2 58.7
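If hard-coding the positions of the subheading rows feels brittle, a possible variation is to detect them from the data and split on a running group id. The sketch below is base R on a tiny made-up table (tbl, with only two columns); it assumes that subheading rows can be recognised by an empty value column, which I have not verified against the actual page:

```r
# Tiny stand-in for the scraped table (hypothetical layout):
# subheading rows have an empty V2, data rows have a formatted number
tbl <- data.frame(
  V1 = c("Housing standards", "All Households", "Housing above standards",
         "Households with seniors", "All Households"),
  V2 = c("", "19,015", "12,365", "", "19,015"),
  stringsAsFactors = FALSE
)

# Assumption: a row is a subheading when its value column is empty
is_heading <- tbl$V2 == ""

# Running counter that increases at every subheading
group <- cumsum(is_heading)

# Keep the data rows and convert "19,015" style strings to numbers
body <- tbl[!is_heading, ]
body$V2 <- as.numeric(gsub(",", "", body$V2))

# One data frame per subheading, named after the subheading text
tables <- split(body, group[!is_heading])
names(tables) <- tbl$V1[is_heading]

print(tables[["Housing standards"]])
```

The naming step relies on every subheading being followed by at least one data row; a subheading with no rows under it would need extra handling.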
You could check whether more up-to-date (2018) data is available by following the breadcrumbs to https://www150.statcan.gc.ca/n1/pub/46-25-0001/462500012021001-eng.htm .
However, if you only want one row, it is easy to save the page source with a right click:
<tr>
<th scope="row">Below one or more housing standards</th>
<td>6,650</td>
<td>1,805</td>
<td>4,845</td>
<td>3,640</td>
<td>585</td>
<td>3,055</td>
<td>54.7</td>
<td>32.4</td>
<td>63.1</td>
</tr>
For the headings you need:
HOUSEHOLDS TESTED FOR CORE HOUSING NEED 1 HOUSEHOLDS IN CORE HOUSING NEED 2 % OF HOUSEHOLDS IN CORE HOUSING NEED
TOTAL OWNERS RENTERS TOTAL OWNERS RENTERS TOTAL OWNERS RENTERS
And for the footnotes:
1 Data include all non-farm, non-band, non-reserve private households reporting positive incomes and shelter cost-to-income ratios less than 100 per cent.
2 A household is in core housing need if its housing does not meet one or more standards for housing adequacy (repair), suitability (crowding), or affordability and if it would have to spend 30 per cent or more of its before-tax income to pay the median rent (including utilities) of appropriately sized alternative local market housing. Adequate housing does not require any major repairs, according to residents. Suitable housing has enough bedrooms for the size and make-up of resident households. Affordable housing costs less than 30 per cent of before-tax household income.
You have a PDF and want to work with the raw text, but it's clear there is some issue with the generated searchable text; we can see that in the headings and when copying and pasting, e.g. "Belowone ormore housing standards". So here is the expected extraction from the bottom of page 2:
pdftotext -f 2 -l 2 -nopgbrk -simple -margint 650 tableexport.pdf -
penguins %>%
select(species,island,sex) %>%
rename(island_new=island) %>%
rename_with(penguins,toupper)
This is the code that is causing an error. Can someone solve the problem?
It's implied that the first argument of rename_with is what has been piped to it, so you don't need to pass penguins as the first argument:
penguins %>%
select(species,island,sex) %>%
rename(island_new=island) %>%
rename_with(toupper)
# A tibble: 344 x 3
SPECIES ISLAND_NEW SEX
<fct> <fct> <fct>
1 Adelie Torgersen male
2 Adelie Torgersen female
3 Adelie Torgersen female
4 Adelie Torgersen NA
5 Adelie Torgersen female
6 Adelie Torgersen male
7 Adelie Torgersen female
8 Adelie Torgersen male
9 Adelie Torgersen NA
10 Adelie Torgersen NA
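To see why the original call fails: with the pipe, rename_with(penguins, toupper) becomes rename_with(<piped data>, penguins, toupper), so penguins ends up in the slot where the renaming function is expected. Below is a small self-contained sketch of the working form on a made-up one-row data frame (df is hypothetical); it also shows the .cols argument for renaming only some columns:

```r
library(dplyr)

# Hypothetical one-row stand-in for the selected penguins columns
df <- data.frame(species = "Adelie", island = "Torgersen", sex = "male")

# The piped data is the first argument; only the function is passed explicitly
all_upper <- df %>%
  rename(island_new = island) %>%
  rename_with(toupper)

# .cols restricts the renaming to a selection of columns
some_upper <- df %>%
  rename_with(toupper, .cols = starts_with("s"))

names(all_upper)
names(some_upper)
```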