I tried to cluster my dataset using k-means, but column 9 contains categorical data, so the run failed with this error:
res<-NbClust(mi[2:9],min.nc=2,max.nc=15,method="ward.D2")
Error in t(jeu) %*% jeu :
requires numeric/complex matrix/vector arguments
So I could only run k-means on columns 2 to 8. Is there another way of clustering the data that would let me include column 9 as well?
Data:
df <- structure(list(Name = structure(c(58L, 188L, 40L, 155L, 32L, 88L, 92L, 55L, 135L, 31L, 139L, 26L, 126L, 10L, 166L, 104L, 75L, 180L, 35L, 175L, 77L, 99L, 4L, 71L, 141L, 176L, 53L, 39L, 172L, 196L, 123L, 107L, 16L, 96L, 82L, 185L, 30L, 15L, 94L, 129L, 187L, 151L, 33L, 23L, 28L, 44L, 157L, 69L, 132L, 83L, 131L, 11L, 182L, 181L, 54L, 115L, 116L, 183L, 150L, 195L, 45L, 144L, 1L, 110L, 17L, 114L, 9L, 117L, 112L, 70L, 34L, 169L, 27L, 66L, 3L, 73L, 133L, 91L, 154L, 130L, 160L, 105L, 90L, 165L, 67L, 100L, 162L, 98L, 29L, 68L, 189L, 192L, 102L, 190L, 134L, 136L, 52L, 12L, 81L, 59L, 63L, 122L, 93L, 109L, 178L, 138L, 5L, 43L, 140L, 95L, 2L, 174L, 76L, 51L, 156L, 60L, 149L, 128L, 177L, 142L, 103L, 7L, 8L, 14L, 164L, 74L, 145L, 148L, 113L, 86L, 108L, 48L, 163L, 6L, 186L, 89L, 36L, 191L, 125L, 120L, 62L, 65L, 124L, 168L, 147L, 79L, 173L, 84L, 193L, 25L, 146L, 121L, 127L, 153L, 13L, 106L, 119L, 161L, 49L, 97L, 101L, 61L, 137L, 24L, 85L, 194L, 78L, 41L, 170L, 47L, 118L, 184L, 179L, 72L, 42L, 111L, 87L, 57L, 38L, 37L, 171L, 22L, 50L, 80L, 159L, 18L, 152L, 64L, 56L, 158L, 167L, 46L, 19L, 21L, 20L, 143L), .Label = c("#Mashtag 2013", "#Mashtag 2014", "#Mashtag 2015", "10 Heads High", "5am Saint", "77 Lager", "AB:02", "AB:03", "AB:04", "AB:06", "AB:08", "AB:10", "AB:11", "AB:13", "AB:15", "AB:17", "AB:18", "AB:20", "Ace Of Chinook", "Ace Of Citra", "Ace Of Equinox", "Ace Of Simcoe", "Albino Squid Assasin", "Alice Porter", "All Day Long - Prototype Challenge", "Alpha Dog", "Alpha Pop", "Amarillo - IPA Is Dead", "American Ale", "Anarchist Alchemist", "Arcade Nation", "Avery Brown Dredge", "Baby Dogma", "Baby Saison - B-Sides", "Bad Pixie", "Barley Wine - Russian Doll", "Barrel Aged Albino Squid Assassin", "Barrel Aged Hinterland", "Berliner Weisse With Raspberries And Rhubarb - B-Sides", "Berliner Weisse With Yuzu - B-Sides", "Bitch Please (w/ 3 Floyds)", "Black Dog", "Black Eye Joe (w/ Stone Brewing Co)", "Black Eyed King Imp", "Black Eyed King Imp - Vietnamese Coffee Edition", 
"Black Hammer", "Black Jacques", "Black Tokyo Horizon (w/Nøgne Ã\230 & Mikkeller)", "Blitz Berliner Weisse", "Blitz Series", "Born To Die", "Bounty Hunter - Shareholder Brew", "Bourbon Baby", "Bracken's Porter", "Bramling X", "Brewdog Vs Beavertown", "Brixton Porter", "Buzz", "Candy Kaiser", "Cap Dog (w/ Cap Brewery)", "Catherine's Pony (w/ Beavertown)", "Challenger", "Chaos Theory", "Chili Hammer", "Chinook - IPA Is Dead", "Citra", "Clown King", "Cocoa Psycho", "Coffee Imperial Stout", "Comet", "Dana - IPA Is Dead", "Dead Metaphor", "Dead Pony Club", "Deaf Mermaid - B-Sides", "Devine Rebel (w/ Mikkeller)", "Dog A", "Dog B", "Dog C", "Dog D", "Dog E", "Dog Fight (w/ Flying Dog)", "Dog Wired (w/8 Wired)", "Dogma", "Doodlebug", "Double IPA - Russian Doll", "Edge", "El Dorado - IPA Is Dead", "Electric India", "Ella - IPA Is Dead", "Elvis Juice V2.0 - Prototype Challenge", "Everday Anarchy", "Fake Lager", "Galaxy", "Goldings - IPA Is Dead", "Growler", "Hardcore IPA", "Hardkogt IPA", "HBC 366 - IPA Is Dead", "HBC 369", "Hello My Name Is Beastie", "Hello My Name Is Holy Moose", "Hello My Name Is Ingrid", "Hello My Name Is Little Ingrid", "Hello My Name Is Mette-Marit", "Hello My Name Is PaÌ\210ivi", "Hello My Name is Sonja (w/ Evil Twin)", "Hello My Name is Vladimir", "Hello My Name Is ZeÌ\201 (w/ 2Cabeças)", "Hinterland", "Hobo Pop", "Hop Fiction - Prototype Challenge", "Hopped-Up Brown Ale - Prototype Challenge", "Hoppy Christmas", "Hops Kill Nazis", "Hunter Foundation Pale Ale", "Hype", "India Session Lager - Prototype Challenge", "International Arms Race (w/ Flying Dog)", "Interstellar", "Jack Hammer", "Jasmine IPA", "Jet Black Heart", "Kohatu - IPA Is Dead", "Konnichiwa Kitsune", "Libertine Black Ale", "Libertine Porter", "Lichtenstein Pale Ale", "Lizard Bride - Prototype Challenge", "Lost Dog (w/Lost Abbey)", "Lumberjack Stout", "Magic Stone Dog (w/Magic Rock & Stone Brewing Co.)", "Mandarina Bavaria - IPA Is Dead", "Mango Gose - B-Sides", "Melon And Cucumber IPA - 
B-Sides", "Misspent Youth", "Morag's Mojito - B-Sides", "Moshi Moshi 15", "Motueka", "Movember", "Mr.Miyagi's Wasabi Stout", "Nanny State", "Nelson Sauvin", "Neon Overlord", "Never Mind The Anabolics", "No Label", "Nuns With Guns", "Old World India Pale Ale", "Old World Russian Imperial Stout", "Orange Blossom - B-Sides", "Pale - Russian Doll", "Paradox Islay", "Paradox Islay 2.0", "Paradox Jura", "Peroxide Punk", "Pilsen Lager", "Pioneer - IPA Is Dead", "Prototype 27", "Prototype Helles", "Prototype Pils 2.0", "Pumpkin King", "Punk IPA 2007 - 2010", "Punk IPA 2010 - Current", "Restorative Beverage For Invalids And Convalescents", "Rhubarb Saison - B-Sides", "Riptide", "Russian Doll â\200“ India Pale Ale", "Rye Hammer", "San Diego Scotch Ale (w/Ballast Point)", "Santa Paws", "Shareholder Black IPA 2011", "Ship Wreck", "Shipwrecker Circus (w/ Oskar Blues)", "Simcoe", "Sink The Bismarck!", "Skull Candy", "Sorachi Ace", "Sorachi Bitter - B-Sides", "Spiced Cherry Sour - B-Sides", "Stereo Wolf Stout - Prototype Challenge", "Storm", "Sub Hop", "Sunk Punk", "Sunmaid Stout", "Sunshine On Rye - B-Sides", "The Physics", "This. Is. 
Lager", "TM10", "Trashy Blonde", "Truffle and Chocolate Stout - B-Sides", "U-Boat (w/ Victory Brewing)", "Vagabond Pale ALe - Prototype Challenge", "Vagabond Pilsner", "Vic Secret", "Waimea - IPA Is Dead", "Whisky Sour - B-Sides", "Zephyr"), class = "factor"), ABV = c(4.5, 4.1, 4.2, 6.3, 7.2, NA, 4.7, 7.5, 7.3, 5.3, 4.5, 4.5, 6.1, 11.2, 6, 8.2, 12.5, 8, 4.7, 3.5, 15, 6.7, 7.8, 6.7, 0.5, 7.5, 5.8, 3.6, 10.5, 12.5, 7.2, 8.2, 10.7, 9.2, 7.1, 5, 16.5, 12.8, 6.7, 10, NA, 10, 4.5, 7.4, 7.2, 9.5, 9.2, 9, 7.2, 7.5, NA, 10.43, 7.1, 8, 5, 5.4, 4.1, 10.2, 4, 7, 12.7, 6.5, 7.5, 4.2, 11.8, 7.6, 15, 4.4, 6.3, 7.2, NA, 4.5, 4.5, 7.5, 10, 3.8, 6.4, NA, 4, 15.2, 5.4, 8.3, 6.5, 8, 12, 8.2, 5.6, 7.2, 6.3, 10, 5.6, 4.5, 8.2, 8.4, 6, 6.7, 6.5, 11.5, 8.5, 5.2, 7.1, 4.7, 6.7, 9, 6.5, 6.7, 5, 5.8, 7.5, 4.5, 9, 41, 15, 8.5, 7.2, 9, 3.8, 5.7, 6.3, 7.5, 4.4, 18, 10.5, 11.3, NA, 5.2, 4.5, 9.5, 7.2, 2.7, 6.4, 17.2, 8.5, 4.9, 4.7, 7.2, 10, 4.5, 7.2, 7.2, 6.7, 7.2, 4.4, 9, 7.5, 16.1, 6.7, 2.5, 7.4, 2.8, 4.2, 5.8, 5.2, 10, 12.8, 8.3, 6.5, 6, 3, 7.6, 5.5, 8.8, 5.2, 5.2, 8, 6.7, 15, 11.5, 7.1, NA, 7.5, 7.2, 5.2, 6.8, 5.5, 5.2, 6.7, 5, 9, 9.2, 13.8, 4.5, 3.2, 16.1, 4.7, 14.2, 13, 7.2, 9.2, 4.9, 7.2, 7.2, 4.5, 4.5, 4.5, 7.6), IBU = c(60, 41.5, 8, 55, 59, 38, 40, 75, 30, 60, 50, 42, 45, 150, 70, 70, 100, 60, 45, 33, 90, 67, 70, 70, 55, 75, 35, 8, 85, 125, 70, 70, 100, 125, 65, 47, 20.5, 50, 70, 35, 20, 55, 35, 65, 70, 85, 149, 65, 100, 30, 30, 65, 68, 35, 50, 35, 65, 50, 35, 20, 85, 35, 50, 50, 80, 70, 80, 35, 85, 70, 9, 35, 30, 70, 85, 35, 40, 45, 40, 20, 20, 70, 60, 45, 85, 42, 40, 70, 55, 85, 30, 55, 42, 50, 50, 40, 35, 80, 65, 45, 90, 45, 67, 85, 20, 67, 30, 40, 90, 38, 50, 1085, 90, 85, 100, 80, 20, 35, 130, 75, 35, 70, 14, 50, 25, 65, 25, 80, 70, 36, 50, 75, 100, 30, 37, 100, 80, 55, 50, 250, 67, 100, 70, 70, 80, 85, 70, 35, 70, 30, 25, 40, 50, 55, 70, 70, 55, 60, 8, 175, 35, 40, 45, 55, 85, 70, 90, 50, 80, 45, 0, 130, 55, 30, 60, 40, 70, 50, 85, 65, 60, 40, 8, 100, 25, 20, 100, 250, 50, 18, 
250, 250, 40, 40, 40, 70), OG = c(1044, 1041.7, 1040, 1060, 1069, 1045, 1046, 1068, 1079, 1052, 1047, 1046, 1067, 1098, 1058, 1076, 1093, 1082, 1047, 1038, 1120, 1013, 1074, 1066, 1007, 1068, 1049, 1040, 1102, 1087, 1067, 1076, 1105, 1085, 1065, 1048.5, 1112, 1096, 1066, 1080, 1048, 1090, 1048, 1069, 1067, 1095, 1083, 1080, 1064, 1080, 1043, 1095, 1056, 1077, 1049, 1050, 1042, 1026, 1041, 1081, 1113.5, 1050, 1070, 1042, 1096, 1073, 1113, 1040, 1063, 1067, 1032, 1048, 1045, 1068, 1098, 1040, 1057, 1081, 1039, 1110, 1055, 1076, 1060, 1075, 1130, 1078, 1055, 1067, 1060, 1098, 1058, 1046, 1078, 1080, 1050, 1063, 1068, 1096, 1078, 1049, 1067, 1055, 1013, 1094, 1060, 1013, 1050, 1053, 1072, 1042.9, 1084, 1085, 1120, 1072, 1064, 1083, 1039, 1053, 1060, 1068, 1045, 1150, 1093, 1098, 1052, 1048, 1043, 1075, 1067, 1033, 1061, 1156, 1068, 1047, 1043, 1064, 1097, 1045, 1068, 1065, 1064, 1064, 1045, 1090, 1069, 1125, 1063, 1027, 1069, 1032.5, 1044, 1060, 1050, 1128, 1108, 1076, 1059, 1056, 1007, 1072, 1053, 1084, 1048, 1053, 1074, 1066, 1120, 1104, 1067, 1089, 1069, 1065, 1052, 1068, 1062, 1048, 1066, 1053, 1094, 1069, 1088, 1045, 1007, 1015, 1008, 1025, 1015, 1065, 1016, 1010, 1065, 1065, 1045, 1045, 1045, 1067), EBC = c(20, 15, 8, 30, 10, 15, 12, 22, 120, 200, 140, 62, 219, 70, 25, NA, 36, 12, 8, 50, 100, 19, 90, 30, 30, 30, 44, NA, 64, 40, 30, 16, 300, 40, 13, 65, 20, 111, 30, 80, 14, 300, 40, 60, 30, 250, 19.5, 97, 12, 46, 15, 23, 14, 15, 110, 11.5, 17, 197, 45, 12, 250, 23, 40, 30, 115, 59, 400, 12, 24, 30, 2, 44, 25, 30, 130, 25, 10, 15, 18, 158, 30, 30, 25, 240, 24, 90, 15, 30, 30, 30, 54, 25, 70, 200, 8, 15, 250, 115, 31.2, 45, 15, 200, 19, 400, NA, 19, 60, 177.3, 200, 18, 20, 40, 100, 15, 12, 180, 6, 25, 14, 30, 30, 57, NA, 164, 10, 16, 10, 195, 30, 57, 20, 128, 15, 12, 10, 12, 65, 20, 150, 15, 19, 12, 30, 190, 50, 400, 30, 10, 30, 42, 19, 35, 17, 300, 79, 30, 50, 17, 9, 40, 25, 190, 35, 165, 35, 30, 100, 38, 71, 15, 50, 14, 200, 86, 230, 13, 30, 200, 400, 60, 25, 18, 
8, 500, 25, 67, 300, 15, 78.8, 13, 17, 104, 18, 18, 18, 20), PH = c(4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 5.2, 4.4, 4.4, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 3.2, 5.2, 4.4, 4.4, 4.4, 5.2, 4.4, 4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.3, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.2, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.5, 4.4, 5.2, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.3, 4.2, 4.4, 4.2, 3.2, 4.4, 4.2, 4, 4.4, 4.4, 4.2, 4.2, 4.4, 4.4, 4.2, 4.2, 4.2, 4.4), AttenuationLevel = c(75, 76, 83, 80, 67, 88.9, 78, 80.9, 74.7, 77, 74.5, 72.8, 70.1, 87, 79.3, 83, 68, 86, 79, 68.4, 98, 79.7, 79.7, 77.3, 28.6, 82.1, 90, 83, 102, 81.2, 82.1, 83, 76.2, 81.2, 85, 79.4, 100, 79.17, 77.3, 85, 89.6, 84.4, 72.9, 82.6, 82.1, 76.8, 83, 76, 84, 70, 81.4, 83.2, 82.1, 79.2, 79, 84, 76.2, 74.5, 75.6, 74, 76.8, 76, 81.4, 76.2, 79.2, 79.5, 84.1, 79.5, 82.6, 82.1, 88, 72.9, 75.6, 82.1, 79.6, 70, 87, 93.8, 76.9, 82, 74.6, 82.9, 83.3, 81.3, 102.3, 83.3, 78, 82.1, 80, 70, 74, 73.9, 83.3, 81.3, 87, 84, 70.6, 79.2, 84.6, 81.6, 80.6, 70, 79.7, 73.4, 87, 79.7, 76, 84.9, 79.2, 81, 82.1, 81.2, 98, 90.3, 84, 83.1, 87, 79.3, 83, 82.1, 73.3, 93.3, 80, 79.6, 87, 79, 79.1, 81.3, 82.1, 70.8, 80.3, 80.8, 95.6, 80.7, 83.7, 84, 79.4, 73.9, 78.6, 84.6, 79.7, 84, 82.9, 80, 82.6, 84, 81, 70.4, 82.6, 63.1, 72.7, 76.7, 80, 89, 81.5, 82.9, 81.4, 82.14, 82.5, 80.6, 79.3, 79.8, 77.1, 75.5, 82.4, 77.3, 98, 85, 79, 94.4, 81.1, 87, 73.1, 76.5, 
67.7, 79.2, 77.3, 73.6, 73.4, 82.6, 83, 75.6, 78, 84, 75.6, 75.6, 84.4, 84.6, 81, 78.7, 84.6, 84.6, 75.6, 75.6, 75.6, 82), FermentationTempCelsius = c(19L, 18L, 21L, 9L, 10L, 22L, 10L, 19L, 19L, 19L, 19L, 22L, 18L, 17L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 18L, 19L, 19L, 19L, 19L, 21L, 21L, 21L, 19L, 21L, 21L, 21L, 9L, 19L, 20L, 21L, 19L, 19L, 22L, 21L, 19L, 18L, 19L, 18L, 19L, 19L, 19L, 12L, 23L, 21L, 10L, 9L, 19L, 19L, 19L, 21L, 19L, 19L, 18L, 18L, 21L, 19L, 20L, 20L, 21L, 10L, 19L, 19L, 21L, 19L, 19L, 19L, 21L, 19L, 20L, 23L, 19L, 21L, 19L, 21L, 19L, 20L, 21L, 21L, 19L, 19L, 19L, 21L, 19L, 9L, 22L, 14L, 20L, 19L, 19L, 20L, 18L, 14L, 19L, 19L, 19L, 21L, 20L, 19L, 19L, 19L, 21L, 10L, 21L, 21L, 19L, 18L, 19L, 21L, 20L, 17L, 20L, 19L, 19L, 22L, 19L, 20L, 20L, 19L, 15L, 19L, 19L, 19L, 19L, 21L, 21L, 10L, 12L, 19L, 21L, 19L, 19L, 21L, 19L, 19L, 20L, 21L, 22L, 21L, 99L, 19L, 19L, 22L, 16L, 19L, 19L, 21L, 18L, 21L, 19L, 19L, 19L, 21L, 17L, 21L, 19L, 19L, 19L, 19L, 19L, 21L, 19L, 23L, 19L, 20L, 19L, 19L, 19L, 19L, 19L, 19L, 21L, 18L, 21L, 19L, 21L, 21L, 12L, 21L, 21L, 21L, 21L, 12L, 21L, 21L, 19L, 19L, 19L, 21L), Yeast = structure(c(1L, 1L, 1L, 3L, 3L, 4L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 1L, 2L, 4L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 4L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 3L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 1L, 1L, 4L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 4L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 1L, 1L, 1L, 2L), .Label = c("Wyeast 1056 - American Ale", "Wyeast 1272 - American 
Ale II", "Wyeast 2007 - Pilsen Lager", "Wyeast 3711 - French Saison"), class = "factor")), class = "data.frame", row.names = c(NA, -196L))
df
To solve your specific issue, you can generate dummy variables to run your desired clustering.
One way to do it is using the dummy_columns() function from the fastDummies package.
library(fastDummies)
library(NbClust)
# Replace the Yeast factor with one 0/1 indicator column per level
df_dummy <- dummy_columns(df, select_columns = "Yeast", remove_selected_columns = TRUE)
res <- NbClust(df_dummy[2:9], min.nc = 2, max.nc = 15, method = "ward.D2")
As noted in the comments, questions about best practices for conducting a cluster analysis are a better fit for CrossValidated.
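The dummy-variable idea can also be sketched in base R, without fastDummies. The toy data frame and every name in it (toy, dummies, toy_num, groups) are assumptions made for illustration, and hclust() stands in for NbClust so that the example has no package dependencies:

```r
# Toy data standing in for the beer data: two numeric columns and one factor.
# All names and values here are illustrative only.
toy <- data.frame(
  abv   = c(4.5, 7.2, 6.3, 9.0, 5.2, 4.1),
  ibu   = c(60, 59, 55, 85, 45, 41.5),
  yeast = factor(c("A", "B", "C", "A", "B", "C"))
)

# "~ yeast - 1" drops the intercept, so every factor level gets its own 0/1 column.
dummies <- model.matrix(~ yeast - 1, data = toy)

# Bind the indicators to the numeric columns; the original factor is left out.
toy_num <- cbind(toy[c("abv", "ibu")], dummies)

# Standardise so that no single column dominates the Euclidean distances.
toy_scaled <- scale(toy_num)

# Ward clustering on the all-numeric matrix, cut into two groups.
hc <- hclust(dist(toy_scaled), method = "ward.D2")
groups <- cutree(hc, k = 2)
```

Whichever encoder is used, it is worth printing names() of the encoded data frame before subsetting by position, to confirm that the selected range really covers the indicator columns you intend to include.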
I ran multiple imputation to impute missing data for 2 variables of a data frame, then I got a new data frame (with 2 columns for 2 imputed variables).
Now, I want to replace the 2 columns in the original data frame with the two newly imputed columns from my new dataframe.
What should I do?
Original data frame
new data frame for imputed variables
This is the code I used. Only 2 columns in this data frame have missing data, so I only imputed those two. Is that OK? Can you suggest a better way?
library("mice")
imi<-mice(subset(data,select=c('ABV','EBC')),m=5,maxit=10)
Data
data <- structure(list(Name = structure(c(58L, 188L, 40L, 155L, 32L,
88L, 92L, 55L, 135L, 31L, 139L, 26L, 126L, 10L, 166L, 104L, 75L,
180L, 35L, 175L, 77L, 99L, 4L, 71L, 141L, 176L, 53L, 39L, 172L,
196L, 123L, 107L, 16L, 96L, 82L, 185L, 30L, 15L, 94L, 129L, 187L,
151L, 33L, 23L, 28L, 44L, 157L, 69L, 132L, 83L, 131L, 11L, 182L,
181L, 54L, 115L, 116L, 183L, 150L, 195L, 45L, 144L, 1L, 110L,
17L, 114L, 9L, 117L, 112L, 70L, 34L, 169L, 27L, 66L, 3L, 73L,
133L, 91L, 154L, 130L, 160L, 105L, 90L, 165L, 67L, 100L, 162L,
98L, 29L, 68L, 189L, 192L, 102L, 190L, 134L, 136L, 52L, 12L,
81L, 59L, 63L, 122L, 93L, 109L, 178L, 138L, 5L, 43L, 140L, 95L,
2L, 174L, 76L, 51L, 156L, 60L, 149L, 128L, 177L, 142L, 103L,
7L, 8L, 14L, 164L, 74L, 145L, 148L, 113L, 86L, 108L, 48L, 163L,
6L, 186L, 89L, 36L, 191L, 125L, 120L, 62L, 65L, 124L, 168L, 147L,
79L, 173L, 84L, 193L, 25L, 146L, 121L, 127L, 153L, 13L, 106L,
119L, 161L, 49L, 97L, 101L, 61L, 137L, 24L, 85L, 194L, 78L, 41L,
170L, 47L, 118L, 184L, 179L, 72L, 42L, 111L, 87L, 57L, 38L, 37L,
171L, 22L, 50L, 80L, 159L, 18L, 152L, 64L, 56L, 158L, 167L, 46L,
19L, 21L, 20L, 143L), .Label = c("#Mashtag 2013", "#Mashtag 2014",
"#Mashtag 2015", "10 Heads High", "5am Saint", "77 Lager", "AB:02",
"AB:03", "AB:04", "AB:06", "AB:08", "AB:10", "AB:11", "AB:13",
"AB:15", "AB:17", "AB:18", "AB:20", "Ace Of Chinook", "Ace Of Citra",
"Ace Of Equinox", "Ace Of Simcoe", "Albino Squid Assasin", "Alice Porter",
"All Day Long - Prototype Challenge", "Alpha Dog", "Alpha Pop",
"Amarillo - IPA Is Dead", "American Ale", "Anarchist Alchemist",
"Arcade Nation", "Avery Brown Dredge", "Baby Dogma", "Baby Saison - B-Sides",
"Bad Pixie", "Barley Wine - Russian Doll", "Barrel Aged Albino Squid Assassin",
"Barrel Aged Hinterland", "Berliner Weisse With Raspberries And Rhubarb - B-Sides",
"Berliner Weisse With Yuzu - B-Sides", "Bitch Please (w/ 3 Floyds)",
"Black Dog", "Black Eye Joe (w/ Stone Brewing Co)", "Black Eyed King Imp",
"Black Eyed King Imp - Vietnamese Coffee Edition", "Black Hammer",
"Black Jacques", "Black Tokyo Horizon (w/Nøgne Ã\230 & Mikkeller)",
"Blitz Berliner Weisse", "Blitz Series", "Born To Die", "Bounty Hunter - Shareholder Brew",
"Bourbon Baby", "Bracken's Porter", "Bramling X", "Brewdog Vs Beavertown",
"Brixton Porter", "Buzz", "Candy Kaiser", "Cap Dog (w/ Cap Brewery)",
"Catherine's Pony (w/ Beavertown)", "Challenger", "Chaos Theory",
"Chili Hammer", "Chinook - IPA Is Dead", "Citra", "Clown King",
"Cocoa Psycho", "Coffee Imperial Stout", "Comet", "Dana - IPA Is Dead",
"Dead Metaphor", "Dead Pony Club", "Deaf Mermaid - B-Sides",
"Devine Rebel (w/ Mikkeller)", "Dog A", "Dog B", "Dog C", "Dog D",
"Dog E", "Dog Fight (w/ Flying Dog)", "Dog Wired (w/8 Wired)",
"Dogma", "Doodlebug", "Double IPA - Russian Doll", "Edge", "El Dorado - IPA Is Dead",
"Electric India", "Ella - IPA Is Dead", "Elvis Juice V2.0 - Prototype Challenge",
"Everday Anarchy", "Fake Lager", "Galaxy", "Goldings - IPA Is Dead",
"Growler", "Hardcore IPA", "Hardkogt IPA", "HBC 366 - IPA Is Dead",
"HBC 369", "Hello My Name Is Beastie", "Hello My Name Is Holy Moose",
"Hello My Name Is Ingrid", "Hello My Name Is Little Ingrid",
"Hello My Name Is Mette-Marit", "Hello My Name Is PaÌ\210ivi",
"Hello My Name is Sonja (w/ Evil Twin)", "Hello My Name is Vladimir",
"Hello My Name Is ZeÌ\201 (w/ 2Cabeças)", "Hinterland", "Hobo Pop",
"Hop Fiction - Prototype Challenge", "Hopped-Up Brown Ale - Prototype Challenge",
"Hoppy Christmas", "Hops Kill Nazis", "Hunter Foundation Pale Ale",
"Hype", "India Session Lager - Prototype Challenge", "International Arms Race (w/ Flying Dog)",
"Interstellar", "Jack Hammer", "Jasmine IPA", "Jet Black Heart",
"Kohatu - IPA Is Dead", "Konnichiwa Kitsune", "Libertine Black Ale",
"Libertine Porter", "Lichtenstein Pale Ale", "Lizard Bride - Prototype Challenge",
"Lost Dog (w/Lost Abbey)", "Lumberjack Stout", "Magic Stone Dog (w/Magic Rock & Stone Brewing Co.)",
"Mandarina Bavaria - IPA Is Dead", "Mango Gose - B-Sides", "Melon And Cucumber IPA - B-Sides",
"Misspent Youth", "Morag's Mojito - B-Sides", "Moshi Moshi 15",
"Motueka", "Movember", "Mr.Miyagi's Wasabi Stout", "Nanny State",
"Nelson Sauvin", "Neon Overlord", "Never Mind The Anabolics",
"No Label", "Nuns With Guns", "Old World India Pale Ale", "Old World Russian Imperial Stout",
"Orange Blossom - B-Sides", "Pale - Russian Doll", "Paradox Islay",
"Paradox Islay 2.0", "Paradox Jura", "Peroxide Punk", "Pilsen Lager",
"Pioneer - IPA Is Dead", "Prototype 27", "Prototype Helles",
"Prototype Pils 2.0", "Pumpkin King", "Punk IPA 2007 - 2010",
"Punk IPA 2010 - Current", "Restorative Beverage For Invalids And Convalescents",
"Rhubarb Saison - B-Sides", "Riptide", "Russian Doll â\200“ India Pale Ale",
"Rye Hammer", "San Diego Scotch Ale (w/Ballast Point)", "Santa Paws",
"Shareholder Black IPA 2011", "Ship Wreck", "Shipwrecker Circus (w/ Oskar Blues)",
"Simcoe", "Sink The Bismarck!", "Skull Candy", "Sorachi Ace",
"Sorachi Bitter - B-Sides", "Spiced Cherry Sour - B-Sides", "Stereo Wolf Stout - Prototype Challenge",
"Storm", "Sub Hop", "Sunk Punk", "Sunmaid Stout", "Sunshine On Rye - B-Sides",
"The Physics", "This. Is. Lager", "TM10", "Trashy Blonde", "Truffle and Chocolate Stout - B-Sides",
"U-Boat (w/ Victory Brewing)", "Vagabond Pale ALe - Prototype Challenge",
"Vagabond Pilsner", "Vic Secret", "Waimea - IPA Is Dead", "Whisky Sour - B-Sides",
"Zephyr"), class = "factor"), ABV = c(4.5, 4.1, 4.2, 6.3, 7.2,
NA, 4.7, 7.5, 7.3, 5.3, 4.5, 4.5, 6.1, 11.2, 6, 8.2, 12.5, 8,
4.7, 3.5, 15, 6.7, 7.8, 6.7, 0.5, 7.5, 5.8, 3.6, 10.5, 12.5,
7.2, 8.2, 10.7, 9.2, 7.1, 5, 16.5, 12.8, 6.7, 10, NA, 10, 4.5,
7.4, 7.2, 9.5, 9.2, 9, 7.2, 7.5, NA, 10.43, 7.1, 8, 5, 5.4, 4.1,
10.2, 4, 7, 12.7, 6.5, 7.5, 4.2, 11.8, 7.6, 15, 4.4, 6.3, 7.2,
NA, 4.5, 4.5, 7.5, 10, 3.8, 6.4, NA, 4, 15.2, 5.4, 8.3, 6.5,
8, 12, 8.2, 5.6, 7.2, 6.3, 10, 5.6, 4.5, 8.2, 8.4, 6, 6.7, 6.5,
11.5, 8.5, 5.2, 7.1, 4.7, 6.7, 9, 6.5, 6.7, 5, 5.8, 7.5, 4.5,
9, 41, 15, 8.5, 7.2, 9, 3.8, 5.7, 6.3, 7.5, 4.4, 18, 10.5, 11.3,
NA, 5.2, 4.5, 9.5, 7.2, 2.7, 6.4, 17.2, 8.5, 4.9, 4.7, 7.2, 10,
4.5, 7.2, 7.2, 6.7, 7.2, 4.4, 9, 7.5, 16.1, 6.7, 2.5, 7.4, 2.8,
4.2, 5.8, 5.2, 10, 12.8, 8.3, 6.5, 6, 3, 7.6, 5.5, 8.8, 5.2,
5.2, 8, 6.7, 15, 11.5, 7.1, NA, 7.5, 7.2, 5.2, 6.8, 5.5, 5.2,
6.7, 5, 9, 9.2, 13.8, 4.5, 3.2, 16.1, 4.7, 14.2, 13, 7.2, 9.2,
4.9, 7.2, 7.2, 4.5, 4.5, 4.5, 7.6), IBU = c(60, 41.5, 8, 55,
59, 38, 40, 75, 30, 60, 50, 42, 45, 150, 70, 70, 100, 60, 45,
33, 90, 67, 70, 70, 55, 75, 35, 8, 85, 125, 70, 70, 100, 125,
65, 47, 20.5, 50, 70, 35, 20, 55, 35, 65, 70, 85, 149, 65, 100,
30, 30, 65, 68, 35, 50, 35, 65, 50, 35, 20, 85, 35, 50, 50, 80,
70, 80, 35, 85, 70, 9, 35, 30, 70, 85, 35, 40, 45, 40, 20, 20,
70, 60, 45, 85, 42, 40, 70, 55, 85, 30, 55, 42, 50, 50, 40, 35,
80, 65, 45, 90, 45, 67, 85, 20, 67, 30, 40, 90, 38, 50, 1085,
90, 85, 100, 80, 20, 35, 130, 75, 35, 70, 14, 50, 25, 65, 25,
80, 70, 36, 50, 75, 100, 30, 37, 100, 80, 55, 50, 250, 67, 100,
70, 70, 80, 85, 70, 35, 70, 30, 25, 40, 50, 55, 70, 70, 55, 60,
8, 175, 35, 40, 45, 55, 85, 70, 90, 50, 80, 45, 0, 130, 55, 30,
60, 40, 70, 50, 85, 65, 60, 40, 8, 100, 25, 20, 100, 250, 50,
18, 250, 250, 40, 40, 40, 70), OG = c(1044, 1041.7, 1040, 1060,
1069, 1045, 1046, 1068, 1079, 1052, 1047, 1046, 1067, 1098, 1058,
1076, 1093, 1082, 1047, 1038, 1120, 1013, 1074, 1066, 1007, 1068,
1049, 1040, 1102, 1087, 1067, 1076, 1105, 1085, 1065, 1048.5,
1112, 1096, 1066, 1080, 1048, 1090, 1048, 1069, 1067, 1095, 1083,
1080, 1064, 1080, 1043, 1095, 1056, 1077, 1049, 1050, 1042, 1026,
1041, 1081, 1113.5, 1050, 1070, 1042, 1096, 1073, 1113, 1040,
1063, 1067, 1032, 1048, 1045, 1068, 1098, 1040, 1057, 1081, 1039,
1110, 1055, 1076, 1060, 1075, 1130, 1078, 1055, 1067, 1060, 1098,
1058, 1046, 1078, 1080, 1050, 1063, 1068, 1096, 1078, 1049, 1067,
1055, 1013, 1094, 1060, 1013, 1050, 1053, 1072, 1042.9, 1084,
1085, 1120, 1072, 1064, 1083, 1039, 1053, 1060, 1068, 1045, 1150,
1093, 1098, 1052, 1048, 1043, 1075, 1067, 1033, 1061, 1156, 1068,
1047, 1043, 1064, 1097, 1045, 1068, 1065, 1064, 1064, 1045, 1090,
1069, 1125, 1063, 1027, 1069, 1032.5, 1044, 1060, 1050, 1128,
1108, 1076, 1059, 1056, 1007, 1072, 1053, 1084, 1048, 1053, 1074,
1066, 1120, 1104, 1067, 1089, 1069, 1065, 1052, 1068, 1062, 1048,
1066, 1053, 1094, 1069, 1088, 1045, 1007, 1015, 1008, 1025, 1015,
1065, 1016, 1010, 1065, 1065, 1045, 1045, 1045, 1067), EBC = c(20,
15, 8, 30, 10, 15, 12, 22, 120, 200, 140, 62, 219, 70, 25, NA,
36, 12, 8, 50, 100, 19, 90, 30, 30, 30, 44, NA, 64, 40, 30, 16,
300, 40, 13, 65, 20, 111, 30, 80, 14, 300, 40, 60, 30, 250, 19.5,
97, 12, 46, 15, 23, 14, 15, 110, 11.5, 17, 197, 45, 12, 250,
23, 40, 30, 115, 59, 400, 12, 24, 30, 2, 44, 25, 30, 130, 25,
10, 15, 18, 158, 30, 30, 25, 240, 24, 90, 15, 30, 30, 30, 54,
25, 70, 200, 8, 15, 250, 115, 31.2, 45, 15, 200, 19, 400, NA,
19, 60, 177.3, 200, 18, 20, 40, 100, 15, 12, 180, 6, 25, 14,
30, 30, 57, NA, 164, 10, 16, 10, 195, 30, 57, 20, 128, 15, 12,
10, 12, 65, 20, 150, 15, 19, 12, 30, 190, 50, 400, 30, 10, 30,
42, 19, 35, 17, 300, 79, 30, 50, 17, 9, 40, 25, 190, 35, 165,
35, 30, 100, 38, 71, 15, 50, 14, 200, 86, 230, 13, 30, 200, 400,
60, 25, 18, 8, 500, 25, 67, 300, 15, 78.8, 13, 17, 104, 18, 18,
18, 20), PH = c(4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4,
4.2, 5.2, 4.4, 4.4, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4,
4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.4,
4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4,
4.4, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 3.2, 5.2,
4.4, 4.4, 4.4, 5.2, 4.4, 4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4,
4.4, 4.4, 3.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4,
4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.2, 4.4, 4.4, 4.2,
4.4, 4.4, 4.4, 4.3, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4,
4.4, 4.4, 5.2, 5.2, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2,
4.2, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4,
4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4,
4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.5, 4.4, 5.2, 5.2,
4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4,
4.3, 4.2, 4.4, 4.2, 3.2, 4.4, 4.2, 4, 4.4, 4.4, 4.2, 4.2, 4.4,
4.4, 4.2, 4.2, 4.2, 4.4), AttenuationLevel = c(75, 76, 83, 80,
67, 88.9, 78, 80.9, 74.7, 77, 74.5, 72.8, 70.1, 87, 79.3, 83,
68, 86, 79, 68.4, 98, 79.7, 79.7, 77.3, 28.6, 82.1, 90, 83, 102,
81.2, 82.1, 83, 76.2, 81.2, 85, 79.4, 100, 79.17, 77.3, 85, 89.6,
84.4, 72.9, 82.6, 82.1, 76.8, 83, 76, 84, 70, 81.4, 83.2, 82.1,
79.2, 79, 84, 76.2, 74.5, 75.6, 74, 76.8, 76, 81.4, 76.2, 79.2,
79.5, 84.1, 79.5, 82.6, 82.1, 88, 72.9, 75.6, 82.1, 79.6, 70,
87, 93.8, 76.9, 82, 74.6, 82.9, 83.3, 81.3, 102.3, 83.3, 78,
82.1, 80, 70, 74, 73.9, 83.3, 81.3, 87, 84, 70.6, 79.2, 84.6,
81.6, 80.6, 70, 79.7, 73.4, 87, 79.7, 76, 84.9, 79.2, 81, 82.1,
81.2, 98, 90.3, 84, 83.1, 87, 79.3, 83, 82.1, 73.3, 93.3, 80,
79.6, 87, 79, 79.1, 81.3, 82.1, 70.8, 80.3, 80.8, 95.6, 80.7,
83.7, 84, 79.4, 73.9, 78.6, 84.6, 79.7, 84, 82.9, 80, 82.6, 84,
81, 70.4, 82.6, 63.1, 72.7, 76.7, 80, 89, 81.5, 82.9, 81.4, 82.14,
82.5, 80.6, 79.3, 79.8, 77.1, 75.5, 82.4, 77.3, 98, 85, 79, 94.4,
81.1, 87, 73.1, 76.5, 67.7, 79.2, 77.3, 73.6, 73.4, 82.6, 83,
75.6, 78, 84, 75.6, 75.6, 84.4, 84.6, 81, 78.7, 84.6, 84.6, 75.6,
75.6, 75.6, 82), FermentationTempCelsius = c(19L, 18L, 21L, 9L,
10L, 22L, 10L, 19L, 19L, 19L, 19L, 22L, 18L, 17L, 19L, 19L, 19L,
19L, 19L, 19L, 19L, 19L, 18L, 19L, 19L, 19L, 19L, 21L, 21L, 21L,
19L, 21L, 21L, 21L, 9L, 19L, 20L, 21L, 19L, 19L, 22L, 21L, 19L,
18L, 19L, 18L, 19L, 19L, 19L, 12L, 23L, 21L, 10L, 9L, 19L, 19L,
19L, 21L, 19L, 19L, 18L, 18L, 21L, 19L, 20L, 20L, 21L, 10L, 19L,
19L, 21L, 19L, 19L, 19L, 21L, 19L, 20L, 23L, 19L, 21L, 19L, 21L,
19L, 20L, 21L, 21L, 19L, 19L, 19L, 21L, 19L, 9L, 22L, 14L, 20L,
19L, 19L, 20L, 18L, 14L, 19L, 19L, 19L, 21L, 20L, 19L, 19L, 19L,
21L, 10L, 21L, 21L, 19L, 18L, 19L, 21L, 20L, 17L, 20L, 19L, 19L,
22L, 19L, 20L, 20L, 19L, 15L, 19L, 19L, 19L, 19L, 21L, 21L, 10L,
12L, 19L, 21L, 19L, 19L, 21L, 19L, 19L, 20L, 21L, 22L, 21L, 99L,
19L, 19L, 22L, 16L, 19L, 19L, 21L, 18L, 21L, 19L, 19L, 19L, 21L,
17L, 21L, 19L, 19L, 19L, 19L, 19L, 21L, 19L, 23L, 19L, 20L, 19L,
19L, 19L, 19L, 19L, 19L, 21L, 18L, 21L, 19L, 21L, 21L, 12L, 21L,
21L, 21L, 21L, 12L, 21L, 21L, 19L, 19L, 19L, 21L), Yeast = structure(c(1L,
1L, 1L, 3L, 3L, 4L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
2L, 3L, 1L, 2L, 2L, 1L, 2L, 4L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L,
3L, 4L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L,
2L, 2L, 3L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 1L, 1L, 4L, 1L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 1L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 3L, 2L, 2L, 2L,
2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 4L, 1L, 1L, 2L, 1L,
1L, 1L, 2L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 1L,
1L, 1L, 2L), .Label = c("Wyeast 1056 - American Ale", "Wyeast 1272 - American Ale II",
"Wyeast 2007 - Pilsen Lager", "Wyeast 3711 - French Saison"), class = "factor")), class = "data.frame", row.names = c(NA,
-196L))
Updated
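As @dcarlson recommended, you can run mice() on the entire data frame and then call complete() to get the full data frame with the missing values filled in (by default, complete() returns the first of the m imputed datasets). You can then join the imputed columns back onto your original data frame.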
library(tidyverse)
library(mice)
imi <- mice(data, m=5, maxit=10)
imi_complete <- complete(imi)
res <- data %>%
  dplyr::left_join(., imi_complete %>% dplyr::select(Name, ABV, EBC), by = "Name") %>%
  dplyr::select(-c(ABV.x, EBC.x)) %>%
  dplyr::rename("ABV" = ABV.y, "EBC" = EBC.y)
Output
head(res)
Name IBU OG PH AttenuationLevel FermentationTempCelsius Yeast ABV EBC
1 Buzz 60.0 1044.0 4.4 75.0 19 Wyeast 1056 - American Ale 4.5 20
2 Trashy Blonde 41.5 1041.7 4.4 76.0 18 Wyeast 1056 - American Ale 4.1 15
3 Berliner Weisse With Yuzu - B-Sides 8.0 1040.0 3.2 83.0 21 Wyeast 1056 - American Ale 4.2 8
4 Pilsen Lager 55.0 1060.0 4.4 80.0 9 Wyeast 2007 - Pilsen Lager 6.3 30
5 Avery Brown Dredge 59.0 1069.0 4.4 67.0 10 Wyeast 2007 - Pilsen Lager 7.2 10
6 Electric India 38.0 1045.0 4.4 88.9 22 Wyeast 3711 - French Saison 7.5 15
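The join above only lines up correctly if every Name is unique. As a minimal sketch of the same keyed replacement in base R, the following uses match() on a toy example; the data frames orig and imputed and all of their values are hypothetical stand-ins, not output from mice:

```r
# Toy stand-ins; 'orig', 'imputed' and their values are hypothetical.
orig <- data.frame(
  Name = c("a", "b", "c", "d"),
  ABV  = c(4.5, NA, 6.3, NA),
  EBC  = c(20, 15, NA, 30),
  IBU  = c(60, 41, 8, 55)
)

# Pretend this came back from an imputation step, in a different row order.
imputed <- data.frame(
  Name = c("d", "c", "b", "a"),
  ABV  = c(7.0, 6.3, 5.1, 4.5),
  EBC  = c(30, 22, 15, 20)
)

# A keyed lookup is only unambiguous when the key has no duplicates.
stopifnot(!anyDuplicated(orig$Name), !anyDuplicated(imputed$Name))

# For each row of 'orig', find the row of 'imputed' with the same Name.
idx <- match(orig$Name, imputed$Name)

# Overwrite only the two imputed columns; every other column is untouched.
res <- orig
res[c("ABV", "EBC")] <- imputed[idx, c("ABV", "EBC")]
```

Because the lookup goes through Name, the result does not depend on the two data frames sharing the same row order.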
Old
Since there's no id column in the new data frame, you can just use mutate() to replace the columns in the original data frame with the columns from the new one. This relies on both data frames having the same rows in the same order. It would be better practice to impute directly into the original data frame (as suggested by @dcarlson and @r2evans), so that you can be sure the data end up on the correct rows.
library(tidyverse)
df_orig %>%
  dplyr::mutate(ABV = df_new$ABV, EBC = df_new$EBC)
Output
id ABV EBC third
1 1 -61 -58 37.94029
2 2 -80 -67 47.81479
3 3 -62 -66 48.85903
4 4 -69 -78 23.18026
5 5 -51 -77 29.91952
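Because positional replacement cannot see whether the rows actually correspond, a couple of cheap checks can catch obvious mismatches first. The sketch below is base R on toy data; orig_toy, new_toy and their values are assumptions for illustration, and the second check assumes the new columns come from an imputation that leaves observed values unchanged:

```r
# Toy stand-ins for the original and the imputed data frames.
orig_toy <- data.frame(
  id    = 1:3,
  ABV   = c(38.9, NA, 37.6),
  EBC   = c(NA, 32.9, 33.8),
  third = c(1, 2, 3)
)
new_toy <- data.frame(
  ABV = c(38.9, 20.1, 37.6),
  EBC = c(42.3, 32.9, 33.8)
)

# Check 1: the row counts must agree.
stopifnot(nrow(orig_toy) == nrow(new_toy))

# Columns present in both frames are the ones to replace.
cols <- intersect(names(orig_toy), names(new_toy))

# Check 2: values that were observed in the original should be unchanged
# in the new columns; a mismatch suggests the rows are not aligned.
observed <- !is.na(orig_toy[cols])
stopifnot(all(as.matrix(orig_toy[cols])[observed] ==
              as.matrix(new_toy[cols])[observed]))

# Replace the shared columns by position.
res_toy <- orig_toy
res_toy[cols] <- new_toy[cols]
```

Neither check proves that the rows match, but a failure in either one is a clear sign that positional replacement would put values on the wrong rows.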
Data
df_orig <-
structure(
list(
id = c(1, 2, 3, 4, 5),
ABV = c(
38.9932923251763,
20.0923723727465,
37.640398349613,
31.4673039061017,
49.192731983494
),
EBC = c(
42.341671793256,
32.936319950968,
33.8184517389163,
21.5938150603324,
22.8182014194317
),
third = c(
37.9402944352478,
47.8147878032178,
48.8590325415134,
23.1802612892352,
29.9195193173364
)
),
class = "data.frame",
row.names = c(NA,-5L)
)
df_new <-
structure(
list(
ABV = c(-61,-80,-62,-69,-51),
EBC = c(-58,-67,-66,-78,-77)
),
class = c("rowwise_df", "tbl_df", "tbl",
"data.frame"),
row.names = c(NA,-5L),
groups = structure(
list(.rows = structure(
list(1L, 2L, 3L, 4L, 5L),
ptype = integer(0),
class = c("vctrs_list_of",
"vctrs_vctr", "list")
)),
row.names = c(NA,-5L),
class = c("tbl_df",
"tbl", "data.frame")
)
)
I am trying to use map() to generate a list of ggplot objects that I will combine with ggpubr::ggarrange after.
The stock ggplot call works fine:
# library(tidyverse)
ggplot(Tile,aes(x = Site, y = Volume))+
geom_point(aes(color = Event))+
geom_line(aes(x = Site, y = Volume, group = Event))+
ggtitle(paste("Tile", 'Volume'))
output (desired plot):
Now I use map() with what should be essentially the same ggplot call, only slightly modified:
x<-map(names(Tile)[-(1:2)], ~
ggplot(Tile,aes(x = Site, y = .x))+
geom_point(aes(color = Event))+
geom_line(aes(x = Site, y = .x, group = Event))+
ggtitle(paste("Tile", as.character(.x)))
)
which gives:
x[[5]]
What am I missing? Thanks.
data:
Tile<-structure(list(Event = c("10/17/2019", "10/17/2019", "10/23/2019",
"10/23/2019", "10/27/2019", "10/27/2019", "10/31/2019", "10/31/2019",
"11/24/2019", "11/24/2019", "11/28/2019", "11/28/2019", "12/10/2019",
"12/10/2019", "12/15/2019", "12/15/2019", "12/28/2019", "12/28/2019",
"12/30/2019", "12/30/2019", "1/3/2020", "1/3/2020", "1/12/2020",
"1/12/2020", "1/26/2020", "1/26/2020", "3/3/2020", "3/3/2020",
"3/8/2020", "3/8/2020", "3/13/2020", "3/13/2020", "5/12/2020",
"5/12/2020", "8/5/2020", "8/5/2020", "9/30/2020", "9/30/2020",
"12/1/2020", "12/1/2020", "12/25/2020", "12/25/2020", "1/17/2021",
"1/17/2021", "3/11/2021", "3/11/2021", "4/16/2021", "4/16/2021",
"4/22/2021", "4/22/2021", "4/30/2021", "4/30/2021", "5/6/2021",
"5/6/2021", "7/2/2021", "7/2/2021", "7/3/2021", "7/3/2021", "7/9/2021",
"7/9/2021", "7/13/2021", "7/13/2021", "7/14/2021", "7/14/2021",
"7/18/2021", "7/18/2021", "7/19/2021", "7/19/2021", "7/21/2021",
"7/21/2021", "7/31/2021", "7/31/2021", "8/2/2021", "8/2/2021",
"8/20/2021", "8/20/2021", "9/9/2021", "9/9/2021", "9/24/2021",
"9/24/2021", "10/17/2021", "10/17/2021", "10/22/2021", "10/22/2021",
"10/25/2021", "10/25/2021", "10/27/2021", "10/27/2021", "11/1/2021",
"11/1/2021"), Site = structure(c(3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L), .Label = c("DCNSUB", "DCNSUR", "DCSSUB", "DCSSUR"
), class = "factor"), TP.conc = c(1550, 2770, NA, NA, 1650, NA,
1810, NA, 666, 468, 1190, 1120, 574, 538, 487, 580, NA, 238,
610, 378, 398, 306, 744, 766, 447, 468, 504, 413, 384, 377, 714,
542, 927.2, 1000.1, 265.4, 285.1, 1527, 1764.5, NA, 460.9, NA,
469.8, NA, 172.8, 454, 432.8, 524.4, 476.4, 300, 303.6, 588.7,
598.1, 852.5, 797.6, 144.4, 122.1, 170.6, 110.1, 301.8, 328.8,
363.3, 498.5, 283.7, 104.9, 314.1, 327.6, 436.6, 262.1, 398.1,
358, 312, 251, 598, 831, 348, 456, 345, 240, 648, 949, 852, 1260,
643, 549, 712, 999, 982, 1100, 1240, 1555), TP.load = c(180.4,
NA, NA, NA, 67.6, NA, 201.5, NA, NA, NA, 53.3, 131.7, 12.1, 38.1,
7, 21.6, NA, NA, 21.2, 44.4, 6.1, 10.9, 79.7, 189.9, 10.5, 27.9,
84.2, 178.7, 13.6, 46.3, 14.4, 26.2, 11.4, 35, 4.2, 10.1, 4.6,
18.9, NA, 58.3, NA, 140.9, NA, 7.6, 181.8, 238.2, 72.5, 97.7,
18.5, 30.6, 121.4, 177.7, 114.1, 166.3, 22.8, 21.9, 15.1, 9.2,
25.8, 29.4, 7.6, 12.3, 3, 1.7, 58.5, 71, 23.3, 15.7, 32.7, 39.7,
3.1, 3.4, 67.2, 126, 49.1, 79.1, 8.6, 5.8, 8.7, 15.3, 38.62755,
62.40857143, NA, NA, NA, NA, NA, NA, NA, NA), SRP.conc = c(NA,
NA, NA, NA, 403, NA, NA, NA, NA, NA, NA, NA, 245, 234, 238, 197,
NA, 118, NA, NA, NA, NA, NA, 270, 121, 135, NA, NA, NA, NA, NA,
NA, 596.7, 635.6, 48, 85.9, 514.8, 572.7, NA, 161.5, NA, 163.3,
NA, 46.4, 96.9, 127, 83.1, 92.3, 53.5, 60.7, 111.7, 133.7, 132.2,
164.1, 50.1, 49.1, 54, 42.5, 122.5, 131.9, 104.2, 194.5, 84.6,
34.8, 90.2, 106.6, NA, NA, 129.9, 118.2, 62.2, 84.7, 105, 152,
92.6, 66, 45.9, 50.5, 66.2, 167, 264, 412, 203, 175, 352, 560,
503, 625, 621, 836), SRP.load = c(NA, NA, NA, NA, 16.5, NA, NA,
NA, NA, NA, NA, NA, 5.2, 16.6, 3.4, 7.3, NA, NA, NA, NA, NA,
NA, NA, 66.9, 2.8, 8, NA, NA, NA, NA, NA, NA, 7.3, 22.2, 0.8,
3.1, 1.6, 6.1, NA, 20.4, NA, 49, NA, 2, 38.8, 69.9, 11.5, 18.9,
3.3, 6.1, 23, 39.7, 17.7, 34.2, 7.9, 8.8, 4.8, 3.6, 10.5, 11.8,
2.2, 4.8, 0.9, 0.6, 16.8, 23.1, NA, NA, 10.7, 13.1, 0.6, 1.2,
11.8, 23, 13.1, 11.4, 1.1, 1.2, 0.9, 2.7, 12, 20.4, NA, NA, NA,
NA, NA, NA, NA, NA), Volume = c(11.64, NA, 1.87, 4.5, 4.1, 9.69,
11.13, 34, NA, NA, 4.48, 11.76, 2.1, 7.08, 1.45, 3.73, NA, NA,
3.47, 11.74, 1.52, 3.56, 10.71, 24.79, 2.34, 5.96, 16.71, 43.28,
3.54, 12.29, 2.02, 4.84, 1.22, 3.5, 1.59, 3.56, 0.3, 1.07, NA,
12.66, NA, 29.99, NA, 4.37, 40.04, 55.03, 13.82, 20.51, 6.18,
10.07, 20.62, 29.72, 13.38, 20.85, 15.76, 17.96, 8.82, 8.36,
8.56, 8.94, 2.1, 2.46, 1.07, 1.58, 18.64, 21.67, 5.33, 6, 8.22,
11.09, 0.99, 1.36, 11.23, 15.16, 14.11, 17.34, 2.48, 2.4, 1.34,
1.61, 4.53, 4.95, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(1L,
4L, 5L, 8L, 9L, 12L, 13L, 16L, 17L, 20L, 21L, 24L, 25L, 28L,
29L, 32L, 33L, 36L, 37L, 40L, 41L, 44L, 45L, 48L, 49L, 52L, 53L,
56L, 57L, 60L, 61L, 64L, 65L, 68L, 69L, 72L, 73L, 76L, 77L, 80L,
81L, 84L, 85L, 88L, 89L, 92L, 93L, 96L, 97L, 100L, 101L, 104L,
105L, 108L, 109L, 112L, 113L, 116L, 117L, 120L, 121L, 124L, 125L,
128L, 129L, 132L, 133L, 136L, 137L, 140L, 141L, 144L, 145L, 148L,
149L, 152L, 153L, 156L, 157L, 160L, 161L, 164L, 165L, 168L, 169L,
172L, 173L, 176L, 177L, 180L), class = "data.frame")
Here .x is an individual column name held as a string, whereas aes() expects a symbol, i.e. an unquoted column name. Use the .data pronoun to subset the column by name; another option is to convert the string to a symbol and evaluate it with !!. Of the two, .data is the better choice.
library(purrr)
library(ggplot2)
x <- map(names(Tile)[-(1:2)], ~
ggplot(Tile,aes(x = Site, y = .data[[.x]]))+
geom_point(aes(color = Event))+
geom_line(aes(x = Site, y = .data[[.x]], group = Event))+
ggtitle(paste("Surface", .x))
)
-output
x[[5]]
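The other option mentioned above, converting the string to a symbol and unquoting it with !!, could look like the sketch below. The toy data frame and its values are made up for illustration (they are not taken from Tile), and lapply() stands in for map() only to keep the dependencies small.

```r
library(ggplot2)

# Hypothetical toy data standing in for Tile
toy <- data.frame(
  Event   = rep(c("e1", "e2", "e3"), each = 2),
  Site    = rep(c("DCSSUB", "DCNSUB"), times = 3),
  Volume  = c(1.5, 3.7, 2.1, 7.1, 10.7, 24.8),
  TP.load = c(7, 21.6, 12.1, 38.1, 79.7, 189.9)
)

# Every column except the two identifier columns
cols <- setdiff(names(toy), c("Event", "Site"))

# rlang::sym() turns the column-name string into a symbol;
# !! unquotes that symbol inside aes()
plots <- lapply(cols, function(col) {
  ggplot(toy, aes(x = Site, y = !!rlang::sym(col))) +
    geom_point(aes(color = Event)) +
    geom_line(aes(group = Event)) +
    ggtitle(paste("Tile", col))
})
names(plots) <- cols
```

Naming the list by column means a plot can be pulled out as plots[["Volume"]] rather than by position.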
I want to run four linear regressions from four metrics from two sites.
Site DCNSUB is the response variable and DCSSUB is the predictor in the regression.
I only want to regress where I have a complete pair of data for an event.
I do this for one metric at a time using the following dplyr pipe:
mV<-Tile%>% # model V for Volume
select(Event, Site, Volume)%>%
group_by(Event)%>%
filter(!any(is.na(all_of(Volume))))%>% # group by event and remove pairs that are missing volume
ungroup()%>%
mutate(Volume = log(Volume))%>% # take log
pivot_wider(names_from = Site,values_from = Volume) %>% # get responsive and predictor variable data into columns
as.data.frame(.)%>%
lm(DCNSUB~DCSSUB, data = .)
How can I incorporate this into a for loop, where each iteration puts a different metric where 'Volume' is in the pipe? Here is my attempt:
for (i in names(Tile[-c(1,2)])){
mX<-Tile%>%
select(Event, Site, i)%>%
group_by(Event)%>%
filter(!any(is.na(all_of(i))))%>% # remove pairs that are missing i, note group by event helps removes the pairs
ungroup()%>%
mutate(i = log(i))%>% # take log
pivot_wider(names_from = Site,values_from = i) %>%# get responsive and predictor variable data into columns
as.data.frame(.)%>%
lm(DCNSUB~DCSSUB, data = .)
}
There have been other posts that use column indexing to refer to columns, but this doesn't work when I try to mix it with the columns I want to keep constant in each iteration. Also, those solutions are for much less complicated pipes. Any help is appreciated, thanks.
data:
Tile<-structure(list(Event = c("10/17/2019", "10/17/2019", "10/23/2019",
"10/23/2019", "10/27/2019", "10/27/2019", "10/31/2019", "10/31/2019",
"11/24/2019", "11/24/2019", "11/28/2019", "11/28/2019", "12/10/2019",
"12/10/2019", "12/15/2019", "12/15/2019", "12/28/2019", "12/28/2019",
"12/30/2019", "12/30/2019", "1/3/2020", "1/3/2020", "1/12/2020",
"1/12/2020", "1/26/2020", "1/26/2020", "3/3/2020", "3/3/2020",
"3/8/2020", "3/8/2020", "3/13/2020", "3/13/2020", "5/12/2020",
"5/12/2020", "8/5/2020", "8/5/2020", "9/30/2020", "9/30/2020",
"12/1/2020", "12/1/2020", "12/25/2020", "12/25/2020", "1/17/2021",
"1/17/2021", "3/11/2021", "3/11/2021", "4/16/2021", "4/16/2021",
"4/22/2021", "4/22/2021", "4/30/2021", "4/30/2021", "5/6/2021",
"5/6/2021", "7/2/2021", "7/2/2021", "7/3/2021", "7/3/2021", "7/9/2021",
"7/9/2021", "7/13/2021", "7/13/2021", "7/14/2021", "7/14/2021",
"7/18/2021", "7/18/2021", "7/19/2021", "7/19/2021", "7/21/2021",
"7/21/2021", "7/31/2021", "7/31/2021", "8/2/2021", "8/2/2021",
"8/20/2021", "8/20/2021", "9/9/2021", "9/9/2021", "9/24/2021",
"9/24/2021", "10/17/2021", "10/17/2021", "10/22/2021", "10/22/2021",
"10/25/2021", "10/25/2021", "10/27/2021", "10/27/2021", "11/1/2021",
"11/1/2021"), Site = structure(c(3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 1L), .Label = c("DCNSUB", "DCNSUR", "DCSSUB", "DCSSUR"
), class = "factor"), TP.conc = c(1550, 2770, NA, NA, 1650, NA,
1810, NA, 666, 468, 1190, 1120, 574, 538, 487, 580, NA, 238,
610, 378, 398, 306, 744, 766, 447, 468, 504, 413, 384, 377, 714,
542, 927.2, 1000.1, 265.4, 285.1, 1527, 1764.5, NA, 460.9, NA,
469.8, NA, 172.8, 454, 432.8, 524.4, 476.4, 300, 303.6, 588.7,
598.1, 852.5, 797.6, 144.4, 122.1, 170.6, 110.1, 301.8, 328.8,
363.3, 498.5, 283.7, 104.9, 314.1, 327.6, 436.6, 262.1, 398.1,
358, 312, 251, 598, 831, 348, 456, 345, 240, 648, 949, 852, 1260,
643, 549, 712, 999, 982, 1100, 1240, 1555), TP.load = c(180.4,
NA, NA, NA, 67.6, NA, 201.5, NA, NA, NA, 53.3, 131.7, 12.1, 38.1,
7, 21.6, NA, NA, 21.2, 44.4, 6.1, 10.9, 79.7, 189.9, 10.5, 27.9,
84.2, 178.7, 13.6, 46.3, 14.4, 26.2, 11.4, 35, 4.2, 10.1, 4.6,
18.9, NA, 58.3, NA, 140.9, NA, 7.6, 181.8, 238.2, 72.5, 97.7,
18.5, 30.6, 121.4, 177.7, 114.1, 166.3, 22.8, 21.9, 15.1, 9.2,
25.8, 29.4, 7.6, 12.3, 3, 1.7, 58.5, 71, 23.3, 15.7, 32.7, 39.7,
3.1, 3.4, 67.2, 126, 49.1, 79.1, 8.6, 5.8, 8.7, 15.3, 38.62755,
62.40857143, NA, NA, NA, NA, NA, NA, NA, NA), SRP.conc = c(NA,
NA, NA, NA, 403, NA, NA, NA, NA, NA, NA, NA, 245, 234, 238, 197,
NA, 118, NA, NA, NA, NA, NA, 270, 121, 135, NA, NA, NA, NA, NA,
NA, 596.7, 635.6, 48, 85.9, 514.8, 572.7, NA, 161.5, NA, 163.3,
NA, 46.4, 96.9, 127, 83.1, 92.3, 53.5, 60.7, 111.7, 133.7, 132.2,
164.1, 50.1, 49.1, 54, 42.5, 122.5, 131.9, 104.2, 194.5, 84.6,
34.8, 90.2, 106.6, NA, NA, 129.9, 118.2, 62.2, 84.7, 105, 152,
92.6, 66, 45.9, 50.5, 66.2, 167, 264, 412, 203, 175, 352, 560,
503, 625, 621, 836), SRP.load = c(NA, NA, NA, NA, 16.5, NA, NA,
NA, NA, NA, NA, NA, 5.2, 16.6, 3.4, 7.3, NA, NA, NA, NA, NA,
NA, NA, 66.9, 2.8, 8, NA, NA, NA, NA, NA, NA, 7.3, 22.2, 0.8,
3.1, 1.6, 6.1, NA, 20.4, NA, 49, NA, 2, 38.8, 69.9, 11.5, 18.9,
3.3, 6.1, 23, 39.7, 17.7, 34.2, 7.9, 8.8, 4.8, 3.6, 10.5, 11.8,
2.2, 4.8, 0.9, 0.6, 16.8, 23.1, NA, NA, 10.7, 13.1, 0.6, 1.2,
11.8, 23, 13.1, 11.4, 1.1, 1.2, 0.9, 2.7, 12, 20.4, NA, NA, NA,
NA, NA, NA, NA, NA), Volume = c(11.64, NA, 1.87, 4.5, 4.1, 9.69,
11.13, 34, NA, NA, 4.48, 11.76, 2.1, 7.08, 1.45, 3.73, NA, NA,
3.47, 11.74, 1.52, 3.56, 10.71, 24.79, 2.34, 5.96, 16.71, 43.28,
3.54, 12.29, 2.02, 4.84, 1.22, 3.5, 1.59, 3.56, 0.3, 1.07, NA,
12.66, NA, 29.99, NA, 4.37, 40.04, 55.03, 13.82, 20.51, 6.18,
10.07, 20.62, 29.72, 13.38, 20.85, 15.76, 17.96, 8.82, 8.36,
8.56, 8.94, 2.1, 2.46, 1.07, 1.58, 18.64, 21.67, 5.33, 6, 8.22,
11.09, 0.99, 1.36, 11.23, 15.16, 14.11, 17.34, 2.48, 2.4, 1.34,
1.61, 4.53, 4.95, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(1L,
4L, 5L, 8L, 9L, 12L, 13L, 16L, 17L, 20L, 21L, 24L, 25L, 28L,
29L, 32L, 33L, 36L, 37L, 40L, 41L, 44L, 45L, 48L, 49L, 52L, 53L,
56L, 57L, 60L, 61L, 64L, 65L, 68L, 69L, 72L, 73L, 76L, 77L, 80L,
81L, 84L, 85L, 88L, 89L, 92L, 93L, 96L, 97L, 100L, 101L, 104L,
105L, 108L, 109L, 112L, 113L, 116L, 117L, 120L, 121L, 124L, 125L,
128L, 129L, 132L, 133L, 136L, 137L, 140L, 141L, 144L, 145L, 148L,
149L, 152L, 153L, 156L, 157L, 160L, 161L, 164L, 165L, 168L, 169L,
172L, 173L, 176L, 177L, 180L), class = "data.frame")
We can loop with map() or lapply() instead. A for loop needs a list created beforehand to store the output (which is certainly possible), whereas the output of map()/lapply() is already a list.
library(purrr)
library(dplyr)
library(tidyr)
# // loop over the names
out <- map(names(Tile)[-(1:2)], ~ Tile %>%
# // select the columns of interest along with looped column names
select(Event, Site, all_of(.x))%>%
# // grouped by Event and remove groups based on the NA in the looped column
group_by(Event)%>%
filter(!any(is.na(.data[[.x]])))%>%
ungroup()%>%
# // convert the column looped to its `log`
mutate(!! .x := log(.data[[.x]]))%>%
# // reshape from long to wide
pivot_wider(names_from = Site,values_from = all_of(.x)) %>%
# // build the linear model
lm(DCNSUB~DCSSUB, data = .)
)
-output
> out
[[1]]
Call:
lm(formula = DCNSUB ~ DCSSUB, data = .)
Coefficients:
(Intercept) DCSSUB
-1.475 1.231
[[2]]
Call:
lm(formula = DCNSUB ~ DCSSUB, data = .)
Coefficients:
(Intercept) DCSSUB
0.5418 0.9812
[[3]]
Call:
lm(formula = DCNSUB ~ DCSSUB, data = .)
Coefficients:
(Intercept) DCSSUB
0.09282 1.00866
[[4]]
Call:
lm(formula = DCNSUB ~ DCSSUB, data = .)
Coefficients:
(Intercept) DCSSUB
0.7099 0.9064
[[5]]
Call:
lm(formula = DCNSUB ~ DCSSUB, data = .)
Coefficients:
(Intercept) DCSSUB
0.8000 0.8535
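If a for loop is still preferred, the same pipe should work once the results list is created up front and the looped column is referenced through all_of() and .data. This is a minimal sketch on made-up toy data (the values and the names toy, metrics and fits are hypothetical), with across() used here as one way to take the log of the looped column:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per site per event, two metrics
toy <- data.frame(
  Event   = rep(paste0("e", 1:6), each = 2),
  Site    = rep(c("DCSSUB", "DCNSUB"), times = 6),
  Volume  = c(1.5, 3.7, 2.1, 7.1, NA, 4.5, 10.7, 24.8, 3.5, 12.3, 16.7, 43.3),
  TP.load = c(7, 21.6, 12.1, 38.1, 6.1, 10.9, 79.7, 189.9, NA, 46.3, 84.2, 178.7)
)

metrics <- setdiff(names(toy), c("Event", "Site"))

# Pre-allocate a named list to hold one model per metric
fits <- vector("list", length(metrics))
names(fits) <- metrics

for (i in metrics) {
  wide <- toy %>%
    select(Event, Site, all_of(i)) %>%
    group_by(Event) %>%
    # drop events where either site is missing the current metric
    filter(!any(is.na(.data[[i]]))) %>%
    ungroup() %>%
    # log-transform the current metric
    mutate(across(all_of(i), log)) %>%
    # one column per site
    pivot_wider(names_from = Site, values_from = all_of(i))

  fits[[i]] <- lm(DCNSUB ~ DCSSUB, data = wide)
}
```

Each model is then available by metric name, e.g. fits[["Volume"]].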
From the list output, either use tidy() (from broom) to convert each model to a tibble, or extract the components separately by looping over the list (this could also be done inside the same map() call used earlier):
map_dfr(out, ~ {
v1 <- summary(.x)
tibble(pval = v1$coefficients[,4][2], MSE = v1$sigma^2)})
# A tibble: 5 × 2
pval MSE
<dbl> <dbl>
1 7.23e-17 0.0773
2 5.81e-13 0.267
3 3.49e-12 0.120
4 3.77e-10 0.238
5 2.10e-15 0.156
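The broom route mentioned above might look like the following sketch. The list fits is a hypothetical stand-in for out, built from the built-in mtcars data so the example runs on its own.

```r
library(broom)
library(dplyr)

# Hypothetical stand-in for the list of fitted models
fits <- list(
  wt = lm(mpg ~ wt, data = mtcars),
  hp = lm(mpg ~ hp, data = mtcars)
)

# tidy() returns one row per coefficient; bind_rows() stacks the tibbles
# and records the list names in a "model" column
coefs <- bind_rows(lapply(fits, tidy), .id = "model")

# glance() returns one row of model-level statistics (sigma, r.squared, ...)
stats <- bind_rows(lapply(fits, glance), .id = "model")
```

The slope p-values are then the p.value entries of coefs for the non-intercept terms, and stats$sigma^2 corresponds to the MSE column computed by hand above.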
For my current projects I'm repeatedly specifying regression models with differing amounts of predictors/covariates on different outcomes. Right now I'm just writing out each model in full, but I'm sure there is a (very much) faster way requiring less code to do what I'm doing.
My example data is a repeated-measurements dataset of 24 stroke patients in which I assess the effect of three different types of rehabilitation (Group) on functional recovery scores (Outcome 1 to Outcome 4). Each patient's functional ability was measured weekly (Time_num) for 8 weeks:
library(tidyverse)
library(magrittr)
library(nlme)
mydata <- structure(list(Subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L,
13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L,
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L,
16L, 16L, 16L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L,
18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L,
19L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 23L,
23L, 23L, 23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 24L,
24L, 24L), Age = c(60, 52.5, 57.1, 63, 65.1, 39, 59.3, 65.3,
61.4, 56.3, 46.4, 58.2, 58, 57.7, 56.6, 42.3, 52.5, 51.8, 43.2,
50.9, 56.7, 67.5, 65, 56.5, 65.5, 45.6, 56.7, 47.9, 65.5, 46.6,
68.6, 52.1, 43.1, 62.1, 62.9, 58.3, 49.6, 42.1, 59.7, 62.9, 56.2,
71.7, 60.5, 59.8, 54.3, 76.1, 56.2, 74.3, 48.7, 69.9, 59.6, 58.4,
55.9, 56.5, 33, 57.1, 63, 53.1, 51.3, 46.9, 57.2, 47, 58, 63.7,
69.8, 57.9, 62.7, 44.8, 51.5, 57, 58.1, 53.3, 57.2, 54.2, 50.2,
60.4, 61.1, 81.3, 59.6, 68.8, 49.2, 51, 53.5, 55.9, 66.7, 60.3,
59.8, 61.6, 63.8, 59.8, 55.5, 57.7, 66.3, 54.7, 56.3, 56.7, 57.7,
63.8, 53.5, 56.1, 49, 44.5, 36, 58.2, 50.8, 56.8, 47.9, 51.1,
53.2, 53.4, 59.3, 42.8, 63.6, 51.2, 49, 62.6, 44.8, 59.9, 44.7,
56, 54.3, 58.7, 62.2, 76.7, 31.4, 65.2, 52.8, 56.7, 52.4, 60.6,
54.8, 43.2, 77.6, 58.1, 49.8, 55.2, 53.6, 54.1, 72.9, 58.7, 51.9,
64.9, 56.6, 61, 71.3, 63.1, 57.4, 56.9, 53.8, 73, 58.9, 60.7,
63.8, 54.6, 74.5, 46.7, 44.2, 56.3, 66.8, 56.5, 43.6, 62.8, 55.3,
53.7, 54.9, 46.6, 51.8, 60.7, 62.9, 61.5, 61.6, 43.6, 66.8, 50.1,
51.6, 69.9, 52.2, 58.1, 62.1, 69.2, 59.1, 55.2, 47.2, 64.5, 54.2,
75.9, 52.9, 62.5, 58, 64.5, 70.7, 60.5), Sex = structure(c(1L,
2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L,
2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Male",
"Female"), class = "factor"), Group = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C"), class = "factor"), Time_num = c(1, 2, 3, 4, 5, 6,
7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3,
4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8,
1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5,
6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2,
3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7,
8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4,
5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1,
2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6,
7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8), First_outcome = c(45L,
45L, 45L, 45L, 80L, 80L, 80L, 90L, 20L, 25L, 25L, 25L, 30L, 35L,
30L, 50L, 50L, 50L, 55L, 70L, 70L, 75L, 90L, 90L, 25L, 25L, 35L,
40L, 60L, 60L, 70L, 80L, 100L, 100L, 100L, 100L, 100L, 100L,
100L, 100L, 20L, 20L, 30L, 50L, 50L, 60L, 85L, 95L, 30L, 35L,
35L, 40L, 50L, 60L, 75L, 85L, 30L, 35L, 45L, 50L, 55L, 65L, 65L,
70L, 40L, 55L, 60L, 70L, 80L, 85L, 90L, 90L, 65L, 65L, 70L, 70L,
80L, 80L, 80L, 80L, 30L, 30L, 40L, 45L, 65L, 85L, 85L, 85L, 25L,
35L, 35L, 35L, 40L, 45L, 45L, 45L, 45L, 45L, 80L, 80L, 80L, 80L,
80L, 80L, 15L, 15L, 10L, 10L, 10L, 20L, 20L, 20L, 35L, 35L, 35L,
45L, 45L, 45L, 50L, 50L, 40L, 40L, 40L, 55L, 55L, 55L, 60L, 65L,
20L, 20L, 30L, 30L, 30L, 30L, 30L, 30L, 35L, 35L, 35L, 40L, 40L,
40L, 40L, 40L, 35L, 35L, 35L, 40L, 40L, 40L, 45L, 45L, 45L, 65L,
65L, 65L, 80L, 85L, 95L, 100L, 45L, 65L, 70L, 90L, 90L, 95L,
95L, 100L, 25L, 30L, 30L, 35L, 40L, 40L, 40L, 40L, 25L, 25L,
30L, 30L, 30L, 30L, 35L, 40L, 15L, 35L, 35L, 35L, 40L, 50L, 65L,
65L), Second_outcome = c(3, 50, 7, 43, -23, 32, 48, 46, 32, 46,
23, 34, 46, -2, 46, 49, 45, 44, 53, 1, 61, 23, 41, 52, 25, 54,
26, -1, 22, 50, 21, 20, 70, 62, 67, 18, 55, 25, 5, 16, 43, 35,
59, 5, -5, 50, 35, 32, 25, 25, 13, 57, 42, 21, 35, 34, 38, 52,
63, 52, 44, 36, 32, 30, 26, 42, 44, 53, 39, 29, 13, 37, 41, 31,
18, 41, 40, 29, 28, 22, 6, -15, 16, 26, 0, 41, 35, 28, 35, 32,
41, 49, 16, 43, 56, 63, 14, 46, 43, 46, 36, -3, 49, 33, 49, 20,
20, 31, 27, 23, 34, 36, 39, 20, 29, 58, 45, 60, 40, 17, 77, 45,
13, 62, 43, 74, 47, 56, 13, 12, 36, 2, 40, 57, 35, 31, 28, 82,
49, 6, 10, 46, 49, 17, 55, 16, 12, -17, -7, 22, 20, -14, 21,
17, 41, 47, 25, 34, 72, 59, 26, 24, 46, 16, 35, 34, 51, 40, 25,
53, 24, 14, 66, 18, 18, 34, 29, 81, 12, 50, 55, 33, 62, 38, 24,
25, 29, 60, 71, -6, 60, 49), Third_outcome = c(87, 78, 94, 93,
78, 84, 72, 81, 82, 81, 86, 72, 80, 82, 77, 82, 79, 71, 82, 79,
86, 86, 76, 73, 80, 74, 81, 73, 81, 80, 65, 84, 73, 85, 87, 78,
77, 70, 85, 80, 77, 73, 75, 85, 67, 87, 90, 84, 71, 73, 81, 72,
74, 74, 85, 90, 75, 70, 81, 69, 81, 73, 79, 74, 76, 77, 82, 80,
87, 87, 82, 81, 76, 80, 79, 71, 81, 77, 74, 78, 73, 79, 77, 78,
94, 78, 71, 82, 81, 80, 79, 70, 68, 82, 78, 68, 66, 82, 80, 71,
73, 79, 83, 71, 80, 78, 82, 73, 86, 76, 75, 81, 84, 84, 85, 80,
83, 79, 75, 77, 82, 89, 78, 74, 79, 82, 73, 86, 77, 81, 84, 84,
73, 80, 82, 81, 81, 83, 81, 79, 84, 82, 75, 75, 80, 67, 81, 82,
82, 80, 80, 80, 76, 81, 82, 85, 86, 81, 89, 78, 84, 79, 80, 77,
85, 88, 78, 81, 82, 81, 82, 77, 74, 86, 81, 73, 80, 77, 81, 76,
83, 76, 81, 79, 76, 83, 77, 79, 71, 77, 82, 87), Fourth_outcome = c(59,
36, 53, 51, 59, 50, 56, 57, 52, 42, 60, 44, 46, 52, 54, 68, 63,
37, 51, 46, 67, 42, 63, 47, 41, 48, 51, 48, 51, 34, 35, 46, 52,
52, 44, 67, 47, 58, 57, 55, 50, 56, 36, 42, 51, 51, 42, 49, 59,
55, 44, 53, 42, 64, 75, 64, 41, 44, 39, 64, 40, 48, 51, 54, 42,
52, 35, 55, 53, 66, 34, 50, 56, 35, 32, 63, 52, 35, 63, 38, 57,
67, 35, 41, 47, 31, 55, 60, 52, 60, 44, 52, 63, 53, 48, 69, 43,
44, 40, 45, 63, 39, 48, 56, 44, 57, 56, 62, 54, 49, 47, 62, 41,
41, 59, 32, 62, 39, 64, 46, 44, 78, 68, 38, 51, 27, 57, 55, 67,
51, 44, 61, 24, 49, 62, 61, 43, 41, 54, 47, 41, 28, 40, 31, 57,
58, 36, 48, 58, 61, 67, 50, 47, 56, 56, 69, 43, 43, 58, 55, 48,
52, 46, 51, 38, 58, 44, 43, 49, 59, 31, 37, 46, 55, 45, 50, 45,
67, 48, 37, 51, 47, 66, 42, 52, 46, 61, 47, 34, 49, 58, 38)), row.names = c(NA,
-192L), class = c("tbl_df", "tbl", "data.frame"))
Which looks as follows:
head(mydata)
# A tibble: 6 x 9
Subject Age Sex Group Time_num First_outcome Second_outcome Third_outcome Fourth_outcome
<int> <dbl> <fct> <fct> <dbl> <int> <dbl> <dbl> <dbl>
1 1 60 Male A 1 45 3 87 59
2 1 52.5 Female A 2 45 50 78 36
3 1 57.1 Female A 3 45 7 94 53
4 1 63 Male A 4 45 43 93 51
5 1 65.1 Male A 5 80 -23 78 59
6 1 39 Female A 6 80 32 84 50
The models that I run now are 2 linear mixed effects models per outcome (using nlme::lme): one just containing Group and one additionally containing Age and Sex. How I do this now is:
# Outcome 1
outcome1_modelA <-
lme(fixed=First_outcome ~ 1 + Time_num*Group,
random= ~1 + Time_num|Subject,
data=mydata,
na.action="na.omit",
method="ML")
outcome1_modelB <-
lme(fixed=First_outcome ~ 1 + Time_num*Group + Time_num*Age + Time_num*Sex,
random= ~1 + Time_num|Subject,
data=mydata,
na.action="na.omit",
method="ML")
# Outcome 2, 3, and finally...
# Outcome 4
outcome4_modelA <-
lme(fixed=Fourth_outcome ~ 1 + Time_num*Group,
random= ~1 + Time_num|Subject,
data=mydata,
na.action="na.omit",
method="ML")
outcome4_modelB <-
lme(fixed=Fourth_outcome ~ 1 + Time_num*Group + Time_num*Age + Time_num*Sex,
random= ~1 + Time_num|Subject,
data=mydata,
na.action="na.omit",
method="ML")
But seeing as I've got even more outcomes and also more models, I'd like to learn a way to make my code more efficient. I've read about for-loops but can't seem to find examples that work for me. Solutions not involving for-loops would also be greatly appreciated!
Create a function with parameters for the parts you want to change; you'll need the as.formula() function to build the formula from strings:
my_model_function <- function(x, y){
fixed_effects <- as.formula(paste(y, "~ 1 +",
paste("Time_num", x, sep="*", collapse=" + ")))
lme(fixed=fixed_effects,
random= ~1 + Time_num|Subject,
data=mydata,
na.action="na.omit",
method="ML")
}
outcome1_modelA <- my_model_function(x = "Group",
y = "First_outcome")
outcome1_modelB <- my_model_function(x = c("Group", "Age", "Sex"),
y = "First_outcome")
To automate it further, you can create nested loops with lapply(), which will return nested lists of your model outputs.
x_values <- list("Group", c("Group", "Age", "Sex"))
y_values <- list("First_outcome", "Second_outcome")
lapply(x_values, function(x_value){
lapply(y_values, function(y_value){
my_model_function(x_value, y_value)
})
})
You could also drop the user-defined function here: there is really no need for it, since lapply() already lets the same piece of code be called repeatedly. (I've left it in because I am writing on a phone and it's too much editing, but the idea is something akin to:)
lapply(list(1,2,3), function(i){
i^2
})
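The nested lapply() returns unnamed nested lists, which can be awkward to index. One possible alternative, sketched here under assumptions, is to enumerate every outcome/predictor-set combination with expand.grid() and keep a flat, named list. The names build_fixed, predictor_sets and grid are hypothetical, and only the formulas are built here so the example runs without the data:

```r
# Same formula construction as in my_model_function() above
build_fixed <- function(x, y) {
  as.formula(paste(y, "~ 1 +",
                   paste("Time_num", x, sep = "*", collapse = " + ")))
}

outcomes <- c("First_outcome", "Second_outcome")
predictor_sets <- list(A = "Group", B = c("Group", "Age", "Sex"))

# One row per outcome/predictor-set combination
grid <- expand.grid(outcome = outcomes, model = names(predictor_sets),
                    stringsAsFactors = FALSE)

# Build one formula per row of the grid
formulas <- Map(
  function(outcome, model) build_fixed(predictor_sets[[model]], outcome),
  grid$outcome, grid$model
)
names(formulas) <- paste0(grid$outcome, "_model", grid$model)

# Each element could then be passed on as lme(fixed = formulas[[k]], ...)
```

With this layout, formulas[["First_outcome_modelB"]] holds the formula with Group, Age and Sex each interacting with Time_num.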
I am trying to create some smoothed histograms and contour plots using ggplot.
An excerpt/sample of my data is as follows:
structure(list(Year = c(14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L,
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 20L, 20L, 20L,
20L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L,
23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L,
25L, 25L, 25L, 25L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 28L,
28L), Period = c(1L, 3L, 3L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L,
3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 1L,
1L, 3L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 3L,
2L, 3L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 1L, 3L, 2L,
2L, 3L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 1L, 1L, 3L, 2L, 1L, 2L), Power = c(82, 82, 223.6,
164, 119, 74.5, 74.5, 279.5, 82, 67, 112, 149, 119, 119, 238.5,
205, 82, 119, 194, 336, 558.9, 287, 388, 164, 194, 194, 186.3,
119, 119, 89.4, 126.7, 149, 119, 536.6, 402, 298, 298, 342.8,
536, 223.6, 521.6, 186.3, 238.5, 287, 335.3, 335.3, 335.3, 335.3,
335.3, 335.3, 357.7, 313, 782.6, 298, 670.6, 223.5, 335.3, 391,
391, 436, 391, 436, 171.4, 350, 298, 223.6, 298, 634, 223.5,
864.4, 760, 503.5, 63.3, 357.7, 812, 335.3, 298, 298, 335.3,
298, 317, 231, 335.3, 432, 918, 745.2, 424.8, 372.6, 782, 626,
544, 335.3, 372.6, 373, 391.2, 864, 894, 179, 74.5, 391.2), Span = c(12.8,
11, 17.9, 14.5, 12.9, 7.5, 11.13, 14.3, 7.8, 11, 11.7, 12.8,
8.5, 13.3, 14.9, 12, 9.4, 15.95, 16.74, 22.2, 23.4, 14.3, 23.72,
11.9, 14.4, 14.4, 9.7, 8, 9.4, 14.55, 9.1, 8.11, 9.5, 20.73,
22.8, 38.4, 14, 26.5, 30.48, 9.7, 15.5, 9.1, 14.17, 10.1, 14.8,
15.62, 14.05, 14.05, 14.8, 15.24, 14, 12.24, 27.2, 8.84, 22.86,
7.7, 9.5, 9.8, 15.93, 15.93, 15.93, 15.93, 13.08, 15.21, 8.94,
9.6, 10.8, 13.72, 8.9, 26.72, 25, 9.6, 8.84, 11.58, 17.3, 12.5,
12.1, 12.09, 9.8, 15.3, 9.08, 17.75, 15.3, 15.15, 27.4, 22, 13.7,
10.3, 22.76, 22.25, 17.25, 11, 12, 9.5, 14.15, 20.4, 20.4, 14.5,
8.84, 11.35), Length = c(7.6, 9, 10.35, 9.8, 7.9, 6.3, 8.28,
9.4, 6.7, 8.3, 8, 8.7, 7.4, 9.6, 8.9, 7.9, 6.2, 10.25, 10.77,
10.9, 12.6, 9.4, 11.86, 9.8, 9.2, 8.9, 8, 6.5, 6.95, 9.83, 7.3,
6.38, 8.5, 13.27, 13.5, 20.85, 9.2, 14.33, 19.16, 6.5, 9.7, 8.1,
9.68, 7.7, 10.8, 11.89, 10.97, 11.28, 9.5, 11.42, 11, 7.3, 18.2,
7.01, 18.08, 6.8, 6.8, 7.1, 11.5, 11.5, 11.5, 11.5, 9.27, 9.78,
6.17, 6.4, 7.32, 10.74, 6.9, 18.97, 15.1, 7.06, 7.17, 9.5, 10.55,
8.38, 8.7, 8.81, 6.7, 9.42, 5.99, 10.27, 10.22, 11, 19.8, 14.63,
11.2, 6.56, 14.88, 13.81, 12.6, 7, 7.5, 7.2, 9.91, 14.8, 15,
9.8, 7.17, 8.94), Weight = c(1070, 830, 2200, 1946, 1190, 653,
930, 1575, 676, 920, 1353, 1550, 888, 1275, 1537, 1292, 611,
1350, 1700, 3312, 4920, 1510, 3625, 900, 1665, 1640, 1081, 625,
932, 1378, 886, 902, 1070, 5670, 3636, 12925, 2107, 4770, 6060,
1192, 1900, 1050, 2155, 1379, 2858, 3380, 2290, 2290, 2347, 3308,
2630, 1333, 10000, 1351, 6250, 885, 1531, 1438, 3820, 3820, 3820,
3820, 1905, 2646, 1151, 1266, 1575, 2383, 860, 7983, 6200, 1484,
567, 1867, 4350, 1935, 1823, 2253, 1487, 2220, 1244, 2700, 2280,
3652, 8165, 5500, 3568, 1414, 5875, 5460, 4310, 1500, 1795, 1628,
2449, 6900, 6900, 1900, 567, 2102), Speed = c(105L, 145L, 135L,
138L, 140L, 177L, 113L, 230L, 175L, 106L, 140L, 170L, 175L, 157L,
183L, 201L, 209L, 145L, 120L, 135L, 152L, 176L, 140L, 190L, 175L,
175L, 205L, 196L, 165L, 146L, 175L, 222L, 159L, 166L, 158L, 146L,
185L, 120L, 157L, 226L, 205L, 230L, 161L, 251L, 171L, 206L, 171L,
171L, 235L, 161L, 145L, 245L, 183L, 214L, 180L, 220L, 237L, 254L,
169L, 169L, 169L, 169L, 153L, 183L, 261L, 245L, 235L, 200L, 246L,
174L, 180L, 319L, 146L, 251L, 230L, 290L, 230L, 233L, 250L, 255L,
233L, 175L, 230L, 180L, 145L, 185L, 196L, 298L, 183L, 198L, 195L,
300L, 270L, 297L, 225L, 212L, 195L, 197L, 146L, 296L), Range = c(400L,
402L, 500L, 500L, 400L, 350L, 402L, 700L, 525L, 300L, 560L, 550L,
250L, 450L, 700L, 600L, 175L, 450L, 450L, 450L, 600L, 800L, 500L,
600L, 600L, 600L, 600L, 400L, 250L, 400L, 350L, 547L, 450L, 1770L,
800L, 2365L, 925L, 400L, 1205L, 580L, 600L, 600L, 684L, 402L,
563L, 644L, 885L, 885L, 800L, 440L, 557L, 750L, 3600L, 500L,
805L, 330L, 600L, 628L, 1640L, 1640L, 1640L, 1640L, 604L, 1046L,
644L, 500L, 600L, 1046L, 550L, 1585L, 650L, 917L, 515L, 805L,
750L, 1110L, 772L, 1127L, 500L, 850L, 523L, 850L, 900L, 700L,
668L, 700L, 1706L, 600L, 1385L, 1000L, 902L, 600L, 500L, 450L,
579L, 1125L, 1300L, 660L, 515L, 756L)), row.names = c(NA, 100L
), class = "data.frame")
I have the following code:
library(ggplot2)
data <- read.csv("data.csv")
data[,3:8] <- log10(data[,3:8])
# density plots
ggplot( data, aes( Power, group = Period, colour = Period ) ) + geom_density( aes( fill = Period ), alpha = 1 ) + ggtitle("All data")
ggplot( data, aes( Length, group = Period, colour = Period ) ) + geom_density( aes( fill = Period ), alpha = 1 ) + ggtitle("All data")
library(ggplot2)
data <- read.csv("data.csv")
data[,3:8] <- log10(data[,3:8])
ggplot( data, aes(Power, Weight ) ) +
geom_density_2d( ) +
geom_point( aes( colour = Period ), alpha = 3 ) +
theme( legend.position = "bottom")
ggplot( data, aes( Speed, Length ) ) +
geom_density_2d( ) +
geom_point( aes( colour = Period ), alpha = 3 ) +
theme( legend.position = "bottom")
This code produces these plots:
As you can see, Period 1.5 and 2.5 should not exist – only 1, 2, and 3. And, for the contour plots, I would like it to say "Period", rather than "colour", but the same code as used in the smoothed histograms does not seem to work. And lastly, is there a way to centre the heading, so that "All data" is in the middle?
The issue is caused by using a continuous variable for what essentially is a categorical mapping. By using a categorical or factor variable for such mapping you will get what you are after.
Easiest is to coerce that variable in the data set already:
library(dplyr)
data <- data %>%
mutate(Period = as.factor(Period))
Your plot will then look the way you expect (I also reduced the alpha value a little to get transparent density plots):