I wrote a function that takes a company name, looks up that company's records in a second table, calculates a complicated result, and returns it.
I want to process all companies and add the result to each company's record as a new column.
I am using the following code:
`aa <- mutate(companies,newcol=sum_rounds(companies$company_name))`
But I get the following warning:
Warning message:
In c("Bwom", "Symple", "TravelTriangle", "Ark Biosciences", "Artizan Biosciences", :
longer object length is not a multiple of shorter object length
(each of these is a company name)
The companies data frame gets a new column, but every value is FALSE, whereas there should be both TRUE and FALSE values.
Any advice would be welcome to a newbie.
Function follows:
sum_rounds<-function(co_name) {
#get records from rounds for the company name passed to the function
#remove NAs from column roundtype too
outval<- rounds %>%
filter(company_name.x==co_name & !is.na(roundtype)) %>%
#sort by date round is announced
arrange(announced_on) %>%
select(roundtype) %>%
#create a string of all round types in order
apply(2,paste,collapse="")
#the values from mixed to "M", venture to "V" and pureangel to "A"
# now see if it is of the form aaaaa (and #) followed by m or v
# in grep: ^ is start of a line and + is for at least one copy
# [mv] is either m or v
# nice summary is here: http://www.endmemo.com/program/R/gsub.php
#is angel2vc?
angel2vc<-grepl("^a+[mv]+",outval)
#return(list("roundcodes"=outval,"angel2vc"=angel2vc))
return(angel2vc)
}
DPUT from Companies table Follows:
structure(list(company_name = c("Bwom", "Symple", "TravelTriangle",
"Ark Biosciences", "Artizan Biosciences", "Audiense"), domain = c("b-wom.com",
"getsymple.com", "traveltriangle.com", "arkbiosciences.com",
NA, "audiense.com"), country_code = c("ESP", "USA", "USA", "CHN",
"USA", "GBR"), state_code = c(NA, "CA", "VA", NA, "NC", NA),
region = c("Barcelona", "SF Bay Area", "Washington, D.C.",
"Shanghai", "Raleigh", "London"), city = c("Barcelona", "San Francisco",
"Charlottesville", "Shanghai", "Durham", "London"), status = c("operating",
"operating", "operating", "operating", "operating", "operating"
), short_description = c("Bwom is a tool that offers a test and personalized exercises for women's intimate health.",
"Symple is the cloud platform for all your business payments. Pay, get paid, connect.",
"TravelTriangle enables travel enthusiasts to reserve a personalized holiday plan with a local travel agent.",
"Ark Biosciences is a biopharmaceutical company that is dedicated to the discovery and development",
"Artizan Biosciences", "SaaS developer delivering unique consumer insight and engagement capabilities to many of the world’s biggest brands and agencies."
), category_list = c("health care", "cloud computing|machine learning|mobile apps|mobile payments|retail technology",
"e-commerce|personalization|tourism|travel", "health care",
"biopharma", "analytics|apps|marketing|market research|social crm|social media|social media marketing"
), category_group_list = c("health care", "apps|commerce and shopping|data and analytics|financial services|hardware|internet services|mobile|payments|software",
"commerce and shopping|travel and tourism", "health care",
"biotechnology|health care|science and engineering", "apps|data and analytics|design|information technology|internet services|media and entertainment|sales and marketing|software"
), employee_count = c("1 to 10", "11 to 50", "101 to 250",
NA, "1 to 10", "51 to 100"), funding_rounds = c(2L, 1L, 4L,
2L, 2L, 5L), funding_total_usd = c(1075791, 120000, 19900000,
NA, 3e+06, 8013391), founded_on = structure(c(16555, 16770,
15156, 16071, NA, 14975), class = "Date"), first_funding_on = structure(c(16526,
17204, 15492, 16532, 17091, 15294), class = "Date"), last_funding_on = structure(c(17204,
17204, 17204, 17203, 17203, 17203), class = "Date"), closed_on = c(NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_), email = c("hello@b-wom.com", "info@getsymple.com",
"admin@traveltriangle.com", "info@arkbiosciences.com", NA,
"moreinfo@audiense.com"), phone = c(NA, NA, "'+91 98 99 120408",
"###############################################################################################################################################################################################################################################################",
NA, "###############################################################################################################################################################################################################################################################"
), cb_url = c("https://www.crunchbase.com/organization/bwom",
"https://www.crunchbase.com/organization/symple-2", "https://www.crunchbase.com/organization/traveltriangle-com",
"https://www.crunchbase.com/organization/ark-biosciences",
"https://www.crunchbase.com/organization/artizan-biosciences",
"https://www.crunchbase.com/organization/socialbro"), twitter_url = c("https://www.twitter.com/hellobwom",
NA, "https://www.twitter.com/traveltriangle", NA, NA, "https://www.twitter.com/socialbro"
), facebook_url = c("https://www.facebook.com/hellobwom/?fref=ts",
NA, "http://www.facebook.com/traveltriangle", NA, NA, "http://www.facebook.com/socialbro"
), uuid = c("e6096d58-3454-d982-0dbe-7de9b06cd493", "fd0ab78f-0dc4-1f18-21d1-7ce9ff7a173b",
"742043c1-c17a-4526-4ed0-e911e6e9555b", "8e27eb22-ce03-a2af-58ba-53f0f458f49c",
"ed07ac9e-1071-fca0-46d9-42035c2da505", "fed333e5-2754-7413-1e3d-5939d70541d2"
), isbio = c("other", "other", "other", "other", "bio", "other"
), co_type = c("m", "m", "m", "v", "v", "m")), .Names = c("company_name",
"domain", "country_code", "state_code", "region", "city", "status",
"short_description", "category_list", "category_group_list",
"employee_count", "funding_rounds", "funding_total_usd", "founded_on",
"first_funding_on", "last_funding_on", "closed_on", "email",
"phone", "cb_url", "twitter_url", "facebook_url", "uuid", "isbio",
"co_type"), row.names = c(NA, -6L), class = c("tbl_df", "tbl",
"data.frame"))
I'm using the row names function to track the production capacity of power-producing facilities by the fuel they use. When I create a barplot of the data, instead of a clean bar plot of the six fuel types I'm interested in, I get a plot that looks like this:
[image: bad bar plot]
When I reviewed my matrix, I found that my data looks like this:
[image not included]
Does anyone know how I can group this dataset effectively to fix my barplot?
Code used:
install.packages(c('ggplot2', 'tidyverse'))
library('tidyverse')
Power_Facilities<- read.csv('powerplants (global) - global_power_plants.csv')
drop<-c("secondary.fuel", "other_fuel2", "other_fuel3", "geolocation_source")
PF<-Power_Facilities[,!(names(Power_Facilities) %in% drop)]
PF<-subset(PF,PF$capacity.in.MW>2000)
PF$generated <-(ifelse(is.na (PF$generation_gwh_2021), paste(PF$estimated_generation_gwh_2021), PF$generation_gwh_2021))
PF$generated <-as.numeric(PF$generated)
#PF<- PF [!((PF$generated == "NA") | PF$generated==""), ]
#PF<- PF [!((PF$generated >1)),]
#PF<- PF [!((PF$capacity.in.MW<20)), ]
head(sort(PF$capacity.in.MW, decreasing = TRUE))
tail(sort(PF$capacity.in.MW, decreasing = TRUE))
head(sort(PF$generated, decreasing = TRUE))
tail(sort(PF$generated, decreasing = TRUE))
pf2<-PF%>%group_by(primary_fuel)summarize
barplot((PF2$capacity.in.MW), names.arg =pf2$primary_fuel)
barplot(t(power_matrix), beside = T, las=2, legend.text =T, col = c("blue", "grey"), ylim=c(0, 1000000))
summary(power_matrix)
structure(list(country.code = c("AUS", "AUS", "AUS", "AZE", "BHR",
"BLR", "BEL", "BEL", "BRA", "BRA"), country_long = c("Australia",
"Australia", "Australia", "Azerbaijan", "Bahrain", "Belarus",
"Belgium", "Belgium", "Brazil", "Brazil"), name.of.powerplant = c("Bayswater",
"Liddell", "Loy Yang A", "Azerbaijan TPP", "Alba Power Station",
"Lukoml Thermal Power Plant Belarus", "DOEL 4", "TIHANGE 3",
"Belo Monte", "Ilha Solteira"), capacity.in.MW = c(2640, 2200,
2180, 2400, 2204, 2460, 2910, 2053.8, 3327.45544, 3444), latitude = c(-32.3953,
-32.3713, -38.2536, 40.78, 26.0945, 54.6803, 51.3254, 50.5342,
-3.1264, -20.3822), longitude = c(150.9491, 150.9776, 146.5746,
46.9901, 50.6008, 29.1341, 4.2597, 5.2751, -51.775, -51.3636),
primary_fuel = c("Coal", "Coal", "Coal", "Oil", "Gas", "Gas",
"Nuclear", "Nuclear", "Hydro", "Hydro"), start.date = c(NA,
NA, NA, NA, NA, NA, 1985, 1985, 2016, 1973), owner.of.plant = c("Macquarie Generation",
"Macquarie Generation", "GEAC Great Energy Alliance Corporation",
"AzerEnerji", "Aluminum Bahrain", "", "", "", "", ""), generation_gwh_2021 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_), estimated_generation_gwh_2021 = c(NA,
NA, NA, NA, NA, NA, NA, NA, 17396.84, 6318.07), generated = c(NA,
NA, NA, NA, NA, NA, NA, NA, 17396.84, 6318.07)), row.names = c(356L,
565L, 573L, 927L, 942L, 1017L, 1044L, 1083L, 1386L, 2164L), class = "data.frame")
I'd pivot your data to long format and use ggplot2:
library(tidyr)
library(ggplot2)
PF2_long = PF2 %>%
pivot_longer(cols = c(generated, capacity.in.MW), names_to = "measure")
ggplot(PF2_long, aes(x = primary_fuel, y = value, fill = measure)) +
geom_col(position = "dodge") +
scale_fill_manual(values = c("blue", "grey60")) +
labs(
x = "Primary fuel",
y = "MW",
fill = ""
) +
theme_bw()
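The pivot above assumes PF2 already holds one row per fuel type. A minimal sketch of that grouping step follows, using a small made-up PF with only the three relevant columns; summing is an assumption about how the plants should be aggregated, and a mean or a count may suit the question better.

```r
library(dplyr)
library(tidyr)

# Toy version of PF (hypothetical subset of the real columns)
PF <- data.frame(
  primary_fuel   = c("Coal", "Coal", "Hydro", "Hydro", "Gas"),
  capacity.in.MW = c(2640, 2200, 3327, 3444, 2204),
  generated      = c(NA, NA, 17396.84, 6318.07, NA)
)

# One row per fuel: total capacity and total generation (NAs ignored)
PF2 <- PF %>%
  group_by(primary_fuel) %>%
  summarise(
    capacity.in.MW = sum(capacity.in.MW),
    generated      = sum(generated, na.rm = TRUE),
    .groups = "drop"
  )

# Long format: one row per fuel and measure, ready for geom_col()
PF2_long <- PF2 %>%
  pivot_longer(cols = c(generated, capacity.in.MW),
               names_to = "measure", values_to = "value")
```

Note that sum(..., na.rm = TRUE) returns 0 for a fuel whose generation values are all NA, so such fuels show up as zero-height bars rather than being dropped.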
I've been struggling with this script for the past month and still haven't found an answer. I know round_any comes from the plyr package, but I don't even load plyr. I checked all my other packages using ls("package: ") and none of them has this function. Nothing I've found online has pointed me in the right direction. In browser() I can see that the column's type is double [4] (S3: integer64). Am I better off converting the column away from integer64, or finding out how to remove round_any?
Error in `mutate()`:
! Problem while computing `..2 = across(...)`.
Caused by error in `across()`:
! Problem while computing column `Property Count`.
Caused by error in `UseMethod()`:
! no applicable method for 'round_any' applied to an object of class "integer64"
Edit:
This .r file contains all the functions and I have another .r file that calls them.
Argument/Call
market_stats_table = key_market_stats(historical_stats, historical_stats_by_class, report_quarter)
Function
total_stats = stats_combined %>%
rename(`Net Absorption` = `Net Absorption QTD - Total`,
`Net Absorption YTD` = `Net Absorption YTD - Total`,
`Construction Deliveries` = `Construction Deliveries QTD`) %>%
filter(Submarket %in% submarket_order) %>%
mutate(Submarket = factor(Submarket, levels = submarket_order, ordered = T)) %>%
arrange(Submarket) %>%
# glimpse()
mutate(across(c(`Direct Vacancy Rate`, `Overall Vacancy Rate`, `Overall Availability Rate`), scales::percent, accuracy = .1),
across(any_of(sum_vars),
scales::dollar, accuracy = 1, style_negative="parens", prefix=""),
across(any_of(c("Full Service Gross Asking Rate", "Lease Rate")),
scales::dollar)) %>%
select(Submarket, all_of(stat_order))
market_stats_table(total_stats,
cell_width = if_else(property_type == "Office", 1.05, .9),
cell_height = if_else(property_type == "Office", .33, .27),
submarket_order,
totals)
}
structure
structure(list(Market = c("Los Angeles", "0.4", NA, "0.5", "0.3",
"New York"), `Property Count` = c("New York", "0.3", NA, "0.2",
"0.9", "New York"), C = c("Chicago", "0.1", NA, "0.4", "0.3",
"DC"), D = c("DC", "0.7", NA, "0", "0.2", "DC"), e = c("Miami",
"0.8", NA, "0.2", "0.1", "Los Angeles")), row.names = c(NA, 6L
), class = "data.frame")
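The round_any in the error message appears to come from the scales formatters themselves (they round to the requested accuracy internally), not from plyr, so removing it is probably not an option. One possible workaround is to convert any integer64 columns to ordinary doubles before formatting. The sketch below assumes the column comes from the bit64 package and that the counts are small enough to be represented exactly as doubles; the stats table and its values are made up for illustration.

```r
library(dplyr)

# Toy table with an integer64 column (hypothetical values)
stats <- tibble::tibble(
  Submarket        = c("A", "B"),
  `Property Count` = bit64::as.integer64(c(12, 1500))
)

# Step 1: turn every integer64 column into a plain double
stats_num <- stats %>%
  mutate(across(where(bit64::is.integer64), as.numeric))

# Step 2: the scales formatter now receives an ordinary numeric vector
stats_fmt <- stats_num %>%
  mutate(across(`Property Count`,
                ~ scales::dollar(.x, accuracy = 1, prefix = "")))
```

Doing the conversion right after the data is read in would keep the rest of the pipeline unchanged.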
I want to apply the following code to only the first 3 rows (if it's applied to the last 3, it fails to parse):
netflix_and_disney$release_year <- year(dmy(netflix_and_disney$release_year))
Is there a way to do this with this df?
structure(list(show_id = c("00147800", "07019028", "00115433", "70234439", "80058654", "80125979"), title = c("10 Things I Hate About You", "101 Dalmatian Street", "101 Dalmatians", "Transformers Prime", "Transformers: Robots in Disguise", "#realityhigh"), type = c("Movie", "Tv Show", "Movie", "Tv Show", "Tv Show", "Movie"), rating = c("PG-13", "N/A", "G", "TV-Y7-FV", "TV-Y7", "TV-14"), release_year = c("31 Mar 1999", "25 Mar 2019", "27 Nov 1996", "2013", "2016", "2017"), date_added = structure(c(18212, 18320, 18212, 17782, 17782, 17417), class = "Date"), duration = c("97 min", "N/A", "103 min", "1 Season", "1 Season", "99 min"), genre = c("Comedy, Drama, Romance", "Animation, Comedy, Family", "Adventure, Comedy, Crime, Family", "Kids' TV", "Kids' TV", "Comedies"), director = c("Gil Junger", "N/A", "Stephen Herek", NA, NA, "Fernando Lebrija"), country = c("USA", "UK, USA, Canada", "USA, UK", "United States", "United States", "United States"), imdb_rating = c("7.3", "6.2", "5.7", NA, NA, NA), platform = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Disney", "Netflix"), class = "factor")), row.names = c(1L, 2L, 3L, 995L, 996L, 997L), class = "data.frame")
I have tried applying it to a subset of the df, and also using the which() function, but neither worked.
It really all depends on your data and what function you want to apply. But, in principle, you can do this by subsetting your dataframe:
Data:
set.seed(123)
df <- data.frame(
v1 = rnorm(20),
v2 = runif(20),
v3 = sample(20)
)
Here we apply the function mean to the first ten rows of df:
apply(df[1:10,], 1, mean)
1 2 3 4 5 6 7 8 9 10
4.5274415 0.7281229 2.9908109 1.8131179 6.7605775 6.6179570 4.2313168 3.4003004 3.1930399 5.8040552
netflix_and_disney$release_year[1:3] <- year(dmy(netflix_and_disney$release_year[1:3]))
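If the rows holding full dates are not always the first three, the same idea can be driven by a logical index instead of fixed row numbers. This is only a sketch: it assumes every value is either a bare four-digit year or a day-month-year string that dmy() can parse, and it uses a cut-down copy of the data with just the release_year column.

```r
library(lubridate)

# Cut-down copy of the data: only the column being repaired
netflix_and_disney <- data.frame(
  release_year = c("31 Mar 1999", "25 Mar 2019", "27 Nov 1996",
                   "2013", "2016", "2017"),
  stringsAsFactors = FALSE
)

# TRUE for rows that hold a full date rather than a bare year
is_full_date <- !grepl("^[0-9]{4}$", netflix_and_disney$release_year)

# Parse only those rows; the bare years are left untouched
netflix_and_disney$release_year[is_full_date] <-
  year(dmy(netflix_and_disney$release_year[is_full_date]))

# The column is still character at this point, so convert it once at the end
netflix_and_disney$release_year <- as.integer(netflix_and_disney$release_year)
```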
I have an IMDb dataset in which I would like to replace the missing values for Budget and Box_Office_Gross; I think multiple imputation would be a suitable way to do that.
In order to separate the numeric columns from the rest of the dataset and perform the imputation, I tried to subset the variables:
> NBCU_Limited <- subset(NBCU_dataLaurel_Modified, select = c(NBCU_dataLaurel_Modified$imdb_votes, NBCU_dataLaurel_Modified$runtime_min, NBCU_dataLaurel_Modified$Budget, NBCU_dataLaurel_Modified$Box_Office_Gross))
Error: NA column indexes not supported
But I get an error because there are NA values in the variables. I cannot exclude the character columns by negation either, because they also contain NAs and I get the same error.
How do I get only these four variables out into a new data frame so that I can perform multiple imputation on them?
Sample Dataset
Update: The error occurs because I prefix each variable with the data frame name inside subset(). If I give only the variable names, I do not get the error. I am not sure why, so my code may simply be wrong.
Below is the data,
> dput(Sample)
structure(list(imdbid = c("tt6256056", "tt0085450", "tt5050772",
"tt5069876", "tt0083791", "tt0083929"), title = c("Una Famiglia",
"Doctor Detroit", "Honeytrap", "Maniac 8.2.8", "The Dark Crystal",
"Fast Times at Ridgemont High"), plot = c("N/A", "A timid college professor, conned into posing as a flamboyant pimp, finds himself enjoying his new occupation on the streets.",
"Simeon's evening goes horribly wrong when a young woman tries to pick him up.",
"Maniac: a person afflicted with mania. Mania: A manifestation of bipolar disorder, characterized by profuse and rapidly changing ideas, exaggerated sexuality, gaiety, or irritability, decreased sleep and violent abnormal behavior.",
"On another planet in the distant past, a Gelfling embarks on a quest to find the missing shard of a magical crystal, and so restore order to his world.",
"A group of Southern California high school students are enjoying their most important subjects: sex, drugs and rock n' roll."
), rating = c("N/A", "R", "N/A", "N/A", "PG", "R"), imdb_rating = c(NA,
5.1, NA, NA, 7.2, 7.2), metacritic = c(NA, NA, NA, NA, NA, 67
), dvd_release = structure(c(NA, 1126569600, NA, NA, 939081600,
1099353600), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
production = c("N/A", "Universal", "Array Releasing", "N/A",
"Sony Pictures Home Entertainment", "Universal Pictures"),
actors = c("Patrick Bruel, Fortunato Cerlino, Matilda De Angelis, Ennio Fantastichini",
"Dan Aykroyd, Howard Hesseman, Donna Dixon, Lydia Lei", "Jennifer Nelson, Daemian Greaves, Polina Vasileva, Becki Lloyd",
"Dimitra Aggelou, Giorgos Efthimiou, Stavroula Kontopoulou, Maria-Antouanetta Tatsi",
"Jim Henson, Kathryn Mullen, Frank Oz, Dave Goelz", "Sean Penn, Jennifer Jason Leigh, Judge Reinhold, Robert Romanus"
), imdb_votes = c(NA, 4492, NA, NA, 44862, 76980), poster = c("N/A",
"https://images-na.ssl-images-amazon.com/images/M/MV5BMjhjY2Q4NWEtYTUzZC00YjE2LTk0ZjktNzUyZjIwNmQ0YTkyXkEyXkFqcGdeQXVyMTQxNzMzNDI#._V1_SX300.jpg",
"N/A", "https://images-na.ssl-images-amazon.com/images/M/MV5BZjdmZTRhYzgtOGY4MS00OGM5LWJlNmItYzJiYjZiNmVmYjhkXkEyXkFqcGdeQXVyNDA2NjM2ODk#._V1_SX300.jpg",
"https://images-na.ssl-images-amazon.com/images/M/MV5BMWZlZjk1MGEtYWMzOC00N2EyLWFkOTUtZDM4NGNlY2M0YjVmXkEyXkFqcGdeQXVyNTAyODkwOQ##._V1_SX300.jpg",
"https://images-na.ssl-images-amazon.com/images/M/MV5BYzBlZjE1MDctYjZmZC00ZTJmLWFkOWEtYjdmZDZkODBkZmI2XkEyXkFqcGdeQXVyNjQ2MjQ5NzM#._V1_SX300.jpg"
), director = c("Sebastiano Riso", "Michael Pressman", "Nick Archer",
"Giorgos Efthimiou", "Jim Henson, Frank Oz", "Amy Heckerling"
), release_date = structure(c(1493596800, 421027200, 1448928000,
1431734400, 408931200, 398044800), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Year = c(2017, 1983, 2015, 2015, 1982,
1982), Year_Groups = c("2010-2020", "1980-1989", "2010-2020",
"2010-2020", "1980-1989", "1980-1989"), Month = c("May",
"May", "December", "May", "December", "August"), runtime_min = c(97,
89, NA, 15, 93, 90), genre = c("Drama", "Comedy", "Short, Thriller",
"Short, Horror", "Adventure, Family, Fantasy", "Comedy, Drama"
), awards = c("N/A", "N/A", "N/A", "1 win.", "Nominated for 1 BAFTA Film Award. Another 2 wins & 4 nominations.",
"1 win & 1 nomination."), keywords = c(NA, "pimp|college-professor|voyeurism|voyeur|blue-panties|panties|red-dress|blonde|female-frontal-nudity|female-nudity|nude-girl|nude|bare-breasts|breasts|topless-female-nudity|scantily-clad-female|cleavage|two-word-title|reference-to-joe-frazier|reference-to-yul-brynner|mother-son-relationship|f-word|place-name-in-title|city-name-in-title|dual-identity|prostitution|independent-film|title-spoken-by-character|character-name-in-title",
NA, NA, "mystic|magical-crystal|crystal-shard|sword-and-sorcery|puppetry|crystal|shard|quest|evil|monster|feeding-on-energy|hidden-entrance|giant-crystal|actor-voicing-multiple-characters|planetary-alignment|reunification|three-word-title|dark-fantasy|slow-motion-scene|vampire|surrealism|christ-allegory|cult-film|sorceress|relic|race-against-time|muppet|mission|magic|kingdom|creature|good-versus-evil|directed-by-star|epic|multiple-monsters|invented-language|slavery|orrery|puppet|mutation|darkness|destiny",
"high-school|title-directed-by-female|females-talking-about-sex|unwanted-pregnancy|fired-from-the-job|teacher-student-relationship|irreverence|sexual-awakening|innocence-lost|ensemble-film|coming-of-age|teen-movie|high-school-teacher|advice|ticket-scalping|shopping-mall|loss-of-virginity|female-nudity|brother-sister-relationship|caught-masturbating|california|surfer|teacher|break-up|rock-'n'-roll|virgin|teenager|friendship|drugs|date|surfer-dude|blond-boy|redheaded-boy|generation-x|f-rated|vomiting|sex-scene|cult-film|breasts|jeans|hawaiian-shirt|payphone|teenage-girl|teen-sex-comedy|scantily-clad-female|reference-to-led-zeppelin|dream-girl|underage-girl|jailbait|trophy-wife|voyeur|sexual-promiscuity|sexual-desire|sexual-attraction|lust|sex-on-couch|female-rear-nudity|female-frontal-nudity|panties|cheerleader-uniform|female-removes-her-clothes|cleavage|marijuana|drug-use|teen-angst|surfing|school-life|pregnancy|masturbation|football-player|first-love|employment|bikini|stoner|rock-m... <truncated>
), Budget = c(NA, 10375893, NA, NA, 1.5e+07, 4500000), Box_Office_Gross = c(2.48,
70, 70, 124, 140, 140)), .Names = c("imdbid", "title", "plot",
"rating", "imdb_rating", "metacritic", "dvd_release", "production",
"actors", "imdb_votes", "poster", "director", "release_date",
"Year", "Year_Groups", "Month", "runtime_min", "genre", "awards",
"keywords", "Budget", "Box_Office_Gross"), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
The error occurs because I prefix each variable with the data frame name inside subset(). If I give only the variable names, I do not get the error. I am not sure why, so my code may simply be wrong. Thanks @Tung for pointing this out.
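A likely explanation is that the select argument of subset() expects column names or positions. Writing NBCU_dataLaurel_Modified$imdb_votes passes the column's values instead, NAs included, and those values are then treated as column indexes. A minimal sketch with bare names follows, using a cut-down, made-up copy of the sample data.

```r
# Cut-down copy of the sample data (a few hypothetical rows)
Sample <- data.frame(
  title            = c("Una Famiglia", "Doctor Detroit", "The Dark Crystal"),
  imdb_votes       = c(NA, 4492, 44862),
  runtime_min      = c(97, 89, 93),
  Budget           = c(NA, 10375893, 1.5e7),
  Box_Office_Gross = c(2.48, 70, 140),
  stringsAsFactors = FALSE
)

# Bare column names inside select: no data frame prefix, no quotes
NBCU_Limited <- subset(Sample,
                       select = c(imdb_votes, runtime_min, Budget, Box_Office_Gross))

# The same selection with ordinary indexing by name
NBCU_Limited2 <- Sample[, c("imdb_votes", "runtime_min", "Budget", "Box_Office_Gross")]
```

Either form keeps the NA values in place, which is what an imputation step needs as input.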
With the following data (an already melted data frame):
df1<-structure(list(Speciality = structure(27:32, .Label = c("Addiction Medicine",
"Anesthesiology", "Cardiac Electrophysiology", "Cardiology",
"Dermatology", "Emergency Medicine", "Family Medicine", "Gastroenterology",
"General Surgery", "Hematology & Oncology", "Hospitalist", "Internal Medicine",
"Nephrology", "Neurological Surgery", "Neurology", "Obstetrics & Gynecology",
"Otolaryngology", "Pain Medicine", "Pathology", "Pediatric Critical Care Medicine",
"Pediatric Hematology-Oncology", "Pediatric Pulmonology", "Pediatric Radiology",
"Pediatric Surgery", "Pediatrics", "Psychiatry", "Pulmonology",
"Radiation Oncology", "Radiology", "Surgical Oncology", "Urology",
"Vascular Surgery"), class = "factor"), PhysAge = structure(c(5L,
5L, 1L, 3L, 5L, 5L), .Label = c("25-34", "35-44", "45-54", "55-64",
"65+"), class = "factor"), value = c(0.0035, 0.0058, 0.0089, 0, 0.00512820512820513,
0.00512820512820513)), .Names = c("Speciality", "PhysAge", "value"
), row.names = 155:160, class = "data.frame")
How can I reorder the bars in ggplot based on the sum of values for each Speciality in a stacked bar chart? I've found some options for cases where the values are spread over multiple columns, but here there is a single value column.
Currently plotting by:
ggplot(df1,aes(x=Speciality,y=value,fill=PhysAge))+
geom_bar(stat="identity")
You could try
set.seed(1)
df <- rbind(
df1,
transform(df1, PhysAge="1", value=runif(6, 0, 0.01)),
transform(df1, PhysAge="10", value=runif(6, 0, 0.01))
)
ggplot(df,aes(x=reorder(Speciality, value, sum), y=value,fill=PhysAge))+
geom_bar(stat="identity")
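To see what reorder() does to the factor before it reaches ggplot, it can help to inspect the levels directly. The sketch below uses a small made-up data frame in place of the real one; negating the values is one common way to get a descending order.

```r
# Toy data: several rows per Speciality (hypothetical values)
df <- data.frame(
  Speciality = factor(c("Radiology", "Radiology", "Urology", "Urology", "Pulmonology")),
  PhysAge    = c("25-34", "65+", "25-34", "65+", "65+"),
  value      = c(0.001, 0.002, 0.004, 0.005, 0.0005)
)

# Levels sorted by the per-Speciality sum of value, smallest first
asc <- reorder(df$Speciality, df$value, FUN = sum)
levels(asc)

# Negating the values reverses the order, largest total first
desc <- reorder(df$Speciality, -df$value, FUN = sum)
levels(desc)
```

Inside aes(), x = reorder(Speciality, -value, sum) would then put the tallest stack first.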