I have 50+ manuscript titles in an R Markdown document, copied directly from a Word document. Is there a function or package that can sort these titles alphabetically so I can list them back in R Markdown?
Hospital admission and mortality rates for non-covid diseases in Denmark during covid-19 pandemic: nationwide population based cohort study
Covid-19 deaths in Africa: prospective systematic postmortem surveillance study
Food anaphylaxis in the United Kingdom: analysis of national data, 1998-2018
Association of first trimester prescription opioid use with congenital malformations in the offspring: population based cohort study
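In the following example, you should be able to copy your text into the text argument of read.table. sep = "\n" says that the entries are separated by line breaks, and empty lines are skipped (blank.lines.skip = TRUE). quote = "" and comment.char = "" stop apostrophes and # characters inside a title from being treated as quoting or comments.
df <- read.table(sep = "\n", blank.lines.skip = TRUE, stringsAsFactors = FALSE,
quote = "", comment.char = "",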
text = "Hospital admission and mortality rates for non-covid diseases in Denmark during covid-19 pandemic: nationwide population based cohort study
Covid-19 deaths in Africa: prospective systematic postmortem surveillance study
Food anaphylaxis in the United Kingdom: analysis of national data, 1998-2018
Association of first trimester prescription opioid use with congenital malformations in the offspring: population based cohort study"
)
df <- sort(df$V1)
df
# [1] "Association of first trimester prescription opioid use with congenital malformations in the offspring: population based cohort study"
# [2] "Covid-19 deaths in Africa: prospective systematic postmortem surveillance study"
# [3] "Food anaphylaxis in the United Kingdom: analysis of national data, 1998-2018"
# [4] "Hospital admission and mortality rates for non-covid diseases in Denmark during covid-19 pandemic: nationwide population based cohort study"
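If you also want case-insensitive ordering and output that is ready to paste into R Markdown, a base R alternative is to split the pasted text on line breaks yourself. This is a minimal sketch rather than part of the answer above: titles_raw and sorted_titles are illustrative names, and only two of the example titles are used.

```r
# Paste the titles between the quotes, one per line (names here are illustrative).
titles_raw <- "Food anaphylaxis in the United Kingdom: analysis of national data, 1998-2018

Covid-19 deaths in Africa: prospective systematic postmortem surveillance study"

# Split on line breaks, trim whitespace, and drop empty lines.
titles <- trimws(strsplit(titles_raw, "\n", fixed = TRUE)[[1]])
titles <- titles[nzchar(titles)]

# Order case-insensitively so lower-case titles are not pushed to one end.
sorted_titles <- titles[order(tolower(titles))]

# Print as a Markdown bullet list; use results = "asis" in the chunk options.
cat(paste0("- ", sorted_titles), sep = "\n")
```

Because the split is done with strsplit rather than read.table, quotes and # characters in a title need no special handling.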
I want to evaluate combined surgical and antibiotic treatment for diabetic foot ulcers. 30 patients with diabetic foot ulcers were enrolled, and the dates of the first and last visits were recorded (treatment duration in weeks was calculated from them). I treated this as a single-arm study because there was no control group. I recorded CRP before and after treatment; patients with an absolute difference in CRP of less than 10 were classed as healed, and all others as not healed. How can I start evaluating my treatment in R? I am looking for advice on the statistical approach and methodology.
Thanks in advance.
My data
crp_before = c(96.1,90.4,114.4,88.3,76.1,191.2,69.8,122.3,188.6,77.3,126.8,189.3,165.2,116.8,72.3,120.9,122.3,115.2,90,142.3,87.2,195.5,184.3,110.2,113.6,147.4,96.8,116.4,55.3,209)
crp_after = c(5.3,7,6.2,3.5,4.2,9.6,5.2,5.3,9.6,8,7.6,11,10.3,4.6,3.2,8.6,7.5,8.4,6.3,7.6,6.8,112,6.3,8.5,9.2,5.3,4.1,7.6,3,100)
time_week = c(9,8,12,8,4,24,4,8,24,4,12,24,20,12,5,12,13,12,8,16,8,24,24,8,8,16,8,12,3,4)
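One possible starting point, sketched under assumptions: with a single arm and before/after measurements on the same patients, a paired comparison of CRP is a common first step. The choice of a Wilcoxon signed-rank test here is an assumption, not something stated in the question, and whether it suits the study depends on the design. The names crp_change and paired_test are illustrative.

```r
crp_before <- c(96.1,90.4,114.4,88.3,76.1,191.2,69.8,122.3,188.6,77.3,126.8,189.3,165.2,116.8,72.3,120.9,122.3,115.2,90,142.3,87.2,195.5,184.3,110.2,113.6,147.4,96.8,116.4,55.3,209)
crp_after <- c(5.3,7,6.2,3.5,4.2,9.6,5.2,5.3,9.6,8,7.6,11,10.3,4.6,3.2,8.6,7.5,8.4,6.3,7.6,6.8,112,6.3,8.5,9.2,5.3,4.1,7.6,3,100)

# Per-patient reduction in CRP (positive values mean CRP fell).
crp_change <- crp_before - crp_after
print(summary(crp_change))

# Paired, non-parametric comparison of before vs after (an assumed choice of test).
# exact = FALSE uses the normal approximation, which avoids warnings about ties.
paired_test <- wilcox.test(crp_before, crp_after, paired = TRUE, exact = FALSE)
print(paired_test$p.value)
```

From here, time_week could be examined against crp_change (for example with a scatter plot) to see whether treatment duration relates to the size of the reduction.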
I am unable to load the Groceries data set in R.
Can anyone help?
> data()
Data sets in package ‘datasets’:
AirPassengers Monthly Airline Passenger Numbers 1949-1960
BJsales Sales Data with Leading Indicator
BJsales.lead (BJsales) Sales Data with Leading Indicator
BOD Biochemical Oxygen Demand
CO2 Carbon Dioxide Uptake in Grass Plants
ChickWeight Weight versus age of chicks on different diets
DNase Elisa assay of DNase
EuStockMarkets Daily Closing Prices of Major European Stock Indices,
1991-1998
Formaldehyde Determination of Formaldehyde
HairEyeColor Hair and Eye Color of Statistics Students
Harman23.cor Harman Example 2.3
Harman74.cor Harman Example 7.4
Indometh Pharmacokinetics of Indomethacin
InsectSprays Effectiveness of Insect Sprays
JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share
LakeHuron Level of Lake Huron 1875-1972
LifeCycleSavings Intercountry Life-Cycle Savings Data
Loblolly Growth of Loblolly pine trees
Nile Flow of the River Nile
Orange Growth of Orange Trees
OrchardSprays Potency of Orchard Sprays
PlantGrowth Results from an Experiment on Plant Growth
Puromycin Reaction Velocity of an Enzymatic Reaction
Seatbelts Road Casualties in Great Britain 1969-84
Theoph Pharmacokinetics of Theophylline
Titanic Survival of passengers on the Titanic
ToothGrowth The Effect of Vitamin C on Tooth Growth in Guinea Pigs
UCBAdmissions Student Admissions at UC Berkeley
UKDriverDeaths Road Casualties in Great Britain 1969-84
UKgas UK Quarterly Gas Consumption
USAccDeaths Accidental Deaths in the US 1973-1978
USArrests Violent Crime Rates by US State
USJudgeRatings Lawyers' Ratings of State Judges in the US Superior Court
USPersonalExpenditure Personal Expenditure Data
UScitiesD Distances Between European Cities and Between US Cities
VADeaths Death Rates in Virginia (1940)
WWWusage Internet Usage per Minute
WorldPhones The World's Telephones
ability.cov Ability and Intelligence Tests
airmiles Passenger Miles on Commercial US Airlines, 1937-1960
airquality New York Air Quality Measurements
anscombe Anscombe's Quartet of 'Identical' Simple Linear
Regressions
attenu The Joyner-Boore Attenuation Data
attitude The Chatterjee-Price Attitude Data
austres Quarterly Time Series of the Number of Australian
Residents
beaver1 (beavers) Body Temperature Series of Two Beavers
beaver2 (beavers) Body Temperature Series of Two Beavers
cars Speed and Stopping Distances of Cars
chickwts Chicken Weights by Feed Type
co2 Mauna Loa Atmospheric CO2 Concentration
crimtab Student's 3000 Criminals Data
discoveries Yearly Numbers of Important Discoveries
esoph Smoking, Alcohol and (O)esophageal Cancer
euro Conversion Rates of Euro Currencies
euro.cross (euro) Conversion Rates of Euro Currencies
eurodist Distances Between European Cities and Between US Cities
faithful Old Faithful Geyser Data
fdeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
freeny Freeny's Revenue Data
freeny.x (freeny) Freeny's Revenue Data
freeny.y (freeny) Freeny's Revenue Data
infert Infertility after Spontaneous and Induced Abortion
iris Edgar Anderson's Iris Data
iris3 Edgar Anderson's Iris Data
islands Areas of the World's Major Landmasses
ldeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
lh Luteinizing Hormone in Blood Samples
longley Longley's Economic Regression Data
lynx Annual Canadian Lynx trappings 1821-1934
mdeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
morley Michelson Speed of Light Data
mtcars Motor Trend Car Road Tests
nhtemp Average Yearly Temperatures in New Haven
nottem Average Monthly Temperatures at Nottingham, 1920-1939
npk Classical N, P, K Factorial Experiment
occupationalStatus Occupational Status of Fathers and their Sons
precip Annual Precipitation in US Cities
presidents Quarterly Approval Ratings of US Presidents
pressure Vapor Pressure of Mercury as a Function of Temperature
quakes Locations of Earthquakes off Fiji
randu Random Numbers from Congruential Generator RANDU
rivers Lengths of Major North American Rivers
rock Measurements on Petroleum Rock Samples
sleep Student's Sleep Data
stack.loss (stackloss) Brownlee's Stack Loss Plant Data
stack.x (stackloss) Brownlee's Stack Loss Plant Data
stackloss Brownlee's Stack Loss Plant Data
state.abb (state) US State Facts and Figures
state.area (state) US State Facts and Figures
state.center (state) US State Facts and Figures
state.division (state) US State Facts and Figures
state.name (state) US State Facts and Figures
state.region (state) US State Facts and Figures
state.x77 (state) US State Facts and Figures
sunspot.month Monthly Sunspot Data, from 1749 to "Present"
sunspot.year Yearly Sunspot Data, 1700-1988
sunspots Monthly Sunspot Numbers, 1749-1983
swiss Swiss Fertility and Socioeconomic Indicators (1888) Data
treering Yearly Treering Data, -6000-1979
trees Diameter, Height and Volume for Black Cherry Trees
uspop Populations Recorded by the US Census
volcano Topographic Information on Auckland's Maunga Whau Volcano
warpbreaks The Number of Breaks in Yarn during Weaving
women Average Heights and Weights for American Women
Use ‘data(package = .packages(all.available = TRUE))’
to list the data sets in all *available* packages.
> head(Groceries)
Error in head(Groceries) : object 'Groceries' not found
> groceries <- data(Groceries)
Warning message:
In data(Groceries) : data set ‘Groceries’ not found
> library(datasets)
> groceries <- data(Groceries)
Warning message:
In data(Groceries) : data set ‘Groceries’ not found
>
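Groceries is not part of the datasets package; it ships with the arules package, so install and attach arules first. Note also that data() returns only the name of the data set, not the data, so there is no need to assign its result.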
install.packages("arules")
library(arules)
data(Groceries)
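To illustrate why groceries <- data(Groceries) would not give you the data even once arules is installed, here is a small sketch. It uses the built-in mtcars data set as a stand-in so that it runs without arules; the Groceries part is guarded and only runs if arules is available. loaded_name is an illustrative name.

```r
# data() loads the object as a side effect and returns only its name.
loaded_name <- data(mtcars)   # mtcars is a built-in data set, used as a stand-in
print(loaded_name)            # the character string "mtcars", not the data
print(class(mtcars))          # the data itself is available under its own name

# Same pattern for Groceries, guarded so the sketch runs without arules installed.
if (requireNamespace("arules", quietly = TRUE)) {
  data("Groceries", package = "arules")
  print(class(Groceries))
}
```

Passing package = "arules" to data() also works without attaching the package with library() first.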
I need to extract the titles from a bibliography list. The titles are all within quotation marks.
So is there a way to ask R to extract all text that is within quotation marks?
I have read the list into R as a text file:
data <- readLines("Publications _ CCDM.txt")
Here are a few lines from the list:
Andronis, C.E., Hane, J., Bringans, S., Hardy, G., Jacques, S., Lipscombe, R., Tan, K-C. (2020). “Gene validation and remodelling using proteogenomics of Phytophthora cinnamomi, the causal agent of Dieback.” bioRxiv. DOI: https://doi.org/10.1101/2020.10.25.354530
Beccari, G., Prodi, A., Senatore, M.T., Balmas, V,. Tini, F., Onofri, A., Pedini, L., Sulyok, M,. Brocca, L., Covarelli, L. (2020). “Cultivation Area Affects the Presence of Fungal Communities and Secondary Metabolites in Italian Durum Wheat Grains.” Toxins https://www.mdpi.com/2072-6651/12/2/97
Corsi, B., Percvial-Alwyn, L., Downie, R.C., Venturini, L., Iagallo, E.M., Campos Mantello, C., McCormick-Barnes, C., See, P.T., Oliver, R.P., Moffat, C.S., Cockram, J. “Genetic analysis of wheat sensitivity to the ToxB fungal effector from Pyrenophora tritici-repentis, the causal agent of tan spot” Theoretical and Applied Genetics. https://doi.org/10.1007/s00122-019-03517-8
Derbyshire, M.C., (2020) Bioinformatic Detection of Positive Selection Pressure in Plant Pathogens: The Neutral Theory of Molecular Sequence Evolution in Action. (2020) Frontiers in Microbiology. https://doi.org/10.3389/fmicb.2020.00644
Dodhia, K.N., Cox, B.A., Oliver, R.P., Lopez-Ruiz, F.J. (2020). “When time really is money: in situ quantification of the strobilurin resistance mutation G143A in the wheat pathogen Blumeria graminis f. sp. tritici.” bioRxiv, doi: https://doi.org/10.1101/2020.08.20.258921
Graham-Taylor, C., Kamphuis, L.G., Derbyshire, M.C. (2020). “A detailed in silico analysis of secondary metabolite biosynthesis clusters in the genome of the broad host range plant pathogenic fungus Sclerotinia sclerotiorum.” BMC Genomics https://doi.org/10.1186/s12864-019-6424-4
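Try something like this, where x is the character vector returned by readLines. unlist() collects the matches from every line; .[[1]] would keep only the matches from the first element.
library(stringr)
str_extract_all(x, "“.*?”") %>% unlist()
If you want to remove the quotation marks from the result, add this at the end of the pipeline: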
str_remove_all("[“”]")
Output:
[1] "Gene validation and remodelling using proteogenomics of Phytophthora cinnamomi, the causal agent of Dieback."
[2] "Cultivation Area Affects the Presence of Fungal Communities and Secondary Metabolites in Italian Durum Wheat Grains."
[3] "Genetic analysis of wheat sensitivity to the ToxB fungal effector from Pyrenophora tritici-repentis, the causal agent of tan spot"
[4] "When time really is money: in situ quantification of the strobilurin resistance mutation G143A in the wheat pathogen Blumeria graminis f. sp. tritici."
[5] "A detailed in silico analysis of secondary metabolite biosynthesis clusters in the genome of the broad host range plant pathogenic fungus Sclerotinia sclerotiorum."
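For comparison, the same extraction can be sketched in base R without stringr. The two entries in refs are shortened stand-ins for lines of the bibliography (not the full references), and the variable names are illustrative. The curly quotes are written as \u201c and \u201d escapes so the pattern does not depend on the file encoding of the script.

```r
# Shortened stand-in entries: one with a quoted title, one without.
refs <- c(
  "Dodhia, K.N. (2020). \u201cWhen time really is money.\u201d bioRxiv",
  "Derbyshire, M.C. (2020) Bioinformatic Detection of Positive Selection Pressure. Frontiers in Microbiology."
)

# Match an opening curly quote, anything that is not a closing quote, then the closing quote.
pattern <- "\u201c[^\u201d]*\u201d"
quoted <- unlist(regmatches(refs, gregexpr(pattern, refs)))

# Strip the curly quotes from the matches.
titles <- gsub("[\u201c\u201d]", "", quoted)
print(titles)
```

Entries whose title is not wrapped in quotation marks, like the second one here, produce no match and are silently dropped, which is why the output above has five titles for six references.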
I want to find the median, mean and range of GDP for a specific Region in my dataset.
For the summary across all Regions (Africa, Asia, Europe, etc.) I used:
summary(data$GDP, na.rm=TRUE)
This displays summary statistics of GDP for all regions combined. However, I want summary statistics for one region only, e.g. the mean, median and quartiles for Africa, or for Europe. Those are the region names as they appear in the data.
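A minimal sketch of one way to do this, assuming the data frame has a Region column alongside GDP. The small gdp_data frame below is made-up illustration data, not the asker's dataset, and the variable names are hypothetical.

```r
# Made-up example data with the assumed column names Region and GDP.
gdp_data <- data.frame(
  Region = c("Africa", "Africa", "Africa", "Europe", "Europe", "Asia"),
  GDP    = c(10, 20, NA, 40, 60, 30)
)

# Summary statistics for a single region: subset first, then summarise.
africa_gdp <- gdp_data$GDP[gdp_data$Region == "Africa"]
print(summary(africa_gdp))            # min, quartiles, median, mean, NA count
print(range(africa_gdp, na.rm = TRUE))

# The same statistic for every region at once.
region_means <- tapply(gdp_data$GDP, gdp_data$Region, mean, na.rm = TRUE)
print(region_means)
```

summary() reports missing values separately, so na.rm is only needed for functions such as mean(), median() and range().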
Hi, I am trying to extract a single sentence from a paragraph in R.
"[report_beginning]
101962493|2011-06-09|final|Omary, Lea, M.D.|43654754|Major Academic Center
_Ms.Wattley is a 88 year-old patient who comes in today with a chief complaint of PREG/SPOTTING.
ALLERGIES: none
SOCIAL HISTORY: The patient Ms.Wattley is a past smoker who has a visiting nurse. Patient is bed-bound.
PHYSICAL EXAMINATION: Blood pressure 125/98, pulse 55, respiratory rate 7, temperature 98.7, and O2 saturation 98 on room air. General: This is a patient in severe distress.
EMERGENCY DEPARTMENT COURSE: I confirm that I have seen and evaluated the patient, reviewed the resident's documentation on the patient's chart. The following procedures were performed: Medication:medication given. Procedure:no procedures performed. Testing:testing conducted . Please review the chart for more details.
DISPOSITION: The patient was admitted to the hospital with a primary diagnosis of Threatened abortion, antepartum condition or complication."
All of this is one cell. I have a column full of data like this, and from each cell I want to extract a single line: "PHYSICAL EXAMINATION: Blood pressure 125/98, pulse 55, respiratory rate 7, temperature 98.7, and O2 saturation 98 on room air."
How can I do this with a regular expression in R?
I have been using the following code, but it doesn't work; it returns an empty result:
x=grep("Blood pressure .+ air. ", ed_dia, value = TRUE)
I'm assuming that "[report_beginning]" is not actually in the data file, so opening a text connection to read the text should succeed:
txt <- "101962493|2011-06-09|final|Omary, Lea, M.D.|43654754|Major Academic Center
_Ms.Wattley is a 88 year-old patient who comes in today with a chief complaint of PREG/SPOTTING.
ALLERGIES: none
SOCIAL HISTORY: The patient Ms.Wattley is a past smoker who has a visiting nurse. Patient is bed-bound.
PHYSICAL EXAMINATION: Blood pressure 125/98, pulse 55, respiratory rate 7, temperature 98.7, and O2 saturation 98 on room air. General: This is a patient in severe distress.
EMERGENCY DEPARTMENT COURSE: I confirm that I have seen and evaluated the patient, reviewed the resident's documentation on the patient's chart. The following procedures were performed: Medication:medication given. Procedure:no procedures performed. Testing:testing conducted . Please review the chart for more details.
DISPOSITION: The patient was admitted to the hospital with a primary diagnosis of Threatened abortion, antepartum condition or complication. "
inp <- readLines( textConnection(txt))
After reading the data, it only remains to use grep to identify the lines containing "PHYSICAL EXAMINATION" and then use "[" to extract them from the vector of lines. (I wasn't sure whether the space needed special regex handling, so I put it in a character class; a plain space works as well.)
inp[ grep("PHYSICAL[ ]EXAMINATION", inp)]
#[1] "PHYSICAL EXAMINATION: Blood pressure 125/98, pulse 55, respiratory rate 7, temperature 98.7, and O2 saturation 98 on room air. General: This is a patient in severe distress."
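The grep approach returns the whole line, including the "General:" sentence. If only the vitals sentence is wanted, one option is to pull out the matched text itself with regexpr and regmatches. This is a sketch under the assumption that the sentence always starts with "PHYSICAL EXAMINATION:" and ends with "room air."; note and vitals are illustrative names, and note is a cut-down version of the example record.

```r
# Cut-down example record, one element per line.
note <- c(
  "ALLERGIES: none",
  "PHYSICAL EXAMINATION: Blood pressure 125/98, pulse 55, respiratory rate 7, temperature 98.7, and O2 saturation 98 on room air. General: This is a patient in severe distress."
)

# Lazy match from the section label up to the first "room air." that follows it.
pattern <- "PHYSICAL EXAMINATION:.*?room air\\."
vitals <- regmatches(note, regexpr(pattern, note, perl = TRUE))
print(vitals)
```

regmatches drops elements with no match, so vitals contains one entry per matching line. The same call works on a column where each cell holds a whole report, as long as the target sentence sits on a single line within the cell.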