I just want to scrape the "Countries in the world by population" table from
https://www.worldometers.info/world-population/population-by-country/
You can grab the desired table data using pandas:
import pandas as pd
import requests

headers = {"User-Agent": "Mozilla/5.0"}  # send a browser-like User-Agent so the site returns the page
url = "https://www.worldometers.info/world-population/population-by-country/"
html = requests.get(url, headers=headers).text
df = pd.read_html(html)[0]  # first table on the page
print(df)
Output:
0 1 China 1439323776 ... 38 61 % 18.47 %
1 2 India 1380004385 ... 28 35 % 17.70 %
2 3 United States 331002651 ... 38 83 % 4.25 %
3 4 Indonesia 273523615 ... 30 56 % 3.51 %
4 5 Pakistan 220892340 ... 23 35 % 2.83 %
.. ... ... ... ... ... ...
230 231 Montserrat 4992 ... N.A. 10 % 0.00 %
231 232 Falkland Islands 3480 ... N.A. 66 % 0.00 %
232 233 Niue 1626 ... N.A. 46 % 0.00 %
233 234 Tokelau 1357 ... N.A. 0 % 0.00 %
234 235 Holy See 801 ... N.A. N.A. 0.00 %
[235 rows x 12 columns]
I am working with temperature (Kelvin) and incident data and have created new columns with a Celsius conversion. I would like to round the Celsius values to the nearest whole number and also group them in bins(?) of 4: e.g. 29.15 Celsius is rounded to 29 and then binned. The bins would start at zero and each cover 4 values, e.g. 0-3, 4-7, 8-11, 12-15, etc. Sorry, I am trying to think of better words to use, but I am quite new to R. How would I round and group this way? Below is the code I have used so far and the result, but it isn't rounding or grouping as I need. Thanks so much!
tempDF <- data.frame(Kelvin = seq(240, 320)) %>% # define a data frame with temperatures going from 240 to 320 Kelvin
  mutate(Celsius = Kelvin - 273.15) %>%
  merge(New_AllTime_Temp, by.x = "Kelvin", by.y = "Temp", all.x = TRUE) %>% # merge New_AllTime_Temp in, mapping each temperature to the data frame
  merge(New_Incident_Temp, by.x = "Kelvin", by.y = "temp", all.x = TRUE) %>% # merge New_Incident_Temp in, keeping the temperature mapping
  replace(is.na(.), 0) %>% # replace NA values with zeroes
  mutate(norm_counnt = scales::rescale(counnt, to = c(0, 1))) %>%
  mutate(norm_incident = scales::rescale(incidents, to = c(0, 1))) %>%
  mutate(diffs = norm_incident - norm_counnt) %>%
  mutate(rounded = round(Celsius, -2:4)) # this is the line that isn't doing what I want
"Kelvin" "Celsius" "counnt" "incidents" "norm_counnt" "norm_incident" "diffs" "rounded"
"1" 240 -33.15 0 0 0 0 0 0
"2" 241 -32.15 0 0 0 0 0 -30
"3" 242 -31.15 0 0 0 0 0 -31
"4" 243 -30.15 3 0 0.00146056475170399 0 -0.00146056475170399 -30.1
"5" 244 -29.15 9 0 0.00438169425511198 0 -0.00438169425511198 -29.15
"6" 245 -28.15 7 0 0.00340798442064265 0 -0.00340798442064265 -28.15
"7" 246 -27.15 11 1 0.0053554040895813 0.0196078431372549 0.0142524390476736 -27.15
"8" 247 -26.15 15 0 0.00730282375851996 0 -0.00730282375851996 0
"9" 248 -25.15 22 1 0.0107108081791626 0.0196078431372549 0.00889703495809229 -30
"10" 249 -24.15 11 1 0.0053554040895813 0.0196078431372549 0.0142524390476736 -24
"11" 250 -23.15 32 0 0.0155793573515093 0 -0.0155793573515093 -23.1
"12" 251 -22.15 33 0 0.0160662122687439 0 -0.0160662122687439 -22.15
"13" 252 -21.15 47 0 0.0228821811100292 0 -0.0228821811100292 -21.15
"14" 253 -20.15 107 1 0.0520934761441091 0.0196078431372549 -0.0324856330068542 -20.15
"15" 254 -19.15 117 0 0.0569620253164557 0 -0.0569620253164557 0
"16" 255 -18.15 162 2 0.0788704965920156 0.0392156862745098 -0.0396548103175058 -20
"17" 256 -17.15 221 4 0.107594936708861 0.0784313725490196 -0.0291635641598412 -17
"18" 257 -16.15 258 2 0.125608568646543 0.0392156862745098 -0.0863928823720335 -16.1
"19" 258 -15.15 272 3 0.132424537487829 0.0588235294117647 -0.0736010080760639 -15.15
"20" 259 -14.15 314 4 0.152872444011685 0.0784313725490196 -0.0744410714626649 -14.15
"21" 260 -13.15 409 4 0.199123661148978 0.0784313725490196 -0.120692288599958 -13.15
"22" 261 -12.15 478 11 0.232716650438169 0.215686274509804 -0.0170303759283655 0
"23" 262 -11.15 523 13 0.254625121713729 0.254901960784314 0.0002768390705844 -10
"24" 263 -10.15 574 8 0.279454722492697 0.156862745098039 -0.122591977394658 -10
"25" 264 -9.14999999999998 793 9 0.386075949367089 0.176470588235294 -0.209605361131794 -9.1
"26" 265 -8.14999999999998 924 14 0.44985394352483 0.274509803921569 -0.175344139603261 -8.15
"27" 266 -7.14999999999998 1108 18 0.539435248296008 0.352941176470588 -0.186494071825419 -7.15
"28" 267 -6.14999999999998 1082 17 0.526777020447907 0.333333333333333 -0.193443687114573 -6.15
"29" 268 -5.14999999999998 1198 15 0.583252190847128 0.294117647058824 -0.289134543788304 0
"30" 269 -4.14999999999998 1233 13 0.600292112950341 0.254901960784314 -0.345390152166027 0
"31" 270 -3.14999999999998 1244 17 0.605647517039922 0.333333333333333 -0.272314183706589 -3
"32" 271 -2.14999999999998 1496 32 0.728334956183057 0.627450980392157 -0.100883975790901 -2.1
"33" 272 -1.14999999999998 1565 25 0.761927945472249 0.490196078431373 -0.271731867040877 -1.15
"34" 273 -0.149999999999977 1870 35 0.910418695228822 0.686274509803922 -0.2241441854249 -0.15
"35" 274 0.850000000000023 2054 31 1 0.607843137254902 -0.392156862745098 0.85
"36" 275 1.85000000000002 2034 29 0.990262901655307 0.568627450980392 -0.421635450674915 0
"37" 276 2.85000000000002 1974 33 0.961051606621227 0.647058823529412 -0.313992783091815 0
"38" 277 3.85000000000002 1966 32 0.95715676728335 0.627450980392157 -0.329705786891193 4
"39" 278 4.85000000000002 2040 51 0.993184031158715 1 0.00681596884128532 4.9
"40" 279 5.85000000000002 1949 29 0.94888023369036 0.568627450980392 -0.380252782709968 5.85
"41" 280 6.85000000000002 2053 40 0.999513145082765 0.784313725490196 -0.215199419592569 6.85
"42" 281 7.85000000000002 1987 34 0.967380720545277 0.666666666666667 -0.300714053878611 7.85
"43" 282 8.85000000000002 1959 40 0.953748782862707 0.784313725490196 -0.169435057372511 0
"44" 283 9.85000000000002 1770 32 0.861733203505355 0.627450980392157 -0.234282223113199 10
"45" 284 10.85 1816 27 0.88412852969815 0.529411764705882 -0.354716764992268 11
"46" 285 11.85 1859 39 0.905063291139241 0.764705882352941 -0.140357408786299 11.9
"47" 286 12.85 2029 35 0.987828627069133 0.686274509803922 -0.301554117265212 12.85
"48" 287 13.85 1926 33 0.937682570593963 0.647058823529412 -0.290623747064551 13.85
"49" 288 14.85 1848 43 0.899707887049659 0.843137254901961 -0.0565706321476984 14.85
"50" 289 15.85 1823 33 0.887536514118793 0.647058823529412 -0.240477690589381 0
"51" 290 16.85 1662 24 0.809152872444012 0.470588235294118 -0.338564637149894 20
"52" 291 17.85 1578 31 0.7682570593963 0.607843137254902 -0.160413922141398 18
"53" 292 18.85 1425 12 0.693768257059396 0.235294117647059 -0.458474139412337 18.9
"54" 293 19.85 1318 17 0.641674780915287 0.333333333333333 -0.308341447581954 19.85
"55" 294 20.85 1204 19 0.586173320350535 0.372549019607843 -0.213624300742692 20.85
"56" 295 21.85 1029 18 0.500973709834469 0.352941176470588 -0.148032533363881 21.85
"57" 296 22.85 876 12 0.426484907497566 0.235294117647059 -0.191190789850507 0
"58" 297 23.85 735 13 0.357838364167478 0.254901960784314 -0.102936403383164 20
"59" 298 24.85 623 5 0.303310613437196 0.0980392156862745 -0.205271397750921 25
"60" 299 25.85 571 7 0.277994157740993 0.137254901960784 -0.140739255780209 25.9
"61" 300 26.85 512 5 0.249269717624148 0.0980392156862745 -0.151230501937874 26.85
"62" 301 27.85 417 5 0.203018500486855 0.0980392156862745 -0.10497928480058 27.85
"63" 302 28.85 345 14 0.167964946445959 0.274509803921569 0.10654485747561 28.85
"64" 303 29.85 294 6 0.143135345666991 0.117647058823529 -0.0254882868434618 0
"65" 304 30.85 253 3 0.12317429406037 0.0588235294117647 -0.0643507646486053 30
"66" 305 31.85 198 3 0.0963972736124635 0.0588235294117647 -0.0375737442006988 32
"67" 306 32.85 128 2 0.062317429406037 0.0392156862745098 -0.0231017431315272 32.9
"68" 307 33.85 88 2 0.0428432327166504 0.0392156862745098 -0.00362754644214063 33.85
"69" 308 34.85 64 1 0.0311587147030185 0.0196078431372549 -0.0115508715657636 34.85
"70" 309 35.85 48 0 0.0233690360272639 0 -0.0233690360272639 35.85
"71" 310 36.85 20 0 0.00973709834469328 0 -0.00973709834469328 0
"72" 311 37.85 16 0 0.00778967867575463 0 -0.00778967867575463 40
"73" 312 38.85 7 0 0.00340798442064265 0 -0.00340798442064265 39
"74" 313 39.85 1 0 0.000486854917234664 0 -0.000486854917234664 39.9
"75" 314 40.85 0 0 0 0 0 40.85
"76" 315 41.85 0 0 0 0 0 41.85
"77" 316 42.85 0 0 0 0 0 42.85
"78" 317 43.85 0 0 0 0 0 0
"79" 318 44.85 0 0 0 0 0 40
"80" 319 45.85 0 0 0 0 0 46
"81" 320 46.85 0 0 0 0 0 46.9
Rounding can be done via the aptly named round function.
The cut function is made for continuous data, so instead of one group ranging from 0 to 3 and another from 4 to 7, we can just cut the continuum of real numbers at -0.5, 3.5, 7.5, 11.5, ...
library(magrittr)

unrounded <- c(-12.6, -12.4, -.01, +.01, 12.4, 12.6)
rounded <- unrounded %>% round(digits = 0)

values <- c(1, 2, 4, 7, 10, 20)
group <- values %>% cut(breaks = seq(-.5, 1000, 4))
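For the example vectors above, the rounded values and groups come out like this (level listing abbreviated):
rounded
#> [1] -13 -12   0   0  12  13
group
#> [1] (-0.5,3.5]  (-0.5,3.5]  (3.5,7.5]   (3.5,7.5]   (7.5,11.5]  (19.5,23.5]
#> 250 Levels: (-0.5,3.5] (3.5,7.5] (7.5,11.5] (11.5,15.5] ... (996,1e+03]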
It wasn't clear to me what you want to do with values less than zero, but here's a tidyverse solution...
library(dplyr)

tempDF <- data.frame(Kelvin = seq(240, 320)) %>%
  mutate(Celsius = Kelvin - 273.15) %>%
  mutate(Celsius_rounded = round(Celsius)) %>%
  mutate(Celsius_groups = cut(Celsius_rounded, breaks = seq(-.5, 1000, 4)))
tempDF
#> Kelvin Celsius Celsius_rounded Celsius_groups
#> 1 240 -33.15 -33 <NA>
#> 2 241 -32.15 -32 <NA>
#> 3 242 -31.15 -31 <NA>
#> 4 243 -30.15 -30 <NA>
#> 5 244 -29.15 -29 <NA>
#> 6 245 -28.15 -28 <NA>
#> 7 246 -27.15 -27 <NA>
#> 8 247 -26.15 -26 <NA>
#> 9 248 -25.15 -25 <NA>
#> 10 249 -24.15 -24 <NA>
#> 11 250 -23.15 -23 <NA>
#> 12 251 -22.15 -22 <NA>
#> 13 252 -21.15 -21 <NA>
#> 14 253 -20.15 -20 <NA>
#> 15 254 -19.15 -19 <NA>
#> 16 255 -18.15 -18 <NA>
#> 17 256 -17.15 -17 <NA>
#> 18 257 -16.15 -16 <NA>
#> 19 258 -15.15 -15 <NA>
#> 20 259 -14.15 -14 <NA>
#> 21 260 -13.15 -13 <NA>
#> 22 261 -12.15 -12 <NA>
#> 23 262 -11.15 -11 <NA>
#> 24 263 -10.15 -10 <NA>
#> 25 264 -9.15 -9 <NA>
#> 26 265 -8.15 -8 <NA>
#> 27 266 -7.15 -7 <NA>
#> 28 267 -6.15 -6 <NA>
#> 29 268 -5.15 -5 <NA>
#> 30 269 -4.15 -4 <NA>
#> 31 270 -3.15 -3 <NA>
#> 32 271 -2.15 -2 <NA>
#> 33 272 -1.15 -1 <NA>
#> 34 273 -0.15 0 (-0.5,3.5]
#> 35 274 0.85 1 (-0.5,3.5]
#> 36 275 1.85 2 (-0.5,3.5]
#> 37 276 2.85 3 (-0.5,3.5]
#> 38 277 3.85 4 (3.5,7.5]
#> 39 278 4.85 5 (3.5,7.5]
#> 40 279 5.85 6 (3.5,7.5]
#> 41 280 6.85 7 (3.5,7.5]
#> 42 281 7.85 8 (7.5,11.5]
#> 43 282 8.85 9 (7.5,11.5]
#> 44 283 9.85 10 (7.5,11.5]
#> 45 284 10.85 11 (7.5,11.5]
#> 46 285 11.85 12 (11.5,15.5]
#> 47 286 12.85 13 (11.5,15.5]
#> 48 287 13.85 14 (11.5,15.5]
#> 49 288 14.85 15 (11.5,15.5]
#> 50 289 15.85 16 (15.5,19.5]
#> 51 290 16.85 17 (15.5,19.5]
#> 52 291 17.85 18 (15.5,19.5]
#> 53 292 18.85 19 (15.5,19.5]
#> 54 293 19.85 20 (19.5,23.5]
#> 55 294 20.85 21 (19.5,23.5]
#> 56 295 21.85 22 (19.5,23.5]
#> 57 296 22.85 23 (19.5,23.5]
#> 58 297 23.85 24 (23.5,27.5]
#> 59 298 24.85 25 (23.5,27.5]
#> 60 299 25.85 26 (23.5,27.5]
#> 61 300 26.85 27 (23.5,27.5]
#> 62 301 27.85 28 (27.5,31.5]
#> 63 302 28.85 29 (27.5,31.5]
#> 64 303 29.85 30 (27.5,31.5]
#> 65 304 30.85 31 (27.5,31.5]
#> 66 305 31.85 32 (31.5,35.5]
#> 67 306 32.85 33 (31.5,35.5]
#> 68 307 33.85 34 (31.5,35.5]
#> 69 308 34.85 35 (31.5,35.5]
#> 70 309 35.85 36 (35.5,39.5]
#> 71 310 36.85 37 (35.5,39.5]
#> 72 311 37.85 38 (35.5,39.5]
#> 73 312 38.85 39 (35.5,39.5]
#> 74 313 39.85 40 (39.5,43.5]
#> 75 314 40.85 41 (39.5,43.5]
#> 76 315 41.85 42 (39.5,43.5]
#> 77 316 42.85 43 (39.5,43.5]
#> 78 317 43.85 44 (43.5,47.5]
#> 79 318 44.85 45 (43.5,47.5]
#> 80 319 45.85 46 (43.5,47.5]
#> 81 320 46.85 47 (43.5,47.5]
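The question didn't say how values below zero should be grouped; if they should get mirrored bins (-4 to -1, -8 to -5, and so on) rather than NA, one option is simply to start the break sequence low enough to cover the data. A sketch under that assumption (-48.5 lines up with -0.5 modulo 4, so the bins around zero are unchanged):

library(dplyr)

tempDF <- data.frame(Kelvin = seq(240, 320)) %>%
  mutate(Celsius = Kelvin - 273.15) %>%
  mutate(Celsius_rounded = round(Celsius)) %>%
  # breaks at -48.5, -44.5, ..., -0.5, 3.5, ... put e.g. -33 into (-36.5,-32.5]
  mutate(Celsius_groups = cut(Celsius_rounded, breaks = seq(-48.5, 1000, 4)))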
How can I split one column into multiple columns in R, using spaces as separators?
I tried to find an answer for a few hours (even days), but now I count on you guys to help me!
This is what my data set looks like; it's all in one column. I don't really care about the column names, as in the end I will only need a few of them for my analysis:
[1] 1000.0 246
[2] 970.0 491 -3.3 -5.0 88 2.73 200 4 272.2 279.8 272.7
[3] 909.0 1002 -4.7 -6.6 87 2.58 200 12 275.9 283.2 276.3
[4] 900.0 1080 -5.5 -7.5 86 2.43 200 13 275.8 282.8 276.2
[5] 879.0 1264 -6.5 -8.8 84 2.25 200 16 276.7 283.1 277.0
[6] 850.0 1525 -6.5 -12.5 62 1.73 200 20 279.3 284.4 279.6
Also, I tried the separate function, and it gave me an error telling me that this is not possible for an object of class function.
Thanks a lot for your help!
It's always easier to help if there is a minimal reproducible example in the question; the data you show is not easily usable...
MRE:
data_vector <- c("1000.0 246",
                 "970.0 491 -3.3 -5.0 88 2.73 200 4 272.2 279.8 272.7",
                 "909.0 1002 -4.7 -6.6 87 2.58 200 12 275.9 283.2 276.3",
                 "900.0 1080 -5.5 -7.5 86 2.43 200 13 275.8 282.8 276.2",
                 "879.0 1264 -6.5 -8.8 84 2.25 200 16 276.7 283.1 277.0",
                 "850.0 1525 -6.5 -12.5 62 1.73 200 20 279.3 284.4 279.6")
And here is a solution using gsub and read.csv; the gsub collapses runs of spaces into single spaces so that sep = " " splits cleanly:
oo <- read.csv(text = gsub(" +", " ", paste0(data_vector, collapse = "\n")), sep = " ", header = FALSE)
Which produces this output:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
1 1000 246 NA NA NA NA NA NA NA NA NA
2 970 491 -3.3 -5.0 88 2.73 200 4 272.2 279.8 272.7
3 909 1002 -4.7 -6.6 87 2.58 200 12 275.9 283.2 276.3
4 900 1080 -5.5 -7.5 86 2.43 200 13 275.8 282.8 276.2
5 879 1264 -6.5 -8.8 84 2.25 200 16 276.7 283.1 277.0
6 850 1525 -6.5 -12.5 62 1.73 200 20 279.3 284.4 279.6
read.table/read.csv would also work if we pass the data_vector defined above as a character vector via the text argument (fill = TRUE pads the short first row with NA):
read.table(text = data_vector, header = FALSE, fill = TRUE)
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
#1 1000 246 NA NA NA NA NA NA NA NA NA
#2 970 491 -3.3 -5.0 88 2.73 200 4 272.2 279.8 272.7
#3 909 1002 -4.7 -6.6 87 2.58 200 12 275.9 283.2 276.3
#4 900 1080 -5.5 -7.5 86 2.43 200 13 275.8 282.8 276.2
#5 879 1264 -6.5 -8.8 84 2.25 200 16 276.7 283.1 277.0
#6 850 1525 -6.5 -12.5 62 1.73 200 20 279.3 284.4 279.6
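As for the separate error mentioned in the question: that message usually means separate was called on a function rather than a data frame (for example, on an unassigned name). Once the column lives in a data frame, tidyr can do the split too. A sketch, assuming the data_vector from the MRE above:

library(tidyr)

df <- data.frame(col = data_vector)
# split on runs of whitespace, pad short rows with NA on the right,
# and let convert = TRUE turn the pieces into numeric columns
separate(df, col, into = paste0("V", 1:11), sep = "\\s+",
         fill = "right", convert = TRUE)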
I have a problem: I want to delete several lines from txt files and then convert them to csv with R, because I just want to get the data out of the txt files.
My code can't delete properly, because it also deletes the lines that contain the date of the data.
Here is the code I used:
setwd("D:/tugasmaritim/")
FILES <- list.files(pattern = ".txt")
for (i in 1:length(FILES)) {
  l <- readLines(FILES[i], skip = 4)
  l2 <- l[-sapply(grep("</PRE><H3>", l), function(x) seq(x, x + 30))]
  l3 <- l2[-sapply(grep("<P>Description", l2), function(x) seq(x, x + 29))]
  l4 <- l3[-sapply(grep("<HTML>", l3), function(x) seq(x, x + 3))]
  write.csv(l4, row.names = FALSE, file = paste0("D:/tugasmaritim/", sub(".txt", "", FILES[i]), ".csv"))
}
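One pattern that avoids the fragile fixed offsets (a sketch of an alternative, not a fix to the loop above) is to invert the logic and keep only the lines you want: the date headers and the numeric sounding rows. This assumes the date lines contain "Observations at" and every data row starts with a decimal pressure value, as in the sample below:

keep_rows <- function(path) {
  l <- readLines(path)
  is_date <- grepl("Observations at", l)      # the <H2>... 00Z 02 Oct 1995</H2> headers
  is_data <- grepl("^\\s*[0-9]+\\.[0-9]", l)  # sounding rows such as "1011.0 ..."
  l[is_date | is_data]
}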
My data looks like this:
<HTML>
<TITLE>University of Wyoming - Radiosonde Data</TITLE>
<LINK REL="StyleSheet" HREF="/resources/select.css" TYPE="text/css">
<BODY BGCOLOR="white">
<H2>96749 WIII Jakarta Observations at 00Z 02 Oct 1995</H2>
<PRE>
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1011.0 8 23.2 22.5 96 17.30 0 0 295.4 345.3 298.5
1000.0 98 23.6 22.4 93 17.39 105 8 296.8 347.1 299.8
977.3 300 24.6 22.1 86 17.49 105 8 299.7 351.0 302.8
976.0 311 24.6 22.1 86 17.50 104 8 299.8 351.2 303.0
950.0 548 23.0 22.0 94 17.87 88 12 300.5 353.2 303.7
944.4 600 22.6 21.8 95 17.73 85 13 300.6 352.9 303.8
925.0 781 21.2 21.0 99 17.25 90 20 301.0 351.9 304.1
918.0 847 20.6 20.6 100 16.95 90 23 301.0 351.0 304.1
912.4 900 20.4 18.6 89 15.00 90 26 301.4 345.7 304.1
897.0 1047 20.0 13.0 64 10.60 90 26 302.4 334.1 304.3
881.2 1200 19.4 11.4 60 9.70 90 26 303.3 332.5 305.1
850.0 1510 18.2 8.2 52 8.09 95 18 305.2 329.9 306.7
845.0 1560 18.0 7.0 49 7.49 91 17 305.5 328.4 306.9
810.0 1920 15.0 9.0 67 8.97 60 11 306.0 333.4 307.7
792.9 2100 14.3 3.1 47 6.06 45 8 307.1 325.9 308.2
765.1 2400 13.1 -6.8 24 3.01 40 8 309.0 318.7 309.5
746.0 2612 12.2 -13.8 15 1.77 38 10 310.3 316.2 310.6
712.0 3000 10.3 -15.0 15 1.69 35 13 312.3 318.1 312.6
700.0 3141 9.6 -15.4 16 1.66 35 13 313.1 318.7 313.4
653.0 3714 6.6 -16.4 18 1.63 32 12 316.0 321.6 316.3
631.0 3995 4.8 -2.2 60 5.19 31 11 317.0 333.9 318.0
615.3 4200 3.1 -3.9 60 4.70 30 11 317.4 332.8 318.3
601.0 4391 1.6 -5.4 60 4.28 20 8 317.8 331.9 318.6
592.9 4500 0.6 -12.0 38 2.59 15 6 317.9 326.6 318.4
588.0 4567 0.0 -16.0 29 1.88 11 6 317.9 324.4 318.3
571.0 4800 -1.2 -18.9 25 1.51 355 5 319.1 324.4 319.4
549.8 5100 -2.8 -22.8 20 1.12 45 6 320.7 324.8 321.0
513.0 5649 -5.7 -29.7 13 0.64 125 10 323.6 326.0 323.8
500.0 5850 -5.1 -30.1 12 0.63 155 11 326.8 329.1 326.9
494.0 5945 -4.9 -29.9 12 0.65 146 11 328.1 330.6 328.3
471.7 6300 -7.4 -32.0 12 0.56 110 13 329.3 331.5 329.4
453.7 6600 -9.6 -33.8 12 0.49 100 14 330.3 332.2 330.4
400.0 7570 -16.5 -39.5 12 0.31 105 14 333.5 334.7 333.5
398.0 7607 -16.9 -39.9 12 0.30 104 14 333.4 334.6 333.5
371.9 8100 -20.4 -42.6 12 0.24 95 16 335.4 336.3 335.4
300.0 9660 -31.3 -51.3 12 0.11 115 18 341.1 341.6 341.2
269.0 10420 -36.3 -55.3 12 0.08 79 20 344.7 345.0 344.7
265.9 10500 -36.9 75 20 344.9 344.9
250.0 10920 -40.3 80 28 346.0 346.0
243.4 11100 -41.8 85 37 346.4 346.4
222.5 11700 -46.9 75 14 347.6 347.6
214.0 11960 -49.1 68 16 348.1 348.1
200.0 12400 -52.7 55 20 349.1 349.1
156.0 13953 -66.1 55 25 352.1 352.1
152.3 14100 -67.2 55 26 352.6 352.6
150.0 14190 -67.9 55 26 352.9 352.9
144.7 14400 -69.6 60 26 353.6 353.6
137.5 14700 -72.0 60 39 354.6 354.6
130.7 15000 -74.3 50 28 355.6 355.6
124.2 15300 -76.7 40 36 356.5 356.5
118.0 15600 -79.1 50 48 357.4 357.4
116.0 15698 -79.9 45 44 357.6 357.6
112.0 15900 -79.1 45 26 362.6 362.6
106.3 16200 -78.0 35 24 370.2 370.2
100.0 16550 -76.7 35 24 379.3 379.3
</PRE><H3>Station information and sounding indices</H3><PRE>
Station identifier: WIII
Station number: 96749
Observation time: 951002/0000
Station latitude: -6.11
Station longitude: 106.65
Station elevation: 8.0
Showalter index: 6.30
Lifted index: -1.91
LIFT computed using virtual temperature: -2.80
SWEAT index: 145.41
K index: 6.50
Cross totals index: 13.30
Vertical totals index: 23.30
Totals totals index: 36.60
Convective Available Potential Energy: 799.02
CAPE using virtual temperature: 1070.13
Convective Inhibition: -26.70
CINS using virtual temperature: -12.88
Equilibrum Level: 202.64
Equilibrum Level using virtual temperature: 202.60
Level of Free Convection: 828.70
LFCT using virtual temperature: 909.19
Bulk Richardson Number: 210.78
Bulk Richardson Number using CAPV: 282.30
Temp [K] of the Lifted Condensation Level: 294.96
Pres [hPa] of the Lifted Condensation Level: 958.67
Mean mixed layer potential temperature: 298.56
Mean mixed layer mixing ratio: 17.50
1000 hPa to 500 hPa thickness: 5752.00
Precipitable water [mm] for entire sounding: 36.31
</PRE>
<H2>96749 WIII Jakarta Observations at 00Z 03 Oct 1995</H2>
<PRE>
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1012.0 8 23.6 22.9 96 17.72 140 2 295.7 346.9 298.9
1000.0 107 24.0 21.6 86 16.54 135 3 297.1 345.2 300.1
990.0 195 24.4 20.3 78 15.39 128 4 298.4 343.4 301.2
945.4 600 22.9 20.2 85 16.00 95 7 300.9 348.0 303.7
925.0 791 22.2 20.1 88 16.29 100 6 302.0 350.3 304.9
913.5 900 21.9 18.2 80 14.63 105 6 302.8 346.3 305.4
911.0 924 21.8 17.8 78 14.28 108 6 302.9 345.4 305.5
850.0 1522 17.4 16.7 96 14.28 175 6 304.4 347.1 307.0
836.0 1665 16.4 16.4 100 14.24 157 7 304.8 347.5 307.4
811.0 1925 15.0 14.7 98 13.14 123 8 305.9 345.6 308.3
795.0 2095 14.2 7.2 63 8.08 101 9 306.8 331.6 308.3
794.5 2100 14.2 7.2 63 8.05 100 9 306.8 331.5 308.3
745.0 2642 10.4 2.4 58 6.14 64 11 308.4 327.6 309.6
736.0 2744 11.0 0.0 47 5.23 57 11 310.2 326.7 311.1
713.8 3000 9.2 5.0 75 7.70 40 12 310.9 335.0 312.4
711.0 3033 9.0 5.6 79 8.08 40 12 311.0 336.2 312.6
700.0 3163 8.6 1.6 61 6.18 40 12 312.0 331.5 313.1
688.5 3300 8.3 -6.0 36 3.57 60 12 313.1 324.8 313.8
678.0 3427 8.0 -13.0 21 2.08 70 12 314.2 321.2 314.6
642.0 3874 5.0 -2.0 61 5.17 108 11 315.7 332.4 316.7
633.0 3989 4.4 -11.6 30 2.50 117 10 316.3 324.7 316.8
616.6 4200 3.1 -14.1 27 2.09 135 10 317.1 324.3 317.6
580.0 4694 0.0 -20.0 21 1.36 164 13 319.1 323.9 319.4
572.3 4800 -0.4 -20.7 20 1.29 170 14 319.9 324.5 320.1
510.8 5700 -4.0 -26.6 15 0.86 80 10 326.1 329.2 326.2
500.0 5870 -4.7 -27.7 15 0.79 80 10 327.2 330.2 327.4
497.0 5917 -4.9 -27.9 15 0.78 71 13 327.6 330.5 327.7
491.7 6000 -5.5 -28.3 15 0.76 55 19 327.9 330.7 328.0
473.0 6300 -7.6 -29.9 15 0.68 55 16 328.9 331.4 329.0
436.0 6930 -12.1 -33.1 16 0.54 77 17 330.9 333.0 331.0
400.0 7580 -17.9 -37.9 16 0.37 100 19 331.6 333.1 331.7
388.3 7800 -19.9 -39.9 15 0.31 105 20 331.8 333.1 331.9
386.0 7844 -20.3 -40.3 15 0.30 103 20 331.9 333.1 331.9
372.0 8117 -18.3 -38.3 16 0.38 91 23 338.1 339.6 338.1
343.6 8700 -22.1 -41.4 16 0.30 65 29 340.7 342.0 340.8
329.0 9018 -24.1 -43.1 16 0.26 73 27 342.2 343.2 342.2
300.0 9680 -29.9 -44.9 22 0.23 90 22 343.1 344.1 343.2
278.6 10200 -34.3 85 37 344.1 344.1
266.9 10500 -36.8 60 32 344.7 344.7
255.8 10800 -39.4 65 27 345.2 345.2
250.0 10960 -40.7 65 27 345.4 345.4
204.0 12300 -51.8 55 23 348.6 348.6
200.0 12430 -52.9 55 23 348.8 348.8
194.6 12600 -55.0 60 23 348.1 348.1
160.7 13800 -70.1 35 39 342.4 342.4
153.2 14100 -73.9 35 41 340.6 340.6
150.0 14230 -75.5 35 41 339.9 339.9
131.5 15000 -76.3 50 53 351.6 351.6
124.9 15300 -76.6 50 57 356.2 356.2
122.0 15436 -76.7 57 45 358.3 358.3
118.6 15600 -77.3 65 31 360.2 360.2
115.0 15779 -77.9 65 31 362.2 362.2
112.6 15900 -77.7 85 17 364.8 364.8
107.0 16200 -77.2 130 10 371.2 371.2
100.0 16590 -76.5 120 18 379.7 379.7
</PRE><H3>Station information and sounding indices</H3><PRE>
Station identifier: WIII
Station number: 96749
Observation time: 951003/0000
Station latitude: -6.11
Station longitude: 106.65
Station elevation: 8.0
Showalter index: -0.58
Lifted index: 0.17
LIFT computed using virtual temperature: -0.57
SWEAT index: 222.41
K index: 31.80
Cross totals index: 21.40
Vertical totals index: 22.10
Totals totals index: 43.50
Convective Available Potential Energy: 268.43
CAPE using virtual temperature: 431.38
Convective Inhibition: -84.04
CINS using virtual temperature: -81.56
Equilibrum Level: 141.42
Equilibrum Level using virtual temperature: 141.35
Level of Free Convection: 784.91
LFCT using virtual temperature: 804.89
Bulk Richardson Number: 221.19
Bulk Richardson Number using CAPV: 355.46
Temp [K] of the Lifted Condensation Level: 293.21
Pres [hPa] of the Lifted Condensation Level: 940.03
Mean mixed layer potential temperature: 298.46
Mean mixed layer mixing ratio: 16.01
1000 hPa to 500 hPa thickness: 5763.00
Precipitable water [mm] for entire sounding: 44.54
On the website:
http://naturalstattrick.com/teamtable.php?season=20172018&stype=2&sit=pp&score=all&rate=n&vs=all&loc=B&gpf=82&fd=2017-10-04&td=2018-04-07
at the bottom of the page there is an option to download a csv. I downloaded the csv file, renamed it Team Season Totals - Natural Stat Trick 2007-2008 5 vs 5 (Counts).csv, and put it in my working directory.
I successfully read in the file using read.csv.
teams <- read.csv(file = "Team Season Totals - Natural Stat Trick 2007-2008 5 vs 5 (Counts).csv", stringsAsFactors = FALSE)
head(teams)
ï.. Team GP TOI W L OTL ROW CF CA CF. FF FA FF. SF SA SF. GF GA GF. SCF SCA SCF. SCGF SCGA SCGF. SCSH.
1 1 Atlanta Thrashers 82 3539.050 34 40 8 25 2638 3512 42.89 2002 2717 42.42 1505 2052 42.31 125 172 42.09 1195 1500 44.34 83 126 39.71 6.95
2 2 Pittsburgh Penguins 82 3435.417 47 27 8 40 2820 3380 45.48 2192 2542 46.30 1580 1812 46.58 142 122 53.79 1343 1374 49.43 112 90 55.45 8.34
3 3 Los Angeles Kings 82 3502.333 32 43 7 27 3008 3576 45.69 2306 2787 45.28 1649 1961 45.68 137 174 44.05 1049 1286 44.93 63 80 44.06 6.01
4 4 Montreal Canadiens 82 3475.183 47 25 10 42 3089 3601 46.17 2266 2603 46.54 1617 1863 46.47 144 138 51.06 1156 1221 48.63 62 61 50.41 5.36
5 5 Edmonton Oilers 82 3442.633 41 35 6 26 2958 3424 46.35 2255 2585 46.59 1601 1830 46.66 143 166 46.28 1334 1398 48.83 104 116 47.27 7.80
6 6 Philadelphia Flyers 82 3374.800 42 29 11 39 2902 3343 46.47 2188 2505 46.62 1609 1857 46.42 125 137 47.71 919 1028 47.20 61 68 47.29 6.64
SCSV. HDCF HDCA HDCF. HDGF HDGA HDGF. HDSH. HDSV. SH. SV. PDO
1 91.60 388 468 45.33 51 82 38.35 13.14 82.48 8.31 91.62 0.999
2 93.45 503 444 53.12 79 49 61.72 15.71 88.96 8.99 93.27 1.023
3 93.78 270 356 43.13 29 36 44.62 10.74 89.89 8.31 91.13 0.994
4 95.00 271 322 45.70 25 31 44.64 9.23 90.37 8.91 92.59 1.015
5 91.70 443 452 49.50 57 61 48.31 12.87 86.50 8.93 90.93 0.999
6 93.39 257 266 49.14 24 24 50.00 9.34 90.98 7.77 92.62 1.004
One thing I noticed was that the Team column had an accent in it:
teams$Team
[1] "Atlanta Thrashers" "Pittsburgh Penguins" "Los Angeles Kings" "Montreal Canadiens" "Edmonton Oilers" "Philadelphia Flyers"
[7] "St Louis Blues" "Colorado Avalanche" "Vancouver Canucks" "Minnesota Wild" "Florida Panthers" "Phoenix Coyotes"
[13] "Tampa Bay Lightning" "Buffalo Sabres" "Chicago Blackhawks" "New York Islanders" "Nashville Predators" "Anaheim Ducks"
[19] "Boston Bruins" "Ottawa Senators" "Dallas Stars" "Toronto Maple Leafs" "Carolina Hurricanes" "Columbus Blue Jackets"
[25] "New Jersey Devils" "Calgary Flames" "San Jose Sharks" "New York Rangers" "Washington Capitals" "Detroit Red Wings"
Removing the accent:
teams$Team <- sub(pattern = "Â", replacement = "", teams$Team)
teams$Team[1]
[1] "Atlanta Thrashers"
Now when I want to subset the data based on Team, all the values come back FALSE:
teams$Team[1]
[1] "Atlanta Thrashers"
teams$Team[1] == "Atlanta Thrashers"
[1] FALSE
dplyr::filter(teams, Team == "Atlanta Thrashers")
[1] ï.. Team GP TOI W L OTL ROW CF CA CF. FF FA FF. SF SA SF. GF GA GF. SCF SCA SCF. SCGF SCGA
[26] SCGF. SCSH. SCSV. HDCF HDCA HDCF. HDGF HDGA HDGF. HDSH. HDSV. SH. SV. PDO
<0 rows> (or 0-length row.names)
It comes back FALSE for every team and I don't understand why. Does it have something to do with the accent that I removed? Does it have something to do with encoding, i.e., UTF-8? If someone could please assist me, I would appreciate it. Thanks.
I figured it out. It had to do with the accent. I used:
iconv(teams$Team, "UTF-8", "UTF-8", sub = ' ')
iconv(teams$Team, "UTF-8", "UTF-8", sub = ' ')[1] == "Atlanta Thrashers"
[1] TRUE
I had never had that happen to me before and have no experience with encoding and UTF-8.
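What was most likely happening (an inference from the symptoms, not verified against the file): the csv contains UTF-8 non-breaking spaces (bytes 0xC2 0xA0) inside the team names. Read in the wrong encoding, those bytes render as "Â" followed by an invisible non-breaking space, so removing the "Â" still leaves a string that only looks like "Atlanta Thrashers". Declaring the encoding up front avoids the whole problem, and "UTF-8-BOM" also cleans up the mangled first column name ï..; a sketch:

# re-read the file declaring its encoding; "UTF-8-BOM" strips the byte-order
# mark that produced the "ï.." column name
teams <- read.csv("Team Season Totals - Natural Stat Trick 2007-2008 5 vs 5 (Counts).csv",
                  stringsAsFactors = FALSE, fileEncoding = "UTF-8-BOM")

# or replace the non-breaking spaces directly
teams$Team <- gsub("\u00a0", " ", teams$Team)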