Problem Getting Data from bsrch Using Rblpapi - r

I can use BSRCH in Excel to get values for "COMDTY:WEATHER", but when I try the same search using Rblpapi it returns empty data. Below are the Excel formulas, so you can see what I am trying to reproduce in R.
Observations Basic Query
=BSRCH("comdty:weather","provider=wsi","location=KNYC","model=ACTUALS","cols=15;rows=354")
Observations Specify Parameters
=BSRCH("comdty:weather","provider=wsi","location=KNYC","model=ACTUALS","fields=WIND_SPEED|TEMPERATURE","cols=3;rows=358")
I tried to look for examples online and only found the below, and this code works.
#this code works
library(Rblpapi)
blpConnect()
head(bsrch("COMDTY:NGFLOW"), 20)
head(bsrch("COMDTY:VESSEL"), 20)
#this is my code and it doesn't work
head(bsrch("COMDTY:WEATHER"), 20)

To pass in the overrides, the request itself has to be modified. For this request type, the name and value elements of each entry in the Overrides array need to be set. The Rblpapi source code does not implement this yet; it only has the TODO item below.
// TODO - implement limit and other overrides
Hopefully this will be introduced in a future version. Alternatively, you can implement it yourself.

Related

SpotifyR get_artist_audio_features() doesn't filter by market, is_uri() not found when editing

I'm trying to obtain audio features of an artist's tracks with spotifyr:
test <- get_artist_audio_features("california honeydrops", include_groups = c("album", "single", "appears_on", "compilation"), market = "US")
A quick check of the results reveals several repeats of the same track and album names with slightly different audio features, and unique(test$available_markets) reveals that these duplications occur because the function did not properly filter by market = "US". Replacing "US" with other country codes yields the same result. However, if include_groups is left as the default, which only returns tracks from albums, then the market filter does work as expected.
I thought I might make a quick fix by editing the source code for get_artist_audio_features() to force market = "US" in RStudio, but I get an error when I copy-paste the original function's code and try to run it: R insists that one of the functions used inside get_artist_audio_features(), spotifyr::is_uri(), is not part of the spotifyr package. However, it can be found in the package's help section, is part of the original function, and works fine when calling the original function.
Of course, I can filter these duplicates out after the fact, but for edification's sake, what gives? Can anyone provide a fix to the original function or explain why R can't find the is_uri() when I try to run a copy of the original function?

Validate data from website before downloading in R

I have a bunch of weather data files I want to download, but there's a mix of URLs that have data and URLs that don't. I'm using the download.file function in R to download the text files, which is working fine, but I'm also downloading a lot of empty text files because all the URLs are valid, even if no data is present.
For example, this url provides good data.
http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2021&MONTH=12&FROM=3000&TO=3000&STNM=72645
But this one doesn't.
http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=1970&MONTH=12&FROM=3000&TO=3000&STNM=72645
Is there a way to check to see if there's valid data in the text file before I download it? I've looked for something in the RCurl package, but didn't see what I needed. Thank you.
You can use httr::HEAD to determine the data size before downloading it. Note that this only saves you the "pain" of downloading; if the query is costly on the server side, the server still pays that cost even though you do not download the result. (These two seem quick enough, so perhaps it's not a problem.)
# good data
res1 <- httr::HEAD("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2021&MONTH=12&FROM=3000&TO=3000&STNM=72645")
httr::headers(res1)$`content-length`
# [1] "9435"
# no data
res2 <- httr::HEAD("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=1970&MONTH=12&FROM=3000&TO=3000&STNM=72645")
httr::headers(res2)$`content-length`
# NULL
If the API provides a function for estimating size (or at least presence of data), then it might be nicer to the remote end to use that instead of using this technique. For example: let's assume that an API call requires a 20 second SQL query. A call to HEAD will take 20 seconds, just like a call to GET, the only difference being that you don't get the data. If you see that you will get data and then subsequently call httr::GET(.), then you'll wait another 20 seconds (unless the remote end is caching queries).
Alternatively, they may have a heuristic to find presence of data, perhaps just a simple yes/no, that only takes a few seconds. In that case, it would be much "nicer" for you to make a 3 second "is data present" API call before calling the 20-second full query call.
Bottom line: if the API has a "data size" estimator, use it, otherwise HEAD should work fine.
As an alternative to HEAD, just GET the data, check the content-length, and save to file only if found:
res1 <- httr::GET("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2021&MONTH=12&FROM=3000&TO=3000&STNM=72645")
stuff <- as.character(httr::content(res1))
if (!is.null(httr::headers(res1)$`content-length`)) {
  writeLines(stuff, "somefile.html")
}
# or do something else with the results, in-memory
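The HEAD check above could be wrapped into a small helper for looping over many URLs. This is only a sketch: has_payload and download_if_data are hypothetical names, and it assumes (as observed for the two URLs above) that the server omits content-length when there is no data, which other servers may not do.

```r
# TRUE when the response headers report a positive Content-Length.
# `headers` is a named list such as the one returned by httr::headers().
has_payload <- function(headers) {
  len <- suppressWarnings(as.numeric(headers[["content-length"]]))
  length(len) == 1 && !is.na(len) && len > 0
}

# Download `url` to `dest` only if a HEAD request suggests there is data.
# Returns TRUE if a file was written, FALSE otherwise.
download_if_data <- function(url, dest) {
  res <- httr::HEAD(url)
  if (httr::http_error(res) || !has_payload(httr::headers(res))) {
    return(FALSE)
  }
  utils::download.file(url, dest, quiet = TRUE)
  TRUE
}
```

You would then call something like download_if_data(url, "sounding_2021_12.txt") for each URL and keep track of which calls returned FALSE.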

How to modify the DataLabel fill (using a solid fill color)?

I'm trying to edit the data labels for a chart I'm writing on the slide. I can access the text of the data label using the methods available, but as of yet datalabel.fill does not exist. Any workarounds are welcome if this is not planned to be added to the library in the future.
I've already gone through the source code on GitHub (https://github.com/scanny/python-pptx), but the DataLabel class only has the font, has_text_frame, position, text_frame, _dLbl, _get_or_add_dLbl, _get_or_add_rich, _get_or_add_tx_rich, _get_or_add_txPr, and _remove_tx_rich members. No fill or line fill methods are available.
The script I'm running does something similar for cells in a table:
cell.fill.solid()
cell.fill.fore_color.rgb = color_list[((col>0)*1)][i%2]
I'm looking at replicating the functionality on datalabels for chart series, with code that looks like this:
label.fill.solid()
label.fill.rgb = RGBColor(0x9B,0xBB,0x59)
label.fill.alpha = Alpha(.2)
label.line.fill.solid()
label.line.rgb = RGBColor(0xF0,0xF0,0x00)
The expected output xml should put the following for data labels:
<c:spPr>
<a:solidFill>
<a:srgbClr val="9BBB59">
<a:alpha val="80000"/>
</a:srgbClr>
</a:solidFill>
<a:ln>
<a:solidFill>
<a:srgbClr val="F0F000"/>
</a:solidFill>
</a:ln>
</c:spPr>
Actual output is non-existent as there is no method to do this directly.
This feature is not yet implemented, but if it were implemented it would be a .format property on the DataLabel object.
Users typically work around an API gap like this by adding what we call a "workaround function" to the client code that manipulates the underlying XML directly, in this case to add a <c:spPr> subtree in the right place.
python-pptx can generally get you close as far as a parent element is concerned. In this case, it can get you to the <c:dLbl> element like this:
data_label = chart.series[0].points[0].data_label
dLbl = data_label._dLbl
print(dLbl.xml)
The leading underscore in ._dLbl is your hint that you're getting into internals and if things don't go well it's something you're doing wrong, not an issue to be reported.
The dLbl object is an lxml.etree._Element object and can be manipulated with that API. If you search for "python-pptx workaround function" you'll find some examples of how to create new elements and put them where you want them.
The .xml property available on any python-pptx XML element object is handy for inspecting the results along the way. opc-diag can also be handy for inspecting PPTX files generated by PowerPoint or python-pptx for analysis or diagnostic purposes.

Customize existing function in R

I want to change a condition within the function psych::polychoric in R.
Specifically, I want to increase the limit on the number of distinct values of a variable from 8 to 10 on line 77 of the code.
I can manually increase the limit by calling
trace(polychoric, edit=TRUE)
Since the script is meant for reproduction purposes for a paper of mine, I want to make handling as smooth as possible by avoiding manual editing.
Is there a way to edit the function with a piece of code,
e.g. by using another function to replace if (nvalues > 8) with if (nvalues > 10)?
Any suggestions would be much appreciated.
Find the location in the function that you want to change
as.list(body(psych::polychoric))
Change the function
trace(psych::polychoric, quote(nvalues > 10), at=11)
Check to see that you changed what you want to change
trace(psych::polychoric, edit=TRUE)
Set the function back to original
untrace(psych::polychoric)
-----
It seems like fix may be easier for you to implement for this task:
fix(polychoric)
This opens a pane in which you can change the code; make the change and hit save.
This creates a modified copy of the function in your global environment. You can check this by comparing the two versions: trace(polychoric, edit = T) will show nvalues > 10, and trace(psych::polychoric, edit = T) will show nvalues > 8. The next time you reload psych you will be using the original function. It's a bit of a manual hack, but hopefully it works for this one-off situation.
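For a fully scripted (non-interactive) change, one option is to rewrite the function body in code. As far as I can tell from the trace() documentation, a tracer expression is evaluated before the given step rather than replacing the code at that step, so it may not change the limit by itself. The following is a minimal sketch on a toy function: check_levels and replace_expr are hypothetical names, check_levels is not the real psych::polychoric, and the real function body may need a different expression to match.

```r
# Hypothetical stand-in for a packaged function with a hard-coded limit.
check_levels <- function(nvalues) {
  if (nvalues > 8) {
    stop("too many distinct values")
  }
  nvalues
}

# Walk an expression tree and swap every sub-expression identical to `old`
# for `new`. Calls with empty arguments (e.g. x[, 1]) are not handled.
replace_expr <- function(expr, old, new) {
  if (identical(expr, old)) {
    return(new)
  }
  if (is.call(expr)) {
    for (i in seq_along(expr)) {
      expr[[i]] <- replace_expr(expr[[i]], old, new)
    }
  }
  expr
}

# Patch a copy; the original function is left untouched.
patched <- check_levels
body(patched) <- replace_expr(
  body(patched),
  old = quote(nvalues > 8),
  new = quote(nvalues > 10)
)

patched(9)  # no longer stops at 9
```

Because body<- keeps the function's original environment, a copy made this way from a package function (for example patched <- psych::polychoric) should still find the package's internal helpers, which a copy-pasted definition in the global environment does not.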

R scripting in SPSS Modeler 16: change default "rowCount=1000" for modelerData

When applying R transform Field operation node in SPSS Modeler, for every script, the system will automatically add the following code on the top of my own script to interface with the R Add-on:
while(ibmspsscfdata.HasMoreData()){
modelerDataModel <- ibmspsscfdatamodel.GetDataModel()
modelerData <- ibmspsscfdata.GetData(rowCount=1000,missing=NA,rDate="None",logicalFields=FALSE)
Please note "rowCount=1000". When I process a table with more than 1000 rows (which is very normal), errors occur.
I'm looking for a way to change the default setting, or any other way to process tables with more than 1000 rows!
I've tried adding this at the beginning of my code and it works just fine:
while(ibmspsscfdata.HasMoreData())
{
modelerData <-rbind(modelerData,ibmspsscfdata.GetData(rowCount=1000,missing=NA,rDate="None",logicalFields=FALSE))
}
Note that you will consume a lot of memory with "big data", and the parameters of the .GetData() function should be set according to the "Read Data Options" in the node settings.
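On the memory point, a common R pattern is to collect the chunks in a list and bind them once at the end instead of growing the data frame inside the loop. The sketch below does not use the Modeler API: make_source, has_more and get_data are hypothetical mock functions standing in for ibmspsscfdata.HasMoreData() and ibmspsscfdata.GetData(), so it can run anywhere.

```r
# Mock data source standing in for the Modeler API (hypothetical): rows of
# `df` are handed out in chunks of at most `rowCount` rows.
make_source <- function(df) {
  pos <- 0
  list(
    has_more = function() pos < nrow(df),
    get_data = function(rowCount = 1000) {
      idx <- seq(pos + 1, min(pos + rowCount, nrow(df)))
      pos <<- max(idx)  # remember how far we have read
      df[idx, , drop = FALSE]
    }
  )
}

src <- make_source(data.frame(id = 1:2500, x = sqrt(1:2500)))

# Collect each chunk in a list, then bind all chunks in one step.
chunks <- list()
while (src$has_more()) {
  chunks[[length(chunks) + 1]] <- src$get_data(rowCount = 1000)
}
all_rows <- do.call(rbind, chunks)
nrow(all_rows)
```

In the Modeler script the same idea would apply with the real calls in place of the mock, keeping in mind that the auto-generated prefix has already read the first chunk into modelerData.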
