Meaning of Kusto CSL acronym - azure-data-explorer

This question is pretty straightforward: relating to Microsoft Kusto, what does the acronym CSL stand for? I know what CSL is used for, but there seems to be a distinct lack of information about what those three letters actually represent.

CSL = Cousteau Semantic Language.
Cousteau was later renamed to Kusto (sounds the same, but simpler to write for non-French speakers :)).
However, CSL is deprecated, and users are encouraged to use KQL (Kusto Query Language) instead (although both work).

Related

Why are underscores not recommended for variable names in Julia?

I read the following on the Julia documentation page https://docs.julialang.org/en/v1/manual/variables/ :
Word separation can be indicated by underscores ('_'), but use of
underscores is discouraged unless the name would be hard to read
otherwise
My question is: are there any reasons to discourage the use of underscores? Thanks.
I don't think underscores are really discouraged in user code and for internal variables. The guideline is mostly about being consistent with the style of Base Julia, which follows this convention (mostly). And consistency is good, right?
But if you create a package or module, then the interface normally consists of types and functions. Type names have a strong convention that they should be CamelCase. User-facing functions are normally lowercase without underscores, because they are supposed to be simple and brief and should express a single well-defined concept. A bit like the Unix philosophy: every function should do one thing, and do it well.
A convention discouraging composite and long identifier names encourages you to create simple functions. If your function needs a name with underscores, it's possibly a sign that you should break it into multiple functions.
But in your own code, use whatever convention suits you.
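To make that concrete, here is a small sketch of the convention (all names are made up for illustration):

    # Following Base Julia style: CamelCase for types, short lowercase
    # names for user-facing functions.
    struct SurveyRecord
        income::Float64
    end

    # A name like `compute_mean_survey_income` could be a hint to split
    # the work into small, single-purpose functions instead:
    income(r::SurveyRecord) = r.income
    meanincome(rs) = sum(income, rs) / length(rs)

    # meanincome([SurveyRecord(10.0), SurveyRecord(20.0)])  # returns 15.0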
I’m no expert on Julia, but the line you quote is located under the header “Stylistic Conventions” and I would presume that’s basically it.
There is an additional section about naming conventions in the docs under Style Guide
There is a line in there that says:
“Underscores are also used to indicate a combination of concepts”.
So if you decided to use a lot of underscores in your function names, the next programmer to work on your code might think you are “combining concepts”.

Localized soundex() database query

The SOUNDEX() function checks whether words sound similar, which allows misspellings to be matched in a SELECT.
As I understand it, though, it uses English as the basis for the word sounds it generates; in other languages this can lead to mismatches because words, vowels, etc. are pronounced differently.
Is there a way to have SOUNDEX() behave correctly with, for example, German or Dutch or any other language? Or am I misunderstanding how it works, and will it work just as well with other languages as with English?
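For context, a minimal sketch of the behaviour in question (MySQL syntax; the table and column are hypothetical):

    -- Hypothetical customer table. SOUNDEX() reduces a name to a letter
    -- plus three digits based on English consonant sounds.
    SELECT name, SOUNDEX(name) AS code
    FROM customers
    WHERE SOUNDEX(name) = SOUNDEX('Meyer');
    -- With the classic algorithm, 'Meyer', 'Meier' and 'Mayer' all map
    -- to M600, so all three spellings match; the coding rules assume
    -- English pronunciation throughout.

As far as I know, language-specific alternatives exist (e.g. the Kölner Phonetik for German), but they generally have to be implemented as user-defined functions rather than obtained from SOUNDEX() itself.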

Gremlin text comparison predicates

My vertex properties partly contain objects, e.g. a File object.
When searching with "has" I would like to search for Files by path. I think some kind of text comparison predicate that does a "toString()" conversion might be helpful here.
Are there any standard predicates like this in Gremlin/TinkerPop, or do I have to implement these myself?
I found two related questions on Stack Overflow:
Gremlin.net textContains equivalent
How Gremlin query same sql like for search feature
And one of them I answered today with a pointer to the SimpleGraph project's RegexPredicate implementation https://github.com/BITPlan/com.bitplan.simplegraph/blob/master/simplegraph-core/src/main/java/com/bitplan/gremlin/RegexPredicate.java
(I am one of the committers of that project.)
Currently I'd proceed by adding more helper Predicates like that to the library.
For now you should add your own predicates for text comparisons. TinkerPop has discussed adding such support in the past, but no consensus has been reached on a direction to take.
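For illustration, a minimal sketch of such a custom predicate, loosely modeled on SimpleGraph's RegexPredicate (the class name and the property key in the usage comment are hypothetical, not TinkerPop API):

    import java.util.function.BiPredicate;
    import java.util.regex.Pattern;
    import org.apache.tinkerpop.gremlin.process.traversal.P;

    // Matches the toString() form of a property value against a regex,
    // so non-String values such as File objects can be searched by text.
    public class RegexMatch implements BiPredicate<Object, Object> {

        @Override
        public boolean test(Object value, Object pattern) {
            return Pattern.compile(pattern.toString())
                          .matcher(value.toString())
                          .matches();
        }

        // Factory method so it reads like the built-in P predicates:
        //   g.V().has("file", RegexMatch.regex(".*/src/.*\\.java"))
        public static P<Object> regex(Object pattern) {
            return new P<>(new RegexMatch(), pattern);
        }
    }

P's public constructor takes a BiPredicate plus the comparison value, which is what makes this plug into has() like the built-in predicates.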

Documentation tool for Ada software

Brief: I'm looking for some kind of tool to produce a software description from the comments in existing software source code.
In more detail: I've got existing source code written in Ada. Changes need to be made to this source code, and I also need to generate a document containing a description of the software as a whole and all of its packages, routines etc. (if possible as PDF). For the existing routines, source code comments already exist and contain sufficient detail for my needs.
The description shall include at least
overall software design
textual description of packages, routines, variables, constants etc.
call and caller graphs
For projects based on C I'd do this using Doxygen. Doxygen itself, however, does not cope with software written in Ada. My thought was to (automatically) convert existing comments in the source code so that Doxygen can read them. The conversion itself was no problem (using Doxygen's filter mechanism), but as keywords and syntax differ a lot between C and Ada, this did not produce any usable output.
I then had a look at Understand from SciTools. While this analyses the software in good detail and generates nice metrics, I was not able to get anything out of it that resembles a document with what I need.
I want to avoid (manually) writing a separate document, but would instead like to generate it from the code base. I will have to put all the necessary information (perhaps with the exception of a general overview) there anyhow, so why not use it for documentation purposes as well.
Is there any tool that is able to do what I need?
There's a tool called "AdaDoc", which seems to do a part of what you're asking for. You can of course use "a2ps" for the textual part of your needs (I like that better than what AdaDoc generates).
There are several UML tools ("Umbrello" is one name I remember), which offer to create graphs of inter-package relations, but for a seriously sized project, the best option is to use the original design documents, and simply verify that the source text actually matches that design.
For languages not supported by Doxygen, I've written my own "general purpose" filter.
It's very basic, but useful for me.
https://github.com/malkev/doxphp
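For reference, hooking a filter like that into Doxygen takes only a few Doxyfile settings; a minimal sketch (the ada2dox script name is hypothetical):

    # Doxyfile excerpt: treat Ada specs/bodies as input and pipe them
    # through an external script that rewrites them into a C-like syntax
    # Doxygen can parse. The filter script name is made up here.
    FILE_PATTERNS     = *.ads *.adb
    EXTENSION_MAPPING = ads=C adb=C
    FILTER_PATTERNS   = *.ads=./ada2dox *.adb=./ada2dox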

Bilingual (English and Portuguese) documentation in an R package

I am writing a package to facilitate importing Brazilian socio-economic microdata sets (Census, PNAD, etc.).
I foresee two distinct groups of users of the package:
Users in Brazil, who may feel more at ease with the documentation in Portuguese. They can probably understand English to some extent, but a foreign language would probably make the package feel less "ergonomic".
The broader international user community, for whom English documentation may be a necessary condition.
Is it possible to write a package in a way that the documentation is "bilingual" (English and Portuguese), and that the language shown to the user will depend on their country/language settings?
Also, is that doable within the roxygen2 documentation framework?
I realise there is a tradeoff between making the package more user-friendly by making it bilingual and the increased complexity and difficulty of maintenance. General comments on this tradeoff from previous experience are also welcome.
EDIT: following the comment's suggestion I cross-posted to the r-package-devel mailing list HERE; follow the answers at the bottom of that thread. Duncan Murdoch posted an interesting answer covering some of what Brandon's answer (below) covers, but also including two additional suggestions that I think are useful:
have the package in one language, but provide the vignettes in different languages. I will follow this advice.
have two versions of the package, say 1.1 and 1.2, one in each language
According to rOpenSci, there is no standard mechanism for translating package documentation into non-English languages. They describe the typical process of internationalization/localization as follows:
To create non-English documentation requires manual creation of
supplemental .Rd files or package vignettes.
Packages supplying
non-English documentation should include a Language field in the
DESCRIPTION file.
And some more info on the Language field:
A ‘Language’ field can be used to indicate if the package
documentation is not in English: this should be a comma-separated list
of standard (not private use or grandfathered) IETF language tags as
currently defined by RFC 5646 (https://www.rfc-editor.org/rfc/rfc5646,
see also https://en.wikipedia.org/wiki/IETF_language_tag), i.e., use
language subtags which in essence are 2-letter ISO 639-1
(https://en.wikipedia.org/wiki/ISO_639-1) or 3-letter ISO 639-3
(https://en.wikipedia.org/wiki/ISO_639-3) language codes.
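As a concrete sketch, the DESCRIPTION of a package documented (partly) in Portuguese might then carry something like this (the package name and field values are illustrative):

    Package: microdadosbr
    Title: Import Brazilian Socio-Economic Microdata Sets
    Version: 0.1.0
    Language: pt-BR
    Encoding: UTF-8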
Care is needed if your package contains non-ASCII text, and in particular if it is intended to be used in more than one locale. It is possible to mark the encoding used in the DESCRIPTION file and in .Rd files.
Regarding encoding...
First, consider carefully if you really need non-ASCII text. Many
users of R will only be able to view correctly text in their native
language group (e.g. Western European, Eastern European, Simplified
Chinese) and ASCII. Other characters may not be rendered at all,
rendered incorrectly, or cause your R code to give an error. For .Rd
documentation, marking the encoding and including ASCII
transliterations is likely to do a reasonable job. The set of
characters which is commonly supported is wider than it used to be
around 2000, but non-Latin alphabets (Greek, Russian, Georgian, …) are
still often problematic and those with double-width characters
(Chinese, Japanese, Korean) often need specialist fonts to render
correctly.
On a related note, R does, however, provide support for translating errors and warnings into different languages: "There are mechanisms to translate the R- and C-level error and warning messages. These are only available if R is compiled with NLS support (which is requested by configure option --enable-nls, the default)."
Besides bilingual documentation, please allow me the following comment: given your two "target" groups, it may be assumed that some of your users will be running a non-English OS (typically, Windows in Portuguese). When importing time series data (or any date entries, as a matter of fact), due to different date formatting (English vs. non-English), you may get different "results" (i.e. misinterpreted date entries) on English vs. non-English machines. I have some experience with those issues (I often work with Czech-language-based OSs) and, other than ad-hoc coding, I don't know of a simple solution.
(If you find this off-topic, please feel free to delete)
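One partial remedy, for what it's worth: parse dates with an explicit locale rather than relying on the OS settings. A minimal sketch with readr (the date string is illustrative):

    library(readr)

    # "fev" is the Portuguese abbreviation for February. With an explicit
    # locale, the result no longer depends on the machine's OS language.
    parse_date("05/fev/2020", "%d/%b/%Y", locale = locale("pt"))
    #> [1] "2020-02-05"

    # The same string under an English locale fails to parse:
    parse_date("05/fev/2020", "%d/%b/%Y", locale = locale("en"))
    #> [1] NA  (with a parsing-failure warning)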
