HTK ERROR [+5010] InitSource: Cannot open source file f-ihm+k - htk

I believe this error has something to do with a mismatch between my tiedlist and the hmmdefs (as pointed out here: http://www.ling.ohio-state.edu/~bromberg/htk_problems.html), but I cannot seem to solve it. All of the triphones in my corpus are present in my triphones1 list, and triphones1 only contains monophones, biphones, and triphones from my corpus.
If I take said triphone out of the triphones1 list and recreate the tiedlist, it passes but complains about another triphone further down. Obviously, manually taking out all of these triphones would take me years and doesn't seem efficient, which leads me to believe I have missed something further back.
It is also important to note that all of these error-generating triphones are in my corpus as well. This error would only make sense to me if I had unseen triphones somewhere, but where? I feel I have left no stone unturned, but surely someone can give me a fresh idea of where to look.

There was an extra AU command at the end of the tree.hed file. This was causing it to try to open another file after the tiedlist. I am not sure why this causes an issue when it has already accessed the tiedlist, but there you go.
Hopefully this serves as an extra check for future HTK users.
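For reference, in the HTK tutorial the tail of tree.hed contains exactly one AU command, followed by CO and ST; the file names below are the tutorial defaults, so substitute your own list and output names:

```
AU "fulllist"
CO "tiedlist"
ST "trees"
```

A second AU line after these makes HHEd attempt to open another model list, which matches the symptom described above.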

How to use syntax highlighting in Next.js?

I am able to parse the Markdown with the help of remark and remark-html. How can I add syntax highlighting for the code elements?
I spent two whole days trying to get syntax highlighting working with the remark/rehype ecosystem, which is far more complex to use than it should be. I'm still searching, but my advice is to avoid the remark/rehype ecosystem and try another method.
Here I share a list of what I tried, to give you a subjective perspective so you don't waste too much time on things that don't work. None of them worked as I expected, and their sample code is obscure or simply broken.
https://github.com/remarkjs/remark-highlight.js
They have moved on to supporting rehype. Are you a big enough fan to move as well?
https://github.com/sergioramos/remark-prism
You will get this error: Module parse failed: Unexpected character '�'
https://github.com/torchlight-api/remark-torchlight
They state on their website: 🚨 This client is still very much a work in progress. Please open issues! 🚨 View it on GitHub at github.com/torchlight-api/remark, and that link is invalid.
https://github.com/rehypejs/rehype-highlight
It forces you to use rehype, although remark-rehype lets you convert easily enough. However, once you hit an error, debugging your code is hopeless.
The sample code uses the third-party vfile package to read a file to show that their code works, but it doesn't, and it reads an HTML file, not a Markdown file.
It's hard to grasp the concepts behind their plugins well enough to use them easily. You waste most of your valuable time trying to think the way they think.
I gave up on the remark/rehype ecosystem and stay away from it. Good luck! :)

write_csv() produces blank cells in MS Excel

This isn't a major issue, but I still thought I would ask.
I've been cleaning some data for a project at work, and there's a point in the process where I save all of the individual files I've cleaned as CSVs in long format. I noticed that with some of the files, if I open them, some cells that SHOULD have data appear blank. If I use the "Clear All Formats" option, the data appears. The files read into R just fine and this hasn't caused any issues, but I still think it's weird.
Has anyone else run into this and if so, was there a way to resolve this without going through each column? The files I'm cleaning start out with all sorts of formatting, so I'm curious if that could be the cause. I thought that a CSV doesn't save formats though, so I'm a little confused.
Again, not the biggest deal but slightly annoying and I'll get questions about it if my colleagues ever take a look at these files.
The data is proprietary, and I'm not exactly sure how I would share it, but I'm using a pretty straightforward write_csv(data, "path.csv").
I think I figured out the solution to this issue, and I wanted to share in case anyone else runs into this.
I'm using a Windows computer, which needed an update. That got me thinking, and I realized I needed to update my version of RStudio as well. I'm not sure what caused the issue, but when I re-run those files, it appears to be resolved.

Trying to convert JSON url using the fromJSON() function, but it is not working (never finishes running)

My code:
library(jsonlite)
URL = "https://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500431&RangeType=2&StartPeriod=1&StartRange=0"
the.data.file <- fromJSON(URL)
Simple, right? However, the code never stops running. No error message pops up, it just goes on forever. I thought maybe it just takes some time, but it's been going on for a long time. Maybe that's normal, and let me know if it is, but I don't think that's the case.
Self-Answer (not sure if I'm supposed to do this?):
I did more testing with the fromJSON() function and found out that it works fine with other URLs. So I wondered if the problem was specific to stats.nba.com, and when I looked it up, sure enough, I found other people asking the same thing. The solution that worked for me is downloading the file first, similar to this:
library(curl)      # provides curl_download()
library(jsonlite)

curl_download("http://stats.nba.com/stats/teamgamelog?LeagueID=00&Season=2016-17&SeasonType=Regular+Season&teamid=1610612761", "nba.json")
jsonlist <- fromJSON("nba.json")
df <- as.data.frame(jsonlist$resultSets$rowSet)
names(df) <- jsonlist$resultSets$headers[[1]]
parameters <- jsonlist$parameters
I don't want to take credit for this, because I found it in another user's answer here. I'm just putting it here in case somebody finds it useful in the future.

DTD parsing error in R

I've got a bit of a problem with an XML tree in R. I have a treebank containing the corpus, the stuff I really need. What I want is to take the XML files, parse them with the help of the DTD on my computer, and then create a corpus afterwards.
So far I've tried
xmlTreeParse(doc, options=XML::DTDLOAD)
and
xmlParse(doc)
and also
parseDTD(dtd)
but all of them throw an error. The first two still say "entity not defined", and the parsing function returns "failed to load external entity "yaddayadda.dtd"". In this question, the tree-parse function was given as an answer, but it does not work for me. The XML files have a SYSTEM "../yaddayadda.dtd" designation.
What I plan to do with this is to somehow create a VCorpus object in the tm package from the parsed text, for use in later text-mining research.
Could you help me please? Will provide further details if needed.
The parser, which you are telling to load the DTD, is seeing a reference to "../yaddayadda.dtd" and not finding it.
The most likely cause is that you have no file named "yaddayadda.dtd" on the appropriate file system, or that you have it in the wrong place; the parser should be looking for it in the directory one level up from the XML document which refers to it.
If you have it in what you think is the right location, then apparently you and the parser do not agree on what the right location is. Good luck.
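The relative SYSTEM identifier is resolved against the XML document's own directory, so you can sanity-check where the parser will look. A quick sketch in Python (the resolution rule itself is language-independent; the paths here are hypothetical stand-ins for your own):

```python
import os

# Hypothetical locations: the XML document, and the relative DTD
# reference declared in its SYSTEM identifier.
xml_path = "/data/corpus/treebank1.xml"
dtd_ref = "../yaddayadda.dtd"

# The parser resolves the SYSTEM id against the XML file's directory,
# i.e. one level up from /data/corpus/ in this example.
resolved = os.path.normpath(os.path.join(os.path.dirname(xml_path), dtd_ref))
print(resolved)  # /data/yaddayadda.dtd
```

If the file is not at the resolved path, move the DTD there (or adjust the SYSTEM reference) and the "failed to load external entity" error should go away.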

ACORD AL3 - What's the deal with "?"s

We're writing a parser for ACORD AL3. Read AL3 coming in, write AL3 going out. Nice and simple.
As of right now, it is 99% solid. The only thing driving me nuts is the use of "?"s in the ACORD AL3 standard. It appears they are used as placeholders for fields that do not have values in the message. HOWEVER, that can't be the only rule, because if it were, the AL3 I'm currently generating would look like the sample files I'm trying to match.
So if anyone here knows anything about the rules around AL3 "?"s, that would be great. I've been poring over the Data Dictionary and the other documentation from ACORD, and I'm seeing nothing to indicate which fields get it and which ones don't.
Also, if the "?"s are not required for AL3 processing to begin with, that would also be great to know, because then I could just stop worrying about the whole thing.
In the ACORD AL3 standard, from what I recall, if you use a "?" in one of the fields, this tells the receiving system to not overwrite (with blanks) the target field in the user's management system.
There may be individual elements in a group that are valid, but that the sender cannot send for some reason. The solution is to fill that data element with question marks (?????). The receiving system will recognize this and not update that field on their system.
In ACORD AL3, "?" means there is no data in that specific element, but the element still matters for maintaining the hierarchy. One caveat: Coverage groups and Transaction groups will not contain any question marks. That does not mean they are unimportant; they are very much important in AL3 files. The description above applies to data groups.
Secondly, the number of question marks in an element indicates its length.
If anyone needs more details related to AL3 data, don't hesitate to ask.
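Putting the two rules above together (the "?" fill and the length convention), generating the placeholder is trivial. A minimal sketch in Python; the function name and framing are mine for illustration, not part of the AL3 specification:

```python
def al3_placeholder(field_length: int) -> str:
    """Fill a field entirely with '?' characters: the AL3 marker for
    'no data for this element; do not overwrite the target field'.
    The number of '?'s equals the element's defined length."""
    return "?" * field_length

# A 5-character element with no data is sent as "?????".
print(al3_placeholder(5))  # ?????
```

Whether a given field takes this filler or is left blank would still depend on its group type (data group vs. Coverage/Transaction group), per the answer above.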
