I'm pondering the idea of importing a few thousand Thunderbird folders, each containing many emails of course, as a set of Gnus mailgroups in Emacs. Each mailgroup name would be derived from the folder hierarchy. Because of the quantity, the work is going to be fairly tedious, so I would like to automate this massive import if possible.
Among the available backends, nnfolder seems the most promising in this case. I presume it would be better to populate the mailgroups from within Gnus; otherwise, I would have to thoroughly understand the nnfolder format, and that might require many iterations before I really get it right. Moreover, as email continues to flow in, iterations may become difficult to organize properly without losing anything.
I guess I have to respool everything, under the constraint that the selected mailgroup is a function of the Thunderbird origin, overriding the standard Gnus selection mechanism. I did some Gnus coding in the past, but since I have not touched Emacs in a dozen years, it is all very rusty. I'm a bit lost about how to approach this task as efficiently and quickly as possible. So my question: how would you handle it? Or is there some clever Gnus hidden corner that I should explore more deeply? :-)
François
P.S. After I wrote this question, I found out that Gnus has a nice helper function for this goal. The idea is to first copy all Thunderbird folder files into the ~/Mail directory, with their contents as they are but properly renamed. Once this is done, M-x nnfolder-generate-active-file does everything at once for each copied folder: it edits the contents, leaves a ~ backup, generates NOV data, creates one mailgroup and, of course, adjusts the ~/Mail/active file.
To copy the folders underneath the ~/.thunderbird/LOGIN/Mail/Local Folders/ directory, I wrote a small Python script. It ignores all .msf files and recurses into .sbd directories. The folder path name, relative to Local Folders/, has all its .sbd/ strings turned into periods to produce the mailgroup name, which is also lower-cased, with spaces and underscores turned into dashes and other special characters handled appropriately. Non-ASCII characters, in particular, are not handled properly: nnfolder confuses UTF-8 and ISO-8859-1 here and there. The script also has to skip msgfilterrules.dat and likely drafts, junk and similar folders.
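For reference, a minimal sketch of such a script (not the one I actually ran; the name transformations and the skip list are illustrative):

    #!/usr/bin/env python3
    # Sketch: copy Thunderbird mbox folders into ~/Mail, renaming them
    # so that M-x nnfolder-generate-active-file can pick them up.
    import os
    import shutil

    SOURCE = os.path.expanduser('~/.thunderbird/LOGIN/Mail/Local Folders')
    TARGET = os.path.expanduser('~/Mail')
    SKIP = {'msgfilterrules.dat', 'Drafts', 'Junk', 'Trash', 'Templates'}

    for root, dirs, files in os.walk(SOURCE):
        # Only descend into .sbd directories (Thunderbird's sub-folder scheme).
        dirs[:] = [d for d in dirs if d.endswith('.sbd')]
        for name in files:
            if name.endswith('.msf') or name in SKIP:
                continue
            rel = os.path.relpath(os.path.join(root, name), SOURCE)
            # foo.sbd/bar -> foo.bar, lower-cased, spaces/underscores -> dashes
            group = (rel.replace('.sbd' + os.sep, '.').lower()
                        .replace(' ', '-').replace('_', '-'))
            shutil.copy2(os.path.join(root, name), os.path.join(TARGET, group))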
I noticed two details requiring attention:
Thunderbird itself can be used to compact folders before copying them; otherwise one might unwittingly recover messages which were already deleted.
(setq nnmail-use-long-file-names t) is needed in ~/.emacs prior to the whole operation.
The batch transformation aborted, saying it was not able to decrypt one of the messages. I moved the offending folder out of the way, and the lengthy operation then succeeded.
I am using AzureML pipelines, where the interface between pipeline steps is through a folder or file.
When I am passing data into the pipeline, I point directly to a single file. No problem at all. Very useful when passing in configuration files which all live in the same folder on my local computer.
However, when passing data between different steps of the pipeline, I can't provide the next step with a file path. All the steps get is a path to some folder that they can write to. Then that same path is passed to the next step.
The problem comes when the following step is then supposed to load something from the folder.
Which filename is it supposed to try to load?
Approaches I've considered:
Use a standardized filename for everything. The problem is that I want to be able to run the steps locally too, independent of any pipeline, and this makes for a very poor UX for that use case.
Check if the path points to a file; if it doesn't, check all the files in the folder. If there is only one file, use it; otherwise throw an exception (see the sketch after this list). This is maybe the most elegant solution from a UX perspective, but it sounds over-engineered to me. We also don't structurally share any code between the steps at the moment, so either we will have repetition or we will need to find some way to share code, which is non-trivial.
Allow custom filenames to be passed in optionally, otherwise use a standard filename. This helps with the UX, but often the filenames are supposed to be defined by the configuration files being passed in, so while we could do some bash scripting to get the filename into the command, it feels like a sub-par solution.
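For what it's worth, the folder-or-file check from the second approach is not much code. A minimal sketch (resolve_input is a hypothetical helper of mine, not part of the AzureML SDK):

    from pathlib import Path
    from typing import Optional

    def resolve_input(path: str, filename: Optional[str] = None) -> Path:
        """Resolve a step input that may be a direct file path or a folder."""
        p = Path(path)
        if p.is_file():
            return p
        if filename is not None:
            return p / filename          # optional explicit override
        candidates = [f for f in p.iterdir() if f.is_file()]
        if len(candidates) == 1:
            return candidates[0]         # the "single file in folder" rule
        raise ValueError(f"expected exactly one file in {p}, "
                         f"found {len(candidates)}")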
Ultimately it feels like none of the solutions I have come up with are any good.
It feels like we are making things more difficult for ourselves in the future if we assume some default filename. For example, we work with multiple file types, so a default name would have to omit the extension.
But any way to do it without default filenames would also cause maintenance headaches down the line, or incur a substantial upfront cost.
The question is: am I missing something? Any potential traps, better solutions, etc. would be appreciated. It definitely feels like I am somewhat under- and/or overthinking this.
One of the ancillary tools bundled with quilt is guards, which processes a list of guards and a configuration file matching guards and files, and outputs a list of files whose guard specifications match the provided guards.
However, I can't figure out how they're supposed to fit together: quilt(1) doesn't show any way to invoke a command to generate series files, I didn't find examples in the man pages or the working copy, and the internets are less than helpful (all the hits talk about bedding).
I feel like guards has to be invoked manually whenever its "dependencies" change, overwriting the series file; is that the case? If so, how is data fed back the other way around, e.g. when adding a new patch to the series, does it have to be manually synchronised to the guards file? Concretely, the workflow I imagine is something like the sketch below.
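If I read the man page correctly, guards takes the active symbols as command-line arguments, reads a guarded patch list on stdin, and writes the matching subset to stdout; the file names here are my assumption. So you would keep a guarded list, say series.conf:

    +debug      patches/enable-debug.patch
    -legacy     patches/modern-only.patch
    patches/always-applied.patch

and regenerate the series file by hand whenever either side changes:

    guards debug < series.conf > patches/series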
Background: a few years back I used mq quite a bit, but it integrates guards natively so the synchronisation back and forth is not an issue at all.
Why is some server-side data still stored in DBC files rather than in the SQL database? In particular: spells (spells.dbc). What for?
We have a lot of bugs in spells, and it's very hard to understand what's wrong with a spell; it's even harder to find it in the spell data...
Spells, talents, achievements, etc. are mostly found in DBC files because that is the way Blizzard did it back in the day. It's true that in 2019 this is a pretty outdated way to work. Databases are getting stronger and more versatile, and hard-coded data is proving hard to work with. Hell, DBCs aren't really that heavy anyway, and the only reason we haven't made this change yet is that it's a task that takes a bit of time and is monotonous to do.
We are aware that TrinityCore has already made this change, but they have far more contributors than we do, if that serves as an excuse!
Nonetheless, this is already on our to-do list; check the issue tracker at the main repository.
While it's true that we can't really edit the DBC files themselves (we would lose all our progress when they are re-extracted, or if the files are lost), we can modify spells in a C++ file called SpellMgr.
There we have a function called SpellMgr::LoadDbcDataCorrections().
The main problem with this change is that we have to modify the core to support it, and the function above contains a lot of corrections. It would need intense testing to make sure nothing is screwed up in the process.
Here, by altering bits, you can remove or add certain properties to the desired spells instead of touching the hard-coded DBC files.
If you want an example: in this link, I have changed an Archimonde spell to have no cast time.
NOTE:
In this line, the comment about damage can be misleading, but that's because I made a mistake; I haven't finished this pull request yet as of 18/04/2019.
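For illustration, corrections of this kind follow a pattern roughly like the sketch below. The spell ID is made up and the field names are from the 3.3.5 SpellEntry layout, so treat the details as indicative rather than exact:

    // Inside something like SpellMgr::LoadDbcDataCorrections():
    for (uint32 i = 0; i < sSpellStore.GetNumRows(); ++i)
    {
        SpellEntry* spellInfo = (SpellEntry*)sSpellStore.LookupEntry(i);
        if (!spellInfo)
            continue;

        switch (spellInfo->Id)
        {
            case 12345:                          // made-up spell ID
                spellInfo->CastingTimeIndex = 1; // index 1 = instant cast
                break;
        }
    }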
The work has been started, notably by Kaev. I think at least 3 DBCs are now useless server-side (but probably still needed client-side; they are called DataBaseClient for a reason), such as item.dbc.
Also, the original philosophy (for ALL cores, not just AC) was that we would not touch the DBCs because we don't do custom modifications, so there was no interest in having them server-side.
But we wanted to change this and have started to make them available directly in the DB; if you wish to help with that, it would be nice!
Why?
Because when emulation started, DBC fields were 90% unknown. So developers created a parser for them that required only a few code changes to support new fields as soon as their functionality was discovered.
Now that we've discovered 90% of the required DBC fields and have also created some great DBC<->SQL conversion tools, it's just a matter of "effort".
SQL conversion is useful to avoid using client data on the server (you can overwrite them entirely if you don't want to go against the EULA) or just to extend/customize them.
Here is the issue about DBC->SQL conversion: https://github.com/azerothcore/azerothcore-wotlk/issues/584
Over the past few weeks I have been getting into Ada, for various reasons; my personal reasons for using Ada are no doubt out of scope for this question.
The other day I started using the gprbuild command that comes with the Windows version of GNAT, in order to get the benefits of a system for managing my applications in a project-oriented manner; that is, being able to define certain attributes on a per-project basis, rather than setting up the compile phase manually myself.
Currently, when naming my files, their names are based on what seems to be the standard for gprbuild, although I could very much be wrong: periods in the package hierarchy become a - in the file name, while underscores remain an _. As such, a package by the name App.Test.File_Utils would have the file name app-test-file_utils, with .ads and .adb extensions accordingly.
In the .gpr project file I have specified:
for Source_Dirs use ("app/src/**");
so that I am allowed to use multiple directories for storing my files, rather than needing to have them all in the same directory.
The Problem
The problem that arises, however, is that file names tend to get very long. As I am already putting the files in a directory based on the package name they contain, I was wondering if there is a way to make the compiler understand that the package name can be derived from the file's directory name.
That is, rather than having to name the file for App.Test.File_Utils app-test-file_utils, I would like it to reside under the app/test directory with the name file_utils.
Is this doable, or will I be stuck with the horrors of eventually having to name my files along the lines of app-test-some-then-one-has-more_files-another_package-knew-test-more-important_package.ads? That is, assuming I have not missed something about how an Ada application should actually be structured.
What I have tried
I tried looking for answers in the package Naming configuration of the gpr files in the documentation, but to no avail. Furthermore, I have been browsing the web for information, but decided it might be better to get help through Stack Overflow, so that other people who might struggle with this problem in the future (granted it is a problem in the first place) might also get help.
Any pointers in the right direction would be very helpful!
In the top-secret GNAT documentation there is a description of how to use non-default file names. It's a great deal of effort. You will probably give up, use the default names, and put them all in a single directory.
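For reference, those non-default names are declared as per-unit exceptions in the project's Naming package, roughly like this (using the unit from the question; note that you need one clause per compilation unit, which is why it doesn't scale):

    package Naming is
       for Spec ("App.Test.File_Utils") use "file_utils.ads";
       for Body ("App.Test.File_Utils") use "file_utils.adb";
       --  ...and so on, a pair of clauses for every package.
    end Naming;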
You can also simplify much of the effort by using GPS and letting it build your project file as you add files to your source directories.
I frequently search for a token, perhaps a function name, throughout my codebase. My traditional method would be to grep for the term itself. However, the codebase is so large that I can't do this efficiently (it takes minutes).
Is there a way to do this efficiently?
ack (which ignores irrelevant files, such as revision control files) is still too slow. ctags only finds declarations, which isn't what I need. I thought something like strigi might work, but I haven't tried it.
I'm on linux, using vim and the GNU toolchain, on a largely C++ codebase.
Use find and fgrep. A decent find will restrict the set to files matching some set of criteria, and fgrep will search within each file. Note that fgrep will be faster than grep.
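For example, something along these lines (the pattern and extensions are illustrative):

    # restrict to C++ sources, skip the VCS directory, fixed-string match
    find . -name .git -prune -o \( -name '*.cpp' -o -name '*.h' \) -print0 \
        | xargs -0 fgrep -n 'myFunction'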
You could also use an IDE's code-indexing features. For example, Eclipse will let you jump to a function's declaration, and it builds the relevant index in the background while you work. Even Vim 7 has some such features, although I don't remember exactly what it does beyond code completion.