I want to make a small plugin for awesome WM that shows the number of unread messages pending in Thunderbird. I want to fetch that number by querying Thunderbird's SQLite database directly. The question is: which database, table and fields should I query?
There are at least 15 databases under ~/.thunderbird/profile/, including ./global-messages-db.sqlite. In that one I tried the messageAttributes table, but without much success. I could not find development documentation describing the attributes...
Any help here?
You will find what you need in the global-messages-db.sqlite file. If you look at the messages table, you will find a column jsonAttributes. It contains a JSON object mapping attribute ids to their values. The key 58 is the read-status of a message, so if you find something like {"58": false} in this column, the message is still unread. But this database is not updated immediately when a new message is received. (It might even be updated only when you close Thunderbird -- I am not sure about that.)
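If you want to experiment with that anyway, here is a rough sketch in Python (the profile path is a placeholder, and attribute id 58 is simply taken from the description above, so verify it against your own database):

import glob, json, os, sqlite3

# Locate the database inside the (placeholder) Thunderbird profile.
db_path = glob.glob(os.path.expanduser("~/.thunderbird/*/global-messages-db.sqlite"))[0]

# Open read-only so we don't fight a running Thunderbird over locks.
conn = sqlite3.connect("file:" + db_path + "?mode=ro", uri=True)

unread = 0
for (attrs,) in conn.execute("SELECT jsonAttributes FROM messages"):
    try:
        # Attribute id 58 is described above as the read flag; false = unread.
        if json.loads(attrs).get("58") is False:
            unread += 1
    except (TypeError, ValueError):
        pass  # skip rows with missing or malformed attributes
conn.close()
print(unread)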
So, as you see, finding unread messages that way is rather the hard way to go. I would recommend instead writing a plugin that checks the server directly via IMAP or POP3.
For IMAP servers there is already an awesome plugin in the Delightful Extensions. I don't know of any POP3 plugin, and POP3 libraries for Lua also seem to be rare.
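Just to show how little the IMAP route actually needs, here it is sketched with Python's imaplib (host, mailbox and credentials are placeholders; in Lua you would need an IMAP library offering the same STATUS call):

import imaplib

with imaplib.IMAP4_SSL("imap.example.com") as imap:
    imap.login("user@example.com", "secret")
    # STATUS returns something like [b'"INBOX" (UNSEEN 3)']
    status, data = imap.status("INBOX", "(UNSEEN)")
    unseen = int(data[0].split(b"UNSEEN")[1].strip(b") "))
    print(unseen)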
My project head is telling me that it's unacceptable to connect to NCBI to retrieve sequence entries without sending along identifying information such as our institutional email. They claim this means NCBI won't instantly block our connection if we violate their user guidelines; they'll 'email' us first. We are using RStudio with the rentrez package to retrieve protein sequences from NCBI GenBank.
But I'm not certain that's necessary, or if rentrez even has a way to do that. For reference, this is the general format of our code:
sequence <- entrez_fetch(db="nuccore", id=accession_number, rettype="fasta")
The rentrez documentation says: "The NCBI will ban IPs that don't use EUtils within their user guidelines. In particular:
- Don't send more than three requests per second (rentrez enforces this limit)
- If you plan on sending a sequence of more than ~100 requests, do so outside of peak times for the US
- For large requests use the web history method (see examples for entrez_search or use entrez_post to upload IDs)"
Both entrez_search and entrez_post include an argument called web_history ("a web_history object for use in subsequent calls to NCBI"), but I'm not sure if this is what I'm looking for.
I can't find any arguments or functions etc. which allow the user to send identifying information to NCBI when connecting.
It seems like you need an API key. You can get one interactively from your NCBI account, and it needs to be specified in your .bash_profile (at least on a Mac using bash; I'm not sure of your OS or shell of choice here).
For command line usage it just needs to be set as a variable with the following line added to your profile:
export NCBI_API_KEY=<yourkeyhere>
Then as long as R is loading up that profile when it spins up, you should be fine.
EDIT:
A bit of a tangential note: you can grab files from the FTP site with utilities like curl and wget, or even with Biostrings functions like readDNAStringSet(), without an API key. If you're going to access things with EUtils and go over the allowed number of queries per second, though, you need one; if you stay under that threshold, I don't think they care that much.
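For what it's worth, the "identifying information" boils down to a few query parameters on the EUtils URL. rentrez builds these URLs for you, but spelled out by hand in Python (the key, email and accession below are placeholders/examples) the request looks like this:

from urllib.parse import urlencode
from urllib.request import urlopen

params = urlencode({
    "db": "nuccore",
    "id": "NC_000913.3",            # example accession
    "rettype": "fasta",
    "tool": "our_lab_pipeline",     # identifies the software making the call
    "email": "you@institution.edu", # placeholder contact address
    "api_key": "YOUR_NCBI_API_KEY", # placeholder key
})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?" + params
fasta = urlopen(url).read().decode()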
I want to know which tables are being read by a query.
for each Customer where CustomerID = 12345.
Eventually this customer will be found by the example above, but Progress must 'read' many tables before getting to customer 12345.
How do I know exactly which tables are read (By CustomerID), prior to getting to customer 12345?
NOTE: I do not have access to modify the code being run for this selection. Ideally I would run a separate piece of code, executed at the same time as the customer query above, to track the reads.
EDIT: More clearly -- can you track reads from a given program (.p) or process ID and output either a RECID or the primary key to a file?
I understand the information is being read off the disk and probably stored in a database buffer. So how would I get at the information in the database buffer?
You seem to be mixing up a few different things.
In a situation like your example, where you FIND a specific record in one, and only one, table, there is just a single record read. Progress will find that record by first scanning a relevant index. That might be 2 or 3 "logical reads" of the b-tree to get to the proper node. The record block and index blocks may, or may not, be read from disk; that depends on what has happened previously.
There are "Virtual System Tables" available that can tell you how many READ operations take place against a particular table or index. But they do not trace the specific ROWID or other identifying data. _TableStat and _IndexStat are aggregates for all users on the system, _UserTableStat and _UserIndexStat are specific to a particular user's activity. You do need to set the -tablerangesize and -indexrangesize parameters adequately to take advantage of these.
If you have enabled the table and index statistics then you can use a tool like ProTop - http://protop.wss.com to get insight into this activity. Or you can write your own code.
OpenEdge Auditing does not track reads. That would be prohibitively expensive.
It's probably not really a good idea but, in theory, you could write FIND triggers for the tables you are interested in. That doesn't require access to the application source but you would need a development license. It will probably kill performance to do this though - so unless this is a non-production test environment that you just want to fiddle with I wouldn't really do that.
You mention wanting to know how you got to that point. That sounds more like you might need to have a "4gl trace". One easy way to get the stack trace of a running process is to execute:
$DLC/bin/proGetStack PID (UNIX)
or
%DLC%\bin\proGetStack PID (Windows)
This command will generate a "protrace.pid" file containing a 4gl stack trace and other interesting information.
There are also more complicated ways to get that info like using PROMON and the "client statement cache" or setting various log entry types at session startup. But proGetStack is pretty convenient and requires no code or scripting changes.
Some great options from Tom above, and all of them may be relevant to you. The one he only skirts around is the logging options. I feel obliged to expand on this because I'm giving a talk on it in a couple of weeks!
Assuming you are running a modern version of Progress (even 10.2B08), you have client logging available to you. Start your session with these additional options:
-clientlog "\somefolder\somefile.txt"
-logentrytypes "QryInfo:3"
This will log all the info for all the queries in your session to the file specified above. Navigate to the point in the system where you want to analyse your query, empty the logfile and save it, then run the offending query and you will see all the detail you need.
The output tells you all sorts of useful info, including the number of reads on each table, compared with the number returned to the user. You also get the index selected.
Using Tom's advice and/or this will get you what you need.
Again I come to you guys for your expertise and advice on an issue that I am having. I was wondering if any of you would know how to detect whether a web page has been modified using VB.NET. I need to set up a task which periodically (say, once a week) scans the user-entered web pages and, if the page content has changed, fires off an email to an individual saying that it has changed (not the exact location of the change on the page). I'll be storing the HTTP status and of course the page data itself, as well as the date it was last modified. Of course this needs to be very fault tolerant, since it could be another week before the check runs again. Any help would be great. Thank you.
EDIT
A new twist on this question, sorry. I had more time to think about what we wanted. So... detecting ANY change on a web page would be kind of silly, since time-dependent elements of the page would change every so often. Instead, what I would like to do is detect the documents on the page. For instance, if there are Excel or Word docs, or PDFs, on that page that get changed. So I'd run the hash on these documents, then on some sort of schedule check whether new documents have been added or the old documents have been modified. Any suggestions on how to detect the documents embedded in the page and run the hash? Thanks again!
As I mentioned in a comment, this sort of job is what checksums (also known as hash functions) were designed for.
Your code will look something like this:
- for each webpage of interest
  - pull webpage
  - calculate checksum of contents
  - is the current checksum different from the last checksum?
    - if yes, send email
  - store the new checksum and other appropriate data
The .NET framework has a number of checksum algorithms available. The two most popular are MD5 and SHA1.
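For illustration, here is the same loop sketched in Python, with hashlib standing in for the .NET hash classes (the URLs, storage and the email call are placeholders):

import hashlib
import urllib.request

last_checksums = {}  # in practice, load/store these in your database

def check_pages(urls):
    for url in urls:
        body = urllib.request.urlopen(url).read()
        checksum = hashlib.md5(body).hexdigest()
        if last_checksums.get(url) not in (None, checksum):
            notify_change(url)  # placeholder for the email you send out
        # store the new checksum together with the HTTP status and date
        last_checksums[url] = checksum

def notify_change(url):
    print(url, "has changed")  # stand-in for the real email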
In addition to the checksum option, there are also various diff functions that achieve this and provide much more information than a changed=true/false flag. This question has more info:
How to tell when a web page has changed by x% in VB.net?
I'm thinking about an architectural way of displaying messages in our application (Flex / ASP.NET / SQL Server), mostly messages that announce, for instance, downtime.
Currently I was thinking of creating a table FlexMessage that holds the name of a message (based on that name I know where to put it in Flex) and the value (the message itself). As a result, however, someone will have to create these messages and also delete them when they are no longer valid. So, thinking further, I thought of giving each message a startdate and an enddate, i.e. an interval during which it needs to be displayed. That way, someone could log in to the management part and create a message that should be displayed from one date until another.
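Roughly what I have in mind, sketched here with Python and SQLite just for brevity (the real table would live in SQL Server, and the column names are only placeholders):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE FlexMessage (
        name      TEXT PRIMARY KEY,  -- tells Flex where to show the message
        value     TEXT NOT NULL,     -- the message itself
        startdate TEXT NOT NULL,     -- start of the display window
        enddate   TEXT NOT NULL      -- end of the display window
    )
""")
db.execute(
    "INSERT INTO FlexMessage VALUES (?, ?, ?, ?)",
    ("banner", "Planned downtime Sunday 02:00-04:00", "2011-05-01", "2011-05-08"),
)

# The service only hands out messages whose window includes today,
# so nobody has to delete expired rows by hand.
active = db.execute(
    "SELECT name, value FROM FlexMessage "
    "WHERE date('now') BETWEEN startdate AND enddate"
).fetchall()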
I could also hardcode it in the Flex application, but that would mean putting a new build of the SWF online each time something changes in a message. Not a good idea, I guess.
Is there a better way for this that I haven't thought about?
One way to do this is to place your messages in an RSS feed, then read that feed from the Flex application.
There is an example of how to do this here: http://www.artima.com/weblogs/viewpost.jsp?thread=23819
I've been playing with RSS feeds this week, and for my next trick I want to build one for our internal application log. We have a centralized database table that our myriad batch and intranet apps use for posting log messages. I want to create an RSS feed off of this table, but I'm not sure how to handle the volume- there could be hundreds of entries per day even on a normal day. An exceptional make-you-want-to-quit kind of day might see a few thousand. Any thoughts?
I would make the feed a static file (you can easily serve thousands of those), regenerated periodically. Then you have a much broader choice of how to generate it, because the generation doesn't have to finish in under a second; it can even take minutes. And users still get perfect download speed and a reasonable update rate.
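A sketch of that approach (the table, column and path names are made up; run it from cron or a scheduled task every few minutes):

import sqlite3
from xml.sax.saxutils import escape

def write_feed(db_path="applog.db", out_path="/var/www/feeds/applog.xml", limit=100):
    rows = sqlite3.connect(db_path).execute(
        "SELECT ts, app, message FROM log ORDER BY ts DESC LIMIT ?", (limit,)
    ).fetchall()

    items = "".join(
        "<item><title>%s: %s</title><description>%s</description></item>"
        % (escape(app), escape(msg), escape(ts))
        for ts, app, msg in rows
    )
    rss = (
        '<?xml version="1.0"?><rss version="2.0"><channel>'
        "<title>Application log</title>"
        "<link>http://feedserver/rss/</link>"
        "<description>Most recent log entries</description>"
        + items + "</channel></rss>"
    )
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(rss)

if __name__ == "__main__":
    write_feed()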
If you are building a system with notifications that must not be missed, then a pub-sub mechanism (using XMPP, one of the other protocols supported by ApacheMQ, or something similar) will be more suitable than a syndication mechanism. You need some measure of coupling between the system that is generating the notifications and the ones that are consuming them, to ensure that consumers don't miss notifications.
(You can do this using RSS or Atom as a transport format, but it's probably not a common use case; you'd need to vary the notifications shown based on the consumer and which notifications it has previously seen.)
I'd split up the feeds as much as possible and let users recombine them as desired. If I were doing it I'd probably think about using Django and the syndication framework.
Django's models could probably handle representing the data structure of the tables you care about.
You could have a URL pattern that catches everything, for example r'^rss/((?:[\w-]+/)+)$', and then split the captured path in the view (I can't test it right now, so adjust as needed).
That way you could use URLs like:
http://feedserver/rss/batch-file-output/
http://feedserver/rss/support-tickets/
http://feedserver/rss/batch-file-output/support-tickets/ (both of the first two combined into one)
Then in the view:
def get_batch_file_messages():
    # Grab all the recent batch file messages here.
    # Maybe cache the result and only regenerate it every so often.
    return []

# Other feed functions here.

feed_mapping = {'batch-file-output': get_batch_file_messages}

def rss(request, path):
    # "path" is whatever the URL pattern captured, e.g. "batch-file-output/support-tickets/"
    items_to_display = []
    for feed in path.rstrip('/').split('/'):
        items_to_display += feed_mapping[feed]()
    # Process items_to_display and return the feed here.
Having individual, chainable feeds means that users can subscribe to one feed at a time, or merge the ones they care about into one larger feed. Whatever's easier for them to read, they can do.
Without knowing your application, I can't offer specific advice.
That said, it's common in these sorts of systems to have a level of severity. You could have a query string parameter tacked onto the end of the URL that specifies the severity. If set to "DEBUG" you would see every event, no matter how trivial. If you set it to "FATAL" you'd only see the events that were "System Failure" in magnitude.
If there are still too many events, you may want to sub-divide your events into some sort of category system. Again, I would make this a query string parameter.
You can then have multiple RSS feeds for the various categories and severities. This should allow you to tune the level of alerts you get to an acceptable level.
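Just to make the idea concrete (the level names and item fields below are made up), the feed code would filter on those parameters before building the items, something like:

LEVELS = ["DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def filter_events(events, min_severity="DEBUG", category=None):
    # Keep events at or above the requested severity, optionally within one category.
    threshold = LEVELS.index(min_severity)
    return [e for e in events
            if LEVELS.index(e["severity"]) >= threshold
            and (category is None or e["category"] == category)]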
In this case, it's more of a manager's dashboard: how much work was put into support today, is there anything pressing in the log right now, and, when we first arrive in the morning, a measure of what went wrong with batch jobs overnight.
Okay, I decided how I'm going to handle this. I'm using the timestamp field on each row and grouping by day. It takes a little bit of SQL-fu to make it happen, since of course there's a full timestamp there and I need to be semi-intelligent about how I pick the log message to show from within each group, but it's not too bad. Further, I'm building it to let you select which application to monitor, and then show every message (max 50) from a specific day.
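Roughly the queries behind that (SQLite syntax here, and the table/column names are made up):

import sqlite3

def days_for(conn, app):
    # One feed item per day: how many entries and the most recent timestamp.
    return conn.execute(
        "SELECT date(ts) AS day, COUNT(*) AS entries, MAX(ts) AS last_seen "
        "FROM log WHERE app = ? GROUP BY date(ts) ORDER BY day DESC",
        (app,)).fetchall()

def messages_for(conn, app, day, limit=50):
    # Every message (capped at 50) for one application on one day.
    return conn.execute(
        "SELECT ts, message FROM log WHERE app = ? AND date(ts) = ? "
        "ORDER BY ts DESC LIMIT ?",
        (app, day, limit)).fetchall()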
That gets me down to something reasonable.
I'm still hoping for a good answer to the more generic question: "How do you syndicate many important messages, where missing a message could be a problem?"