SWI-Prolog 8.0.2: gzipped HTTP

I am trying to get a piece of code working that opens an HTTP connection.
However, a web page may be transferred either as plain text or gzipped.
So the code pragmatically tries to open it as plain text first and, if that fails with an exception, tries again as if it were gzip-encoded.
URL is the only variable that needs to be ground.
Try it with URL = 'http://releases.llvm.org/6.0.0/tools/clang/docs/ClangCommandLineReference.html', for instance.
user::catch(
(
user::http_open(URL, DataStream, []),
user::load_html(stream(DataStream), Terms, []),
user::close(DataStream)
),
_
,
(
user::open_any(URL, read, GZipDataStream, CloseIt, [encoding(gzip), string(atom)]),
/*user::http:encoding_filter(gzip, DataStream, GZipDataStream),*/
user::load_html(stream(GZipDataStream), Terms, []),
user::close_any(CloseIt)
)
)
Unfortunately, the recovery part of catch doesn't work.
Any suggestions, please?

The user:: prefixes in the goals suggest that the code you posted is a fragment of Logtalk code. If so, it misuses Logtalk and creates a dependency on the SWI-Prolog autoloading mechanism. The code can be rewritten for clarity and resilience. Doing that and fixing the bug in it (library(zlib) must be loaded to make the http:encoding_filter/3 filter available) results in the following solution:
:- use_module(library(http/http_open), []).
:- use_module(library(sgml), []).
:- use_module(library(iostream), []).
:- use_module(library(zlib), []).
:- object(html).
:- public(get_url/2).
% override ambiguous meta-predicate template
:- meta_predicate(sgml:load_html(*,*,*)).
get_url(URL, Terms) :-
catch(
setup_call_cleanup(
http:http_open(URL, DataStream, []),
sgml:load_html(stream(DataStream), Terms, []),
close(DataStream)
),
_,
setup_call_cleanup(
iostream:open_any(URL, read, DataStream, CloseIt, [string(atom)]),
sgml:load_html(stream(DataStream), Terms, []),
iostream:close_any(CloseIt)
)
).
:- end_object.
The setup_call_cleanup/3 calls ensure that the opened streams are closed in case of error.
Assuming the object above is saved in a html.lgt file, the following sample call shows it working for the URL you posted:
?- {html}.
...
% (0 warnings)
true.
?- html::get_url('http://releases.llvm.org/6.0.0/tools/clang/docs/ClangCommandLineReference.html', Terms).
Terms = [element(html, [xmlns='http://www.w3.org/1999/xhtml'], [element(head, [], [element(meta, ['http-equiv'='Content-Type', content='text/html; charset=utf-8'], []), element(title, [], ['Clang command line argument reference — Clang 6 documentation']), element(link, [... = ...|...], []), element(link, [...|...], []), element(..., ..., ...)|...]), element(body, [role=document], [' ', element(div, [... = ...|...], [element(..., ..., ...)|...]), '\n ', element(..., ..., ...)|...])])].

Related

Issues with reading file in erlang

I am trying to read from and write to a file.
When writing to the file, I need to check whether a particular index already exists in it; if it does, I don't write and throw an error instead.
The data in file will look like this:
{1,{data,dcA,1}}.
{2, {data, dcA, 2}}.
{3,{data,dcA,3}}.
I added the dot at the end of each line because file:consult/1 requires it.
Each entry has this format:
{Index, {Data, Node, Index}}
When I have to add a new entry, I check against this Index.
Here's what I have tried so far - https://pastebin.com/apnWLk45
And I run it like this:
193> {ok, P9} = poc:start(test1, self()).
{ok,<0.2863.0>}
194> poc:add(P9, Node, {6, data}).
In poc:add/3, P9 is the process id returned by file:open.
Node is a variable that I defined earlier in the shell as dcA.
The third argument is the data, which is in this format: {Index, data}
file:consult/1 takes the filename as its parameter, but at that point I only have the process id. So I get the name from
file:pid2name(_Server).
This runs perfectly when I run it for the first time.
When I run this again - poc:add(P9, Node, {6, data2}), I get an error in this line file:pid2name(_Server).
exception error: no match of right hand side value undefined
How can I solve this issue?
I am new to Erlang; I started learning it only a week ago.
I am trying to read and write into a file. While writing into the
file, I need to check if a particular index exist in file then I don't
write and throw error.
A DETS table can easily do what you want:
-module(my).
-compile(export_all).
open_table() ->
dets:open_file(my_data, [{type, set}, {file, "./my_data.dets"}]).
close_table() ->
dets:close(my_data).
clear_table() ->
dets:delete_all_objects(my_data).
insert({Key, _Rest}=Data) ->
case dets:member(my_data, Key) of
true -> throw(index_already_exists);
false -> dets:insert(my_data, Data)
end.
all_items() ->
dets:match(my_data, '$1').
In the shell:
~/erlang_programs$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.2 (abort with ^G)
1> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:open_table().
{ok,my_data}
3> my:clear_table().
ok
4> my:all_items().
[]
5> my:insert({1, {data, a, b}}).
ok
6> my:insert({2, {data, c, d}}).
ok
7> my:insert({3, {data, e, f}}).
ok
8> my:all_items().
[[{1,{data,a,b}}],[{2,{data,c,d}}],[{3,{data,e,f}}]]
9> my:insert({1, {data, e, f}}).
** exception throw: index_already_exists
in function my:insert/1 (my.erl, line 15)
When I run this again - poc:add(P9, Node, {6, data2}), I get an error
in this line file:pid2name(_Server):
exception error: no match of right hand side value undefined
When a process opens a file, it becomes linked to a process that handles the file I/O, which means that if the process that opens the file terminates abnormally, the I/O process will also terminate. Here is an example:
-module(my).
-compile(export_all).
start() ->
{ok, Pid} = file:open('data.txt', [read, write]),
spawn(my, add, [Pid, x, y]),
exit("bye").
add(Pid, _X, _Y) ->
timer:sleep(1000), %Let start() process terminate.
{ok, Fname} = file:pid2name(Pid),
io:format("~s~n", [Fname]).
In the shell:
1> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:start().
** exception exit: "bye"
in function my:start/0 (my.erl, line 7)
3>
=ERROR REPORT==== 25-Jun-2018::13:28:48 ===
Error in process <0.72.0> with exit value:
{{badmatch,undefined},[{my,add,3,[{file,"my.erl"},{line,12}]}]}
According to the pid2name() docs:
pid2name(Pid) -> {ok, Filename} | undefined
the function can return undefined, which is what the error message is saying happened.

Function clause error Erlang

I am trying to understand process communication in Erlang. Here I have a master process and five friend processes. If a friend sends a message to any of the other friends, the recipient has to reply. The master should be made aware of all of this. I am pasting the code below.
-module(prog).
-import(lists,[append/2,concat/1]).
-import(maps,[from_lists/1,find/2,get/2,update/3]).
-import(string,[equal/2]).
-import(file,[consult/1]).
-export([create_process/1,friends/4, master/1, main/0,prnt/1]).
%% CREATE PROCESS
create_process([])->ok;
create_process([H|T])->
{A,B} = H,
Pid = spawn(prog,friends,[B,self(),0,A]),
register(A,Pid),
create_process(T).
%% FRIENDS PROCESS
friends(Msg, M_pid, State, Self_name)->
S = lists:concat([Self_name," state =",State,"\n"]),
io:fwrite(S),
if
State == 0 ->
timer:sleep(500),
io:fwrite("~p~n",[Self_name]),
lists:foreach(fun(X) -> whereis(X)!{Self_name,"intro",self()} end, Msg),
friends(Msg, M_pid, State + 1, Self_name);
State > 0 ->
receive
{Process_name, Process_msg, Process_id} ->
I = equal(Process_msg,"intro"),
R = equal(Process_msg,"reply"),
XxX = lists:concat([Self_name," recieved ",Process_msg," from ",Process_name,"\n"]),
io:fwrite(XxX),
if
I == true ->
io:fwrite("~p~n",[whereis(Process_name)]),
M_pid!{lists:concat([Self_name," received intro message from ", Process_name , "[",Process_id,"]"]), self()},
io:fwrite(I),
whereis(Process_name)!{Self_name, "reply",self()},
friends(Msg, M_pid, State + 1, Self_name);
R == true ->
M_pid!{lists:concat([Self_name," received reply message from ", Process_name , "[",Process_id,"]"]), self()},
io:fwrite(R),
friends(Msg, M_pid, State + 1, Self_name)
end
after
1000->
io:fwrite(lists:concat([Self_name," has received no calls for 1 second, ending..."]))
end
end.
master(State)->
receive
{Process_message, Process_id} ->
io:fwrite(Process_message),
master(State+1)
after
2000->
ok
end.
main() ->
B = [{john, [jill,joe,bob]},
{jill, [bob,joe,bob]},
{sue, [jill,jill,jill,bob,jill]},
{bob, [john]},
{joe, [sue]}],
create_process(B),
io:fwrite("~p~n",[whereis(sue)]),
master(0).
I think this line in the friends() function,
M_pid!{lists:concat([Self_name," received intro message from ", Process_name , "[",Process_id,"]"]), self()}
is the cause of the error, but I cannot understand why. M_pid is known, and I am concatenating all the info and sending it to the master, so I am confused about why it isn't working.
The error I am getting is as follows:
Error in process <0.55.0> with exit value: {function_clause,[{lists,thing_to_list,
[<0.54.0>],
[{file,"lists.erl"},{line,603}]},
{lists,flatmap,2,[{file,"lists.erl"},{line,1250}]},
{lists,flatmap,2,[{file,"lists.erl"},{line,1250}]},
{prog,friends,4,[{file,"prog.erl"},{line,45}]}]}
I don't know what is causing the error. Sorry for asking noob questions and thanks for your help.
An example of what Dogbert discovered:
-module(my).
-compile(export_all).
go() ->
Pid = spawn(my, nothing, []),
lists:concat(["hello", Pid]).
nothing() -> nothing.
In the shell:
2> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
3> my:go().
** exception error: no function clause matching
lists:thing_to_list(<0.75.0>) (lists.erl, line 603)
in function lists:flatmap/2 (lists.erl, line 1250)
in call from lists:flatmap/2 (lists.erl, line 1250)
4>
But:
-module(my).
-compile(export_all).
go() ->
Pid = spawn(my, nothing, []),
lists:concat(["hello", pid_to_list(Pid)]).
nothing() -> nothing.
In the shell:
4> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
5> my:go().
"hello<0.83.0>"
From the erl docs:
concat(Things) -> string()
Things = [Thing]
Thing = atom() | integer() | float() | string()
The list that you feed concat() must contain either atoms, integers, floats, or strings. A pid is neither an atom, integer, float, nor string, so a pid cannot be used with concat(). However, pid_to_list() returns a string:
pid_to_list(Pid) -> string()
Pid = pid()
As you can see, a pid has its own type: pid().
I ran your code.
Where you went wrong was passing Process_id (which is of type pid()) to lists:concat/1.
Let us try to understand this error:
{function_clause,[{lists,thing_to_list,
[<0.84.0>],
[{file,"lists.erl"},{line,603}]},
{lists,flatmap,2,[{file,"lists.erl"},{line,1250}]},
{lists,flatmap,2,[{file,"lists.erl"},{line,1250}]},
{prog,friends,4,[{file,"prog.erl"},{line,39}]}]}
It states that the function lists:thing_to_list/1 has no clause (see the word function_clause in the error log) that accepts an argument of type pid(), denoted here by [<0.84.0>].
Strings are represented as lists in Erlang, which is why we use lists:concat/1.
As @7stud pointed out, these are the valid types that can be passed to lists:concat/1 as per the documentation:
atom() | integer() | float() | string()
There are 2 occurrences of the following line. Fix them and you are good to go:
Incorrect Code:
M_pid!{lists:concat([Self_name," received intro message from ", Process_name , "[",Process_id,"]"]), self()},
Corrected Code:
M_pid!{lists:concat([Self_name," received intro message from ", Process_name , "[",pid_to_list(Process_id),"]"]), self()},
Notice the use of the function erlang:pid_to_list/1. As per the documentation, it accepts a pid() and returns its textual representation as a string().

Ejabberd: error in simple module to handle offline messages

I have an Ejabberd 17.01 installation where I need to push a notification in case a recipient is offline. This seems to be a common task, and solutions using a customized Ejabberd module can be found everywhere. However, I just can't get it running. First, here's my script:
-module(mod_offline_push).
-behaviour(gen_mod).
-export([start/2, stop/1]).
-export([push_message/3]).
-include("ejabberd.hrl").
-include("logger.hrl").
-include("jlib.hrl").
start(Host, _Opts) ->
?INFO_MSG("mod_offline_push loading", []),
ejabberd_hooks:add(offline_message_hook, Host, ?MODULE, push_message, 10),
ok.
stop(Host) ->
?INFO_MSG("mod_offline_push stopping", []),
ejabberd_hooks:delete(offline_message_hook, Host, ?MODULE, push_message, 10),
ok.
push_message(From, To, Packet) ->
?INFO_MSG("mod_offline_push -> push_message", [To]),
Type = fxml:get_tag_attr_s(<<"type">>, Packet), % Supposedly since 16.04
%Type = xml:get_tag_attr_s(<<"type">>, Packet), % Supposedly since 13.XX
%Type = xml:get_tag_attr_s("type", Packet),
%Type = xml:get_tag_attr_s(list_to_binary("type"), Packet),
?INFO_MSG("mod_offline_push -> push_message", []),
ok.
The problem is the Type = ... line in the function push_message; without that line the last info message is logged (so the hook definitely works). When browsing online, I can find all kinds of function calls to extract elements from Packet. As far as I understand, the API changed over time with new releases. But none of them works; all variants lead to some kind of error. The current one returns:
2017-01-25 20:38:08.701 [error] <0.21678.0>@ejabberd_hooks:run1:332 {function_clause,[{fxml,get_tag_attr_s,[<<"type">>,{message,<<>>,normal,<<>>,{jid,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>},{jid,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>},[],[{text,<<>>,<<"sfsdfsdf">>}],undefined,[],#{}}],[{file,"src/fxml.erl"},{line,169}]},{mod_offline_push,push_message,3,[{file,"mod_offline_push.erl"},{line,33}]},{ejabberd_hooks,safe_apply,3,[{file,"src/ejabberd_hooks.erl"},{line,382}]},{ejabberd_hooks,run1,3,[{file,"src/ejabberd_hooks.erl"},{line,329}]},{ejabberd_sm,route,3,[{file,"src/ejabberd_sm.erl"},{line,126}]},{ejabberd_local,route,3,[{file,"src/ejabberd_local.erl"},{line,110}]},{ejabberd_router,route,3,[{file,"src/ejabberd_router.erl"},{line,87}]},{ejabberd_c2s,check_privacy_route,5,[{file,"src/ejabberd_c2s.erl"},{line,1886}]}]}
running hook: {offline_message_hook,[{jid,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>},{jid,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>},{message,<<>>,normal,<<>>,{jid,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>},{jid,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>},[],[{text,<<>>,<<"sfsdfsdf">>}],undefined,[],#{}}]}
I'm new to Ejabberd and Erlang, so I cannot really interpret the error, but line 33, as mentioned in {mod_offline_push,push_message,3,[{file,"mod_offline_push.erl"}, {line,33}]}, is definitely the line calling get_tag_attr_s.
UPDATE 2017/01/27: Since this cost me a lot of headache -- and I'm still not perfectly happy -- I post here my current working module in the hope that it might help others. My setup is Ejabberd 17.01 running on Ubuntu 16.04. Most of the stuff I tried and failed with seems to be for older versions of Ejabberd:
-module(mod_fcm_fork).
-behaviour(gen_mod).
%% public methods for this module
-export([start/2, stop/1]).
-export([push_notification/3]).
%% included for writing to ejabberd log file
-include("ejabberd.hrl").
-include("logger.hrl").
-include("xmpp_codec.hrl").
%% Copied this record definition from jlib.hrl
%% Including "xmpp_codec.hrl" and "jlib.hrl" resulted in errors ("XYZ already defined")
-record(jid, {user = <<"">> :: binary(),
server = <<"">> :: binary(),
resource = <<"">> :: binary(),
luser = <<"">> :: binary(),
lserver = <<"">> :: binary(),
lresource = <<"">> :: binary()}).
start(Host, _Opts) ->
?INFO_MSG("mod_fcm_fork loading", []),
% Providing the most basic API to the clients and servers that are part of the Inets application
inets:start(),
% Add hook to handle message to user who are offline
ejabberd_hooks:add(offline_message_hook, Host, ?MODULE, push_notification, 10),
ok.
stop(Host) ->
?INFO_MSG("mod_fcm_fork stopping", []),
ejabberd_hooks:delete(offline_message_hook, Host, ?MODULE, push_notification, 10),
ok.
push_notification(From, To, Packet) ->
% Generate JID of sender and receiver
FromJid = lists:concat([binary_to_list(From#jid.user), "@", binary_to_list(From#jid.server), "/", binary_to_list(From#jid.resource)]),
ToJid = lists:concat([binary_to_list(To#jid.user), "@", binary_to_list(To#jid.server), "/", binary_to_list(To#jid.resource)]),
% Get message body
MessageBody = Packet#message.body,
% Check if MessageBody is not empty
case MessageBody/=[] of
true ->
% Get first element (no idea when this list can have more elements)
[First | _ ] = MessageBody,
% Get message data and convert to string
MessageBodyText = binary_to_list(First#text.data),
send_post_request(FromJid, ToJid, MessageBodyText);
false ->
?INFO_MSG("mod_fcm_fork -> push_notification: MessageBody is empty",[])
end,
ok.
send_post_request(FromJid, ToJid, MessageBodyText) ->
%?INFO_MSG("mod_fcm_fork -> send_post_request -> MessageBodyText = ~p", [Demo]),
Method = post,
PostURL = gen_mod:get_module_opt(global, ?MODULE, post_url,fun(X) -> X end, all),
% Add data as query string. Not nice, query body would be preferable
% Problem: message body itself can be in a JSON string, and I couldn't figure out the correct encoding.
URL = lists:concat([binary_to_list(PostURL), "?", "fromjid=", FromJid,"&tojid=", ToJid,"&body=", edoc_lib:escape_uri(MessageBodyText)]),
Header = [],
ContentType = "application/json",
Body = [],
?INFO_MSG("mod_fcm_fork -> send_post_request -> URL = ~p", [URL]),
% ADD SSL CONFIG BELOW!
%HTTPOptions = [{ssl,[{versions, ['tlsv1.2']}]}],
HTTPOptions = [],
Options = [],
httpc:request(Method, {URL, Header, ContentType, Body}, HTTPOptions, Options),
ok.
Actually, it fails on the second argument, Packet, that you pass to fxml:get_tag_attr_s in the push_message function:
{message,<<>>,normal,<<>>,
{jid,<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>,
<<"homer">>,<<"xxx.xxx.xxx.xxx">>,<<"conference">>},
{jid,<<"carl">>,<<"xxx.xxx.xxx.xxx">>,<<>>,<<"carl">>,
<<"xxx.xxx.xxx.xxx">>,<<>>},
[],
[{text,<<>>,<<"sfsdfsdf">>}],
undefined,[],#{}}
because it is not an xmlel.
It looks like it is a "message" record, defined in tools/xmpp_codec.hrl,
with id <<>> and type 'normal'.
xmpp_codec.hrl
-record(message, {id :: binary(),
type = normal :: 'chat' | 'error' | 'groupchat' | 'headline' | 'normal',
lang :: binary(),
from :: any(),
to :: any(),
subject = [] :: [#text{}],
body = [] :: [#text{}],
thread :: binary(),
error :: #error{},
sub_els = [] :: [any()]}).
Include this file and simply use
Type = Packet#message.type
or, if you expect a binary value,
Type = erlang:atom_to_binary(Packet#message.type, utf8)
The newest way to do that seems to be with xmpp:get_type/1:
Type = xmpp:get_type(Packet),
It returns an atom, in this case normal.

Create a portal_user_catalog and have it used (Plone)

I'm creating a fork of my Plone site (which has not been forked for a long time). This site has a special catalog object for user profiles (a special Archetypes-based object type) which is called portal_user_catalog:
$ bin/instance debug
>>> portal = app.Plone
>>> print [d for d in portal.objectMap() if d['meta_type'] == 'Plone Catalog Tool']
[{'meta_type': 'Plone Catalog Tool', 'id': 'portal_catalog'},
{'meta_type': 'Plone Catalog Tool', 'id': 'portal_user_catalog'}]
This looks reasonable because the user profiles don't have most of the indexes of the "normal" objects, but have a small set of own indexes.
Since I found no way to create this object from scratch, I exported it from the old site (as portal_user_catalog.zexp) and imported it into the new site. This seemed to work, but I can't add objects to the imported catalog, not even by explicitly calling the catalog_object method. Instead, the user profiles are added to the standard portal_catalog.
Now I found a module in my product which seems to serve the purpose (Products/myproduct/exportimport/catalog.py):
"""Catalog tool setup handlers.
$Id: catalog.py 77004 2007-06-24 08:57:54Z yuppie $
"""
from Products.GenericSetup.utils import exportObjects
from Products.GenericSetup.utils import importObjects
from Products.CMFCore.utils import getToolByName
from zope.component import queryMultiAdapter
from Products.GenericSetup.interfaces import IBody
def importCatalogTool(context):
"""Import catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog')
parent_path=''
if obj and not obj():
importer = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
__traceback_info__ = path
print [importer]
if importer:
print importer.name
if importer.name:
path = '%s%s' % (parent_path, 'usercatalog')
print path
filename = '%s%s' % (path, importer.suffix)
print filename
body = context.readDataFile(filename)
if body is not None:
importer.filename = filename # for error reporting
importer.body = body
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
importObjects(sub, path+'/', context)
def exportCatalogTool(context):
"""Export catalog tool.
"""
site = context.getSite()
obj = getToolByName(site, 'portal_user_catalog', None)
if obj is None:
logger = context.getLogger('catalog')
logger.info('Nothing to export.')
return
parent_path=''
exporter = queryMultiAdapter((obj, context), IBody)
path = '%s%s' % (parent_path, obj.getId().replace(' ', '_'))
if exporter:
if exporter.name:
path = '%s%s' % (parent_path, 'usercatalog')
filename = '%s%s' % (path, exporter.suffix)
body = exporter.body
if body is not None:
context.writeDataFile(filename, body, exporter.mime_type)
if getattr(obj, 'objectValues', False):
for sub in obj.objectValues():
exportObjects(sub, path+'/', context)
I tried to use it, but I have no idea how it is supposed to be done;
I can't call it TTW (should I try to publish the methods?!).
I tried it in a debug session:
$ bin/instance debug
>>> portal = app.Plone
>>> from Products.myproduct.exportimport.catalog import exportCatalogTool
>>> exportCatalogTool(portal)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File ".../Products/myproduct/exportimport/catalog.py", line 58, in exportCatalogTool
site = context.getSite()
AttributeError: getSite
So, if this is the way to go, it looks like I need a "real" context.
Update: To get this context, I tried an External Method:
# -*- coding: utf-8 -*-
from Products.myproduct.exportimport.catalog import exportCatalogTool
from pdb import set_trace
def p(dt, dd):
print '%-16s%s' % (dt+':', dd)
def main(self):
"""
Export the portal_user_catalog
"""
g = globals()
print '#' * 79
for a in ('__package__', '__module__'):
if a in g:
p(a, g[a])
p('self', self)
set_trace()
exportCatalogTool(self)
However, when I called it, I got the same <PloneSite at /Plone> object as the argument to the main function, which didn't have the getSite attribute. Perhaps my site doesn't call such External Methods correctly?
Or would I need to mention this module somehow in my configure.zcml, and if so, how? I searched my directory tree (especially below Products/myproduct/profiles) for exportimport, the module name, and several other strings, but I couldn't find anything; perhaps there was an integration once that has since been broken ...
So how do I make this portal_user_catalog work?
Thank you!
Update: Another debug session suggests that the source of the problem is related to transactions:
>>> portal = app.Plone
>>> puc = portal.portal_user_catalog
>>> puc._catalog()
[]
>>> profiles_folder = portal.some_folder_with_profiles
>>> for o in profiles_folder.objectValues():
... puc.catalog_object(o)
...
>>> puc._catalog()
[<Products.ZCatalog.Catalog.mybrains object at 0x69ff8d8>, ...]
This population of the portal_user_catalog doesn't persist; after termination of the debug session and starting fg, the brains are gone.
It looks like the problem was indeed related to transactions.
I had
import transaction
...
class Browser(BrowserView):
...
def processNewUser(self):
....
transaction.commit()
before, but apparently this was not good enough (and/or perhaps not done correctly).
Now I start the transaction explicitly with transaction.begin(), save intermediate results with transaction.savepoint(), abort the transaction explicitly with transaction.abort() in case of errors (try / except), and have exactly one transaction.commit() at the end, in the case of success. Everything seems to work.
Of course, Plone still doesn't take this non-standard catalog into account; when I "clear and rebuild" it, it is empty afterwards. But for my application it works well enough.
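The explicit transaction handling described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from my product: run_in_transaction and its steps argument are hypothetical names, and txn stands for any object that offers begin(), savepoint(), abort() and commit(), as the transaction module does.

```python
def run_in_transaction(txn, steps):
    """Run all steps inside one explicitly managed transaction.

    txn   -- assumed to behave like the `transaction` module
             (begin/savepoint/abort/commit)
    steps -- hypothetical list of callables doing the actual work
    """
    txn.begin()                  # start the transaction explicitly
    try:
        for step in steps:
            step()
            txn.savepoint()      # save intermediate results
    except Exception:
        txn.abort()              # roll everything back on any error
        raise
    txn.commit()                 # exactly one commit, only on success
```

In a Zope or Plone process one would presumably pass the transaction module itself, e.g. run_in_transaction(transaction, [...]); taking the manager as a parameter just keeps the sketch easy to try out without a running instance.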

Download pdf file from wikipedia

Wikipedia provides a link (on the left side, under Print/export) on every article to download the article as a PDF. I wrote a small Haskell script which first fetches the Wikipedia page and outputs the rendering link. When I give the rendering URL as input, I get empty tags, but the same URL in a browser provides the download link.
Could someone please tell me how to solve this problem? Formatted code on ideone.
import Network.HTTP
import Text.HTML.TagSoup
import Data.Maybe
parseHelp :: Tag String -> Maybe String
parseHelp ( TagOpen _ y ) = if any ( \( a , b ) -> b == "Download a PDF version of this wiki page" ) y
then Just $ "http://en.wikipedia.org" ++ snd ( y !! 0 )
else Nothing
parse :: [ Tag String ] -> Maybe String
parse [] = Nothing
parse ( x : xs )
| isTagOpen x = case parseHelp x of
Just s -> Just s
Nothing -> parse xs
| otherwise = parse xs
main = do
x <- getLine
tags_1 <- fmap parseTags $ getResponseBody =<< simpleHTTP ( getRequest x ) --open url
let lst = head . sections ( ~== "<div class=portal id=p-coll-print_export>" ) $ tags_1
url = fromJust . parse $ lst --rendering url
putStrLn url
tags_2 <- fmap parseTags $ getResponseBody =<< simpleHTTP ( getRequest url )
print tags_2
If you try requesting the URL through some external tool like wget, you will see that Wikipedia does not serve up the result page directly. It actually returns a 302 Moved Temporarily redirect.
When entering this URL in a browser, it will be fine, as the browser will follow the redirect automatically. simpleHTTP, however, will not. simpleHTTP is, as the name suggests, rather simple. It does not handle things like cookies, SSL or redirects.
You'll want to use the Network.Browser module instead. It offers much more control over how the requests are done. In particular, the setAllowRedirects function will make it automatically follow redirects.
Here's a quick and dirty function for downloading a URL into a String with support for redirects:
import Network.Browser
import Network.HTTP (getRequest, rspBody)
grabUrl :: String -> IO String
grabUrl url = fmap (rspBody . snd) . browse $ do
    -- Disable logging output
    setErrHandler $ const (return ())
    setOutHandler $ const (return ())
    setAllowRedirects True
    request $ getRequest url