How to parse XML with cxml and stp containing ampersand - common-lisp

I want to parse the following XML-Code:
(cxml:parse "<BEGIN><URL>www.some.de/url?some=data&bad=stuff</URL></BEGIN>" (stp:make-builder))
this results in
#<CXML:WELL-FORMEDNESS-VIOLATION "~A" {1003C5E163}>
as '&' is a XML special character. But if I use &? instead the result is:
(cxml:parse "<BEGIN><URL>www.some.de/url?some=data&bad=stuff</URL></BEGIN>" (stp:make-builder))
=>#.(CXML-STP-IMPL::DOCUMENT
:CHILDREN '(#.(CXML-STP:ELEMENT
#| :PARENT of type DOCUMENT |#
:CHILDREN '(#.(CXML-STP:ELEMENT
#| :PARENT of type ELEMENT |#
:CHILDREN '(#.(CXML-STP:TEXT
#| :PARENT of type ELEMENT |#
:DATA "www.some.de/url?some=data")
#.(CXML-STP:TEXT
#| :PARENT of type ELEMENT |#
:DATA "&")
#.(CXML-STP:TEXT
#| :PARENT of type ELEMENT |#
:DATA "bad=stuff"))
:LOCAL-NAME "URL"))
:LOCAL-NAME "BEGIN")))
Which is not exactly what I expected as there should only be one CXML-STP:TEXT child with DATA "www.some.de/url?some=data&bad=stuff"
How can I fix this wrong(?) behavior?

This behavior, although, not very convenient, is, actually, present in many other XML parsers as well. Probably the reason for it is to be able to parse arbitrary ​XML entities and apply some user-defined rules to them. Although, it may be just a by-product of the parser implementation. I couldn't find out yet.
For the SAX variant of the parser I came to the following approach:
(defclass my-sax (sax:sax-parser-mixin)
((title :accessor title :initform nil)
(tag :accessor tag :initform nil)
(text :accessor text :initform "")))
(defmethod sax:start-element ((sax my-sax) namespace-uri local-name
qname attributes)
(with-slots (tag tagcount text) sax
(setf tag local-name
text "")))
(defmethod sax:characters ((sax my-sax) data)
(with-slots (title tag text) sax
(switch (tag :test 'string=)
("text" (setf text (conatenate 'string text data)))
("title" (setf title data)))))
(defmethod sax:end-element ((sax my-sax) namespace-uri local-name qname)
(with-slots (title tag text) sax
(when (string= "text" local-name)
;; process (text sax)
)))
I.e. I collect the text in sax:characters and process it in sax:end-element. In STP you, probably, can get away even easier by just concatenating neighboring text elements.

Related

Controlling the printing of special cons forms (e.g printing (function +) as #'+ etc)

I want some reader macros to print as as shortened expression that the macro understands. Lets say I want to extend the #' macro to take #'~[rest-of-symbol] and turn that into (complement #'rest-of-symbol).
What controls how that is printed? On SBCL, for instance, '(function +) prints as #'+. How do i make '(complement #'listp) print as #~listp?
My first thought was
(defmethod print-object :around ((obj cons) stream)
;; if #'~fn-name / (complement (function fn-name))
;; => fn-name otherwise NIL
(let ((fn-name
(ignore-errors
(destructuring-bind (complement (function fn-name))
obj
(when (and (eq complement 'complement)
(eq function 'function))
fn-name)))))
(if fn-name
(format stream "#'~~~S" fn-name)
(call-next-method))))
This works insofar as (print-object '(complement #'evenp) *standard-output*) prints it the way I want, but the REPL doesn't. Also (print-object '#'+ *standard-output*) prints it as (function +) so the REPL isn't using print-object. With defining the print-object method for user defined classes the REPL always picks up on the new definition.
This is my first post and I'm sorry I can't get the code to format properly. If someone can put a link on how to do that I would appreciate it.
Evaluation time
You are mixing code with data in your example:
(function +)
Is a special form that evaluates to a function object, which admits a shorter syntax:
#'+
But when you are writing:
'(function +)
or
'(complement fn)
Then in both cases you are writing quoted, literal lists, which evaluates to themselves (namely a list starting with symbol function or complement, followed respectively by symbol + and fn).
However, you want the code to be evaluated at runtime to actual function objects; if you type this in the REPL:
(complement #'alpha-char-p)
The result is a value that is printed as follows:
#<FUNCTION (LAMBDA (&REST SB-IMPL::ARGUMENTS) :IN COMPLEMENT) {101AAC8D9B}>
You have an actual function object that you can funcall. In other words, by the time you reach print-object, you no longer have access to source code, you are manipulating data at runtime which happens to be functions. So you cannot use destructuring-bind to get the complement symbol that was present in the source code.
What you need to do instead is to attach metadata to your function. There is a way to do that in Common Lisp by defining a new type of function, thanks to the Meta-Object Protocol.
Funcallable objects
I'm relying on Closer-MOP for all the symbols prefixed with c2cl: below. I define a new class of functions, annotated-fn, which is a function with addditional data:
(defclass annotated-fn (c2cl:funcallable-standard-object)
((data :initform :data :initarg :data :reader annotated-fn-data))
(:metaclass c2cl:funcallable-standard-class))
Notice that this class is a funcallable-standard-object (like the usual functions), and its metaclass is funcallable-standard-class. Such an object has an additional implicit slot that is a function to call.
More precisely, you have to call c2cl:set-funcallable-instance-function to set a function associated with the object, and when later you use funcall or apply with the object, then the wrapped function is called instead. So you can transparently use this class of functions wherever you usually use a function. It just has additional slots (here data).
For example, here is how I instantiate it, with a function to wrap additional data:
(defun annotate-fn (function data)
(let ((object (make-instance 'annotated-fn :data data)))
(prog1 object
(c2cl:set-funcallable-instance-function object function))))
Let's try it:
(describe
(annotate-fn (constantly 3)
'(:category :constantly)))
#<ANNOTATED-FN {1006275C7B}>
[funcallable-instance]
Lambda-list: UNKNOWN
Derived type: FUNCTION
Documentation:
T
Source file: SYS:SRC;CODE;FUNUTILS.LISP
Slots with :INSTANCE allocation:
DATA = (:CATEGORY :CONSTANTLY)
You can also use this object like any other function.
Now, your reader macros can expand into calls to annotate-fn, and add any kind of additional metadata you need to the function.
Reader macro
For our example, imagine you define a reader macros for constant functions:
(set-macro-character #\[ 'read-constantly t)
(set-macro-character #\] (get-macro-character #\) nil))
(defun read-constantly (stream char)
(declare (ignore char))
(let* ((list (read-delimited-list #\] stream t))
(value (if (rest list) list (first list)))
(var (gensym)))
`(let ((,var ,value))
(annotate-fn (constantly ,var)
(list :category :constantly
:constant ,var)))))
Using this syntax:
> [(+ 8 5)]
=> #<ANNOTATED-FN ...>
By the way, the syntax I defined also allows the following:
> [+ 8 5]
Pretty-printing
Let's define a generic function that prints an annotated function given its :category field:
(defgeneric print-for-category (category data object stream))
(defmethod print-object ((o annotated-fn) s)
(let* ((data (annotated-fn-data o))
(category (getf data :category)))
(print-for-category category data o s)))
Then, we can specialize it for :constantly, and here we assume also that the data associated with the function contains a :constant field:
(defmethod print-for-category ((_ (eql :constantly)) data o s)
(format s "[~s]" (getf data :constant)))
For example:
(let ((value (+ 8 6)))
(annotate-fn (constantly value)
`(:constant ,value
:category :constantly)))
This above is printed as:
[14]
Which would be the same as your hypothetical reader macro.
To do this you need to understand the pretty printer. I have understood it in the past but no longer do completely. It dispatches on type and the trick for things like this is that you can specify very specific types for trees of conses, although doing so is verbose.
Here is an example which is almost certainly not completely correct, but does achieve what you want in this case:
(defparameter *ppd* (copy-pprint-dispatch))
(defun pprint-complement-function (s form)
;; This is the thing that the pretty printer will call. It can
;; assume that the form it wants to print is already correct.
(destructuring-bind (complement (function name)) form
(declare (ignore complement function))
(format s "#'~~~W" name)))
;;; Now set this in the table with a suitable hairy type specification
;;;
(set-pprint-dispatch '(cons (eql complement)
(cons (cons (eql function)
(cons t null))
null))
'pprint-complement-function
0
*ppd*)
And now
> (let ((*print-pprint-dispatch* *ppd*))
(pprint '(complement (function foo)))
(pprint '((complement (function foo)) (function foo))))
#'~foo
(#'~foo #'foo)
You can make the awful nested cons type specifier easier by defining this (which, perhaps, should be the compound type specifier for list except you can't do that):
(deftype list-of-types (&rest types)
(labels ((lot (tt)
(if (null tt)
'null
`(cons ,(first tt) ,(lot (rest tt))))))
(lot types)))
And then
(set-pprint-dispatch '(list-of-types (eql complement)
(list-of-types (eql function)
*))
'pprint-complement-function
0
*ppd*)
is perhaps easier to read.

How to produce HTML from a list

The usual way to generate HTML using CL-WHO is by using the macros with-html-output and with-html-output-to-string. This uses special syntax. For example:
(let ((id "greeting")
(message "Hello!"))
(cl-who:with-html-output-to-string (*standard-output*)
((:p :id id) message)))
Is it possible to write the data ((:p :id id) message) as a list instead of using the macro syntax shown above? For example, I would like to define the HTML as a list like this:
(let* ((id "greeting")
(message "Hello!")
(the-html `((:p :id ,id) ,message)))
;; What should I do here to generate HTML using CL-WHO?
)
Can CL-WHO take a normal Lisp list and produce HTML from the list?
You want to insert code into an expression.
Actually you would need eval:
(let* ((id "greeting")
(message "Hello!")
(the-html `((:p :id ,id) ,message)))
(eval `(cl-who:with-html-output-to-string (*standard-output*)
,the-html)))
But this is not good to use eval.
But a macro contains an implicite eval.
I would define a macro for this and call the macro:
(defun html (&body body)
`(cl-who:with-html-output-to-string (*standard-output*)
,#body))
;; but still one doesn't get rid of the `eval` ...
;; one has to call:
(let* ((id "greeting")
(message "Hello!")
(the-html `((:p :id ,id) ,message)))
(eval `(html ,the-html)))

Custom slot options don't apply any reduction to its argument

Say if I define a metaclass that enhances standard slots with a validator slot, when I pass :validator (clavier:valid-email "The email is invalid") as an option, instead of storing the result of of the expression, which is a funcallable, it stores the expression itself. Am I'm missing a step when extending the standard slots? How do I ensure the expression is evaluated before stored? I'm using SBCL 1.2.11 btw. Here is the code in question
(unless (find-package 'clavier)
(ql:quickload :clavier))
(unless (find-package 'c2mop)
(ql:quickload :c2mop))
(defpackage #:clos2web/validation
(:use #:cl)
(:import-from #:c2mop
#:standard-class
#:standard-direct-slot-definition
#:standard-effective-slot-definition
#:validate-superclass
#:direct-slot-definition-class
#:effective-slot-definition-class
#:compute-effective-slot-definition
#:slot-value-using-class))
(in-package #:clos2web/validation)
(defun true (value)
"Always return true."
(declare (ignore value))
t)
(defclass validation-class (standard-class)
()
(:documentation "Meta-class for objects whose slots know how to validate
their values."))
(defmethod validate-superclass
((class validation-class) (super standard-class))
t)
(defmethod validate-superclass
((class standard-class) (super validation-class))
t)
(defclass validation-slot (c2mop:standard-slot-definition)
((validator :initarg :validator :accessor validator :initform #'true
:documentation "The function to determine if the value is
valid. It takes as a parameter the value.")))
(defclass validation-direct-slot (validation-slot
standard-direct-slot-definition)
())
(defclass validation-effective-slot (validation-slot
standard-effective-slot-definition)
())
(defmethod direct-slot-definition-class ((class validation-class) &rest initargs)
(declare (ignore initargs))
(find-class 'validation-direct-slot))
(defmethod effective-slot-definition-class ((class validation-class) &rest initargs)
(declare (ignore initargs))
(find-class 'validation-effective-slot))
(defmethod compute-effective-slot-definition
((class validation-class) slot-name direct-slot-definitions)
(let ((effective-slot-definition (call-next-method)))
(setf (validator effective-slot-definition)
(some #'validator direct-slot-definitions))
effective-slot-definition))
(defmethod (setf slot-value-using-class) :before
(new (class validation-class) object (slot validation-effective-slot))
(when (slot-boundp slot 'validator)
(multiple-value-bind (validp msg)
(funcall (validator slot) new)
(unless validp
(error msg)))))
;; Example usage
(defclass user ()
((name :initarg :name)
(email :initarg :email :validator (clavier:valid-email "The email is invalid") :accessor email))
(:metaclass validation-class))
(let ((pepe (make-instance 'user :name "Pepe" :email "pepe#tumadre.com")))
(setf (email pepe) "FU!")) ;; should throw
The code fails when making an instance as (CLAVIER:VALID-EMAIL "The email is invalid") is not a funcallable.
(CLAVIER:VALID-EMAIL
"The email is invalid") fell through ETYPECASE expression.
Wanted one of (FUNCTION SYMBOL).
[Condition of type SB-KERNEL:CASE-FAILURE]
Like the comment above says, defclass does not evaluate arguments (it is a macro). While the usual advice is to avoid eval, I think that eval in this circumstance might be exactly what you want. While usually you would splice the form directly into some macro body, with defclass I think the answer is to eval the form in slot initialization and store the evaluation (if it has not yet been evaled).
This would probably occur in:
(defmethod initialize-instance :after ((obj validation-slot)
&key &allow-other-keys)
#| ... |#)
Optionally you could also store the :validation-message and :validation-fn as two separate arguments then call:
(multiple-value-bind (validp msg)
(funcall (funcall (validator-fn slot)
(validator-message slot))
new)
(unless validp
(error msg)))
Another alternative would be to store the evaluation of the form and pass that to the macro:
(defvar *email-validator* (CLAVIER:VALID-EMAIL "The email is invalid"))
(defun email-validator (val)
(funcall *email-validator* val))
Then pass email-validator to defclass.
Additionally I might suggest that your validation functions signal slot-validation-error type conditions instead of error type conditions. Then your condition could contain references to the validator that failed, the value, the slot and the instance. This could give you much better control than the raw error. You could also add some restarts (abort to skip setting the slot, use-value to provide a different value).
Depending on your setup, it might also make more sense for your validation function to signal these directly instead of returning multiple values that are then coerced to signals.

Common Lisp case and quoted elements

I'm writing a dungeon crawler game in CL, and I'm having trouble with the case form.
Two things:
Common Lisp complains Duplicate keyform QUOTE in CASE statement
(make-instance 'cl-rogue:tile tile-type 'wall) should print as "#", but the object prints as " " no matter which tile-type I use.
The code:
(in-package :cl-user)
(defpackage :cl-rogue
(:use :common-lisp)
(:export
:*rows*
:*cols*
:*levels*
:tile
:tile-type
:tile-contents
:tile-hidden
:tile-locked
:tile-closed
:main))
(in-package :cl-rogue)
(defparameter *cols* 80)
(defparameter *rows* 24)
(defparameter *levels* 26)
The class:
(defclass tile ()
((tile-type
:initarg :tile-type
:accessor tile-type
:initform 'floor
:documentation "Type of tile")
(tile-contents
:initarg :tile-contents
:accessor tile-contents
:initform '()
:documentation "Any items the tile holds")
(tile-hidden
:initarg :tile-hidden
:accessor tile-hidden
:initform nil
:documentation "Whether the tile is hidden or shown")
(tile-locked
:initarg :tile-locked
:accessor tile-locked
:initform nil
:documentation "Whether the tile is locked")
(tile-closed
:initarg :tile-closed
:accessor tile-closed
:initform nil
:documentation "Whether the tile is open or closed")))
The print method:
(defmethod print-object ((object tile) stream)
(with-slots (tile-type tile-contents tile-hidden tile-locked tile-closed) object
(if tile-hidden
(format stream " ")
(let ((an-item (car tile-contents)))
(if an-item
(format stream "~a" an-item)
(format stream (case tile-type
('wall "#")
('upstair "<")
('downstair ">")
('door (if tile-closed "+" "\\"))
(otherwise " "))))))))
You don't need to quote the symbols in CASE.
Actually you shouldn't quote symbols in CASE clauses.
See:
CL-USER 31 > (case 'quote
('not-quote 'oops-really-quote)
('quote 'it-is-a-quote))
OOPS-REALLY-QUOTE
Above is the unintended consequence of using 'not-quote instead of the correct unquoted not-quote.
The following is correct:
CL-USER 32 > (case 'quote
(not-quote 'oops-really-quote)
(quote 'it-is-a-quote))
IT-IS-A-QUOTE
The syntax of CASE in Common Lisp
case keyform
{normal-clause}*
[otherwise-clause]
=> result*
normal-clause::= (keys form*)
otherwise-clause::= ({otherwise | t} form*)
clause::= normal-clause | otherwise-clause
keys ::= object | (object*)
As you can see the keys are either a single object or a list of objects. The objects are not evaluated.
Symbols in CASE clauses are not evaluated.
(case tile-type
(wall ...)
(door ...))
WALL and DOOR are purely symbols and not evaluated as variables.
The Lisp reader reads 'fooas (quote foo).
You wrote:
(case tile-type
('wall ...)
('door ...))
Which is the equivalent of:
(case tile-type
((quote wall) ...)
((quote door) ...))
But you can't quote a symbol in CASE. You have to provide the symbols as literal constants.
If you write:
(let ((bar 'foo)
(baz 'foo))
(case bar
(baz :we-have-a-foo-through-baz)
(foo :we-really-have-a-foo)))
This returns :WE-REALLY-HAVE-A-FOO. Because CASE uses constant data, not variables.
CASE accepts a list of items. Since you have QUOTE as a symbol in more than clause, the compiler showed a warning.
As I said, there is no quoting possible, since the items are not evaluated.
As for CASE accepting a list of items in the clauses, it looks like this:
(case tile-type
((door wall) ...)
((floor window painting) ...))
For the WALL symbol, you need to make sure that it is in the right package when you create the object.
Better use a keyword symbol, such as :wall. Then you don't need to export it and there is no confusion about in which package the symbol is.
About the formatting of the code:
You had a bullet list and right after it a code section. This is not rendered as you expect. I have added the text 'The code:' before the code. Then the rendering works as expected.

Hunchentoot/cl-who page composition

Hunchentoot/cl-who Page Composition
I'm trying to put together a few pages in hunchentoot as an experiment, and I'm running into an unexpected wall. As an example, I have the following template macro.
(defmacro page-template ((&key title) &body body)
`(with-html-output-to-string
(*standard-output* nil :prologue t :indent t)
(:html :xmlns "http://www.w3.org/1999/xhtml" :xml\:lang "en" :lang "en"
(:head (:meta :http-equiv "Content-Type" :content "text/html;charset=utf-8")
(:title ,(format nil "~#[~A - ~]Test Site" title)))
(:body ,#body))))
Now when I have a pure text page, or one filled with html literals like
(define-easy-handler (test-page :uri "/") ()
(page-template (:title "Splash Page") (:p "Testing testing")))
everything is a-ok. The page outputs properly and I can see te efforts of my code instantly. However, when I have a page which is made up of redundant elements, it's not as simple. For example, lets say I have a page on which for whatever reason I want to display three RSS newsfeeds. This is a sufficiently complex component that I want to abstract it out, so to my minnd, I should be able to do something like
(define-easy-handler (test-feed :uri "/feeds") ()
(page-template (:title "Splash Page")
(publish-newsfeed "http://nf-one.html")
(publish-newsfeed "http://nf-two.html")
(publish-newsfeed "http://nf-three.html")))
(defmacro publish-newsfeed (url &optional (item-limit 5))
(flet ((get-text (s-tree node-path) (car (last (xmls-tools:find-subtree s-tree node-path)))))
(let ((rss-feed (xmls:parse (drakma:http-request url))))
`(:div :class "rss-feed"
(:a :href ,(get-text rss-feed '("channel" "link")) :target "_top" (:h1 ,(get-text rss-feed '("channel" "title"))))
(:ul ,#(mapcar #'(lambda (item)
`(:li (:a :href ,(get-text item '("link")) :target "_top" (:h2 ,(get-text item '("title"))))
(:p :class "date" ,(get-text item '("pubDate")))
(:p ,(get-text item '("description")))))
(let ((items (xmls-tools:find-all-children (xmls-tools:find-subtree rss-feed '("channel")) "item")))
(if (> (length items) item-limit) (subseq items 0 item-limit) items))))))))
But the result of the above is a "Server Error" page. I'm not quire sure why; page-template is a macro so the calls to publish-newsfeed shouldn't be expanded until they're in the context of with-html-output-to-string. Can anyone tell me what I'm doing wrong?
Also, on closer inspection of the various Hunchentoot/cl-who tutorials, none of them seems to do this kind of page composition. Can anyone with some Hunchentoot experience tell me what the correct/canonical way of decomposing a page into components is?
EDIT:
Correct response by Ramarren below; the with-html-output macros work under different evaluation rules. The version of publish-newsfeed that would actually work in this situation is actually
(defun publish-newsfeed (url &optional (item-limit 5))
(flet ((get-text (s-tree node-path) (car (last (xmls-tools:find-subtree s-tree node-path)))))
(let* ((rss-feed (xmls:parse (drakma:http-request url)))
(items (xmls-tools:find-all-children (xmls-tools:find-subtree rss-feed '("channel")) "item"))
(ltd-items (if (> (length items) item-limit) (subseq items 0 item-limit) items)))
(with-html-output
(*standard-output* nil :indent t)
(:div :class "rss-feed"
(:a :href (get-text rss-feed '("channel" "link")) :target "_top" (:h1 (str (get-text rss-feed '("channel" "title")))))
(:ul (dolist (item ltd-items)
(htm (:li (:h2 (:a :href (get-text item '("link")) :target "_top" (str (get-text item '("title")))))
(:p :class "date" (str (get-text item '("pubDate"))))
(:p (str (get-text item '("description")))))))))))))
Note the removal of mapcar for dolist (I'm a Schemer, don't give me too much of a hard time about liking lambdas, but they weren't the right choice here), and the use of htm to escape blocks of html s-exps (h-exps?) that wouldn't otherwise be in context for with-html-output. Finally, I had to wrap text but NOT :href properties in (str ) to get them to expand dynamically.
The macro with-html-output-to-string expands its body using special evaluation rules. In particular, any not recognized forms are left as is, which means macros are not expanded before the html generating code is generated, which means by the time your publish-newsfeed macro is expanded by the standard compiler it is no longer in context ofwith-html-output-to-string. This is clear when expanding the macro manually, especially using slime macroexpansion feature.
To make it work you should make publish-newsfeed a function and use with-html-output inside it with the same stream (either assume `standard-output everywhere or pass the stream explicitly).

Resources