Preliminary HTML Parser documentation

Pending tasks:
  . integrate with aserve components, such as htmlgen and LHTML description

Description

The parse-html function processes HTML input, returning a list of HTML tags, attributes, and text. Here is a simple example:

(parse-html "
Here is some text with a bold word
and a link
")
--> (((:p :here "here" :are "are" :some "some" :attributes "attributes")))

7. Existing HTML pages often have character format tags that are interleaved among other tags. Such interleaving is removed in a manner consistent with the HTML 4.0 specification. For example,

(parse-html "<P>Here is <B>bold text<P>that spans</B>two paragraphs")
--> ((:p "Here is " (:b "bold text")) (:p (:b "that spans") "two paragraphs"))

-----------------------------------------------------

parse-html reference

parse-html                                  [Generic function]

Arguments: input-source &key callbacks callback-only collect-rogue-tags no-body-tags

Returns LHTML output, as described above.

The callbacks argument, if non-nil, should be an association list. The car (first element) of each list member specifies a keyword package symbol, and the cdr (rest) specifies a function object or a symbol naming a function. The function should expect one argument. It is invoked once each time the HTML tag corresponding to the keyword package symbol is encountered in the HTML input; the argument is an LHTML list containing the tag, along with its associated attributes and content. The default callbacks argument value is nil.

The callback-only argument, if non-nil, directs parse-html not to generate complete LHTML output. Instead, LHTML lists are generated only when needed as arguments for the functions specified in the callbacks association list. This results in faster parser execution. The default callback-only argument value is nil.

The collect-rogue-tags argument, if non-nil, directs parse-html to return an additional value, a list containing any unrecognized tags closed by the end of input.

The no-body-tags argument, if non-nil, should be a list of unknown tags that, if encountered, are each treated as a tag with no body or content and thus no associated end tag. Typically, the argument is a list, or a modified list, resulting from an earlier parse-html execution with the :collect-rogue-tags argument specified as non-nil.
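
The callbacks and no-body-tags arguments described above can be sketched as follows. This is only an illustration: the names *bold-elements*, record-bold, *callbacks*, and parse-twice are hypothetical, and the sketch assumes that the rogue-tag list is the second return value. The parser is passed in as a function so that the fragment can be loaded without the parser present; with the parser loaded, one would presumably pass #'parse-html.

```lisp
;; Sketch only. Every name defined here is a hypothetical helper,
;; not part of the parser itself.

(defvar *bold-elements* '()
  "LHTML lists passed to RECORD-BOLD, most recent first.")

(defun record-bold (element)
  "Callback of one argument; ELEMENT is an LHTML list such as (:b \"bold text\")."
  (push element *bold-elements*))

;; The callbacks association list: the car is the tag keyword, the cdr is
;; a function object or a symbol naming a function.
(defparameter *callbacks* (list (cons :b 'record-bold)))

(defun parse-twice (parser input)
  "Parse INPUT with PARSER while collecting rogue tags, then parse it again
with those tags supplied as the no-body-tags argument.
INPUT should be re-readable, for example a string.
Assumption: the rogue-tag list is the second value returned."
  (multiple-value-bind (lhtml rogue-tags)
      (funcall parser input :collect-rogue-tags t)
    (declare (ignore lhtml))
    (funcall parser input
             :callbacks *callbacks*
             :no-body-tags rogue-tags)))

;; With the parser loaded, the intended call would be along the lines of
;;   (parse-twice #'parse-html "...")
```

The first pass exists only to discover unknown tags; the rogue-tag list could also be edited by hand before the second pass, as the no-body-tags description suggests.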

parse-html Methods

parse-html (p stream) &key callbacks callback-only collect-rogue-tags no-body-tags
parse-html (str string) &key callbacks callback-only collect-rogue-tags no-body-tags
parse-html (file t) &key callbacks callback-only collect-rogue-tags no-body-tags

The t method assumes the argument is a pathname suitable for use with the with-open-file macro.

phtml-internal                              [Function]

Arguments: stream read-sequence-func callback-only callbacks collect-rogue-tags no-body-tags

This function may be used when more control is needed over how the HTML input is supplied. The read-sequence-func argument, if non-nil, should be a function object or a symbol naming a function. When phtml-internal requires another buffer of HTML input, it invokes the read-sequence-func function with two arguments: the first is an internal buffer character array, and the second is the phtml-internal stream argument. The read-sequence-func function must return the number of character array elements successfully stored in the buffer. If read-sequence-func is nil, phtml-internal invokes read-sequence to fill the buffer.
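
A read-sequence-func that meets the contract described above might look like the following minimal sketch. The names *chars-read* and counting-read-sequence are hypothetical; the function simply wraps the standard read-sequence and keeps a running total of the characters supplied.

```lisp
;; Sketch only: *chars-read* and counting-read-sequence are hypothetical names.

(defvar *chars-read* 0
  "Running total of characters handed to the parser.")

(defun counting-read-sequence (buffer stream)
  "Fill BUFFER from STREAM and return the number of elements stored."
  ;; With the default START of 0, READ-SEQUENCE returns the index of the
  ;; first element it did not update, which equals the count stored.
  (let ((count (read-sequence buffer stream)))
    (incf *chars-read* count)
    count))

;; Exercising the function on its own with a string stream:
(defparameter *demo-buffer* (make-string 16))
(defparameter *demo-count*
  (with-input-from-string (in "<p>hi</p>")
    (counting-read-sequence *demo-buffer* in)))

;; With the parser loaded, and assuming the arguments are positional in the
;; order listed above, the call might look like:
;;   (phtml-internal stream 'counting-read-sequence nil nil nil nil)
```

The same shape would suit a function that decodes or filters the input before the parser sees it, provided the returned count reflects what was actually stored in the buffer.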