X-Git-Url: http://git.kpe.io/?p=xmlutils.git;a=blobdiff_plain;f=pxml.htm;fp=pxml.htm;h=0000000000000000000000000000000000000000;hp=2cf26d5ea41e8350e437563c4abfce30a2510b7a;hb=a1d0f28e9281bfc2d70b43b62e178f3d0da1114b;hpb=b5da6339c28ee272d0a32eb5c26a9f7446e71d9f

diff --git a/pxml.htm b/pxml.htm
deleted file mode 100644
index 2cf26d5..0000000
--- a/pxml.htm
+++ /dev/null
@@ -1,387 +0,0 @@
-
-
-
-A Lisp Based XML Parser
-
-Introduction/Simple Example
-LXML parse output format
-parse-xml non-validating parser properties
-case and international character support issues
-parse-xml and packages
-parse-xml, the XML Namespace specification, and packages
-ACL does not support Unicode 4 byte scalar values
-only little-endian Unicode tested in ACL 6.0 beta
-debugging aids
-XML Conformance test results
-Compiling and Loading the parser
-parse-xml reference
-The parse-xml generic function processes XML
-input, returning a list of XML tags,
-attributes, and text. Here is a simple example:
-
-(parse-xml "<item1><item2 att1='one'/>this is some text</item1>")
-
--->
-
-((item1 ((item2 att1 "one")) "this is some text"))
-
-The output format is known as LXML format.
-
-LXML Format
-
-LXML is a list representation of XML tags and content.
-
-Each list member may be:
-
-a. a string containing text content, such as "Here is some text with a "
-
-b. a list representing an XML tag with associated attributes and/or content,
-such as ('item1 "text") or (('item1 :att1 "help.html") "link"). If the XML tag
-does not have associated attributes, then the first list member will be a
-symbol representing the XML tag, and the other elements will
-represent the content, which can be a string (text content), a symbol (XML
-tag with no attributes or content), or list (nested XML tag with
-associated attributes and/or content). If there are associated attributes,
-then the first list member will be a list containing a symbol
-followed by two list members for each associated attribute; the first member is a
-symbol representing the attribute, and the next member is a string corresponding
-to the attribute value.
-
-c. XML comments and/or processing instructions; see the more detailed example below for
-further information.
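
The three element shapes described above (a bare symbol, a list headed by a symbol, and a list headed by a symbol-plus-attributes list) can be told apart with a few small accessors. The following is only a sketch based on that description; the names `lxml-tag`, `lxml-attributes`, `lxml-attribute`, and `lxml-content` are hypothetical helpers and are not part of the parser.

```lisp
;; Hypothetical helpers for taking apart one LXML list member.
;; They are NOT part of parse-xml; they only follow the format
;; described in the "LXML Format" section.

(defun lxml-tag (element)
  "Return the tag symbol of ELEMENT, or NIL when ELEMENT is text content."
  (cond ((stringp element) nil)             ; plain text content
        ((symbolp element) element)         ; tag with no attributes or content
        ((consp (first element))            ; ((tag attr "value" ...) content ...)
         (first (first element)))
        (t (first element))))               ; (tag content ...)

(defun lxml-attributes (element)
  "Return the attribute property list of ELEMENT, or NIL when it has none."
  (if (and (consp element) (consp (first element)))
      (rest (first element))
      nil))

(defun lxml-attribute (element name)
  "Return the string value of attribute NAME in ELEMENT, or NIL."
  (getf (lxml-attributes element) name))

(defun lxml-content (element)
  "Return the list of content members of ELEMENT, or NIL when it has none."
  (if (consp element)
      (rest element)
      nil))
```

For example, applied to the inner element of the simple example, `(lxml-attribute '((item2 att1 "one")) 'att1)` looks up the `att1` attribute, and `(lxml-content '(item1 "text"))` returns the one-element content list. Note that keyword-headed members such as comments are treated like ordinary tags by this sketch.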
-Non Validating Parser Properties
-
-Parse-xml is a non-validating XML parser. It will detect non-well-formed XML input.
-When processing valid XML input, parse-xml will optionally produce the same output as a
-validating parser would, including the processing of an external DTD subset and external
-entity declarations.
-
-By default, parse-xml outputs a DTD parse along with the parsed XML contents. The DTD
-parse may be optionally suppressed. The following example shows DTD parsed output components:
-
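-(defvar *xml-example-external-url*
-  "<!ENTITY ext1 'this is some external entity %param1;'>")
-
-;; The callback receives a URI object. Its path is taken as the name of a
-;; Lisp variable in the :user package whose value holds the external text.
-(defun example-callback (var-name token &optional public)
-  (declare (ignorable token public))
-  (setf var-name (uri-path var-name))
-  ;; Returning nil tells the parser to skip this external parse.
-  (if* (equal var-name "null") then nil
-   else
-     (let ((string (eval (intern var-name (find-package :user)))))
-       (make-string-input-stream string))))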
-
-(defvar *xml-example-string*
-"<?xml version='1.0' encoding='utf-8'?>
-<!-- the following XML input is well-formed but its validity has not been checked ...
--->
-<?piexample this is an example processing instruction tag ?>
-<!DOCTYPE example SYSTEM '*xml-example-external-url*' [
- <!ELEMENT item1 (item2* | (item3+ , item4))>
- <!ELEMENT item2 ANY>
- <!ELEMENT item3 (#PCDATA)>
- <!ELEMENT item4 (#PCDATA)>
- <!ATTLIST item1
- att1 CDATA #FIXED 'att1-default'
- att2 ID #REQUIRED
- att3 ( one | two | three ) 'one'
- att4 NOTATION ( four | five ) 'four' >
- <!ENTITY % param1 'text'>
- <!ENTITY nentity SYSTEM 'null' NDATA somedata>
- <!NOTATION notation SYSTEM 'notation-processor'>
- ]>
-<item1 att2='1'><item3>&ext1;</item3></item1>")
-
-(pprint (parse-xml *xml-example-string* :external-callback 'example-callback))
-
--->
-
-((:xml :version "1.0" :encoding "utf-8")
- (:comment " the following XML input is well-formed but its validity has not been checked ...
-")
- (:pi :piexample "this is an example processing instruction tag ")
- (:DOCTYPE :example
-  (:[ (:ELEMENT :item1 (:choice (:* :item2) (:seq (:+ :item3) :item4)))
-      (:ELEMENT :item2 :ANY)
-      (:ELEMENT :item3 :PCDATA)
-      (:ELEMENT :item4 :PCDATA)
-      (:ATTLIST item1
-       (att1 :CDATA :FIXED "att1-default")
-       (att2 :ID :REQUIRED)
-       (att3 (:enumeration :one :two :three) "one")
-       (att4 (:NOTATION :four :five) "four"))
-      (:ENTITY :param1 :param "text")
-      (:ENTITY :nentity :SYSTEM "null" :NDATA :somedata)
-      (:NOTATION :notation :SYSTEM "notation-processor"))
-  (:external (:ENTITY :ext1 "this is some external entity text")))
- ((item1 att1 "att1-default" att2 "1" att3 "one" att4 "four")
-  (item3 "this is some external entity text")))
-
-
-Usage Notes
-
-
-(setf *xml-example-string4*
- "<bibliography
- xmlns:bib='http://www.bibliography.org/XML/bib.ns'
- xmlns='urn:com:books-r-us'>
- <bib:book owner='Smith'>
- <bib:title>A Tale of Two Cities</bib:title>
- <bib:bibliography
- xmlns:bib='http://www.franz.com/XML/bib.ns'
- xmlns='urn:com:books-r-us'>
- <bib:library branch='Main'>UK Library</bib:library>
- <bib:date calendar='Julian'>1999</bib:date>
- </bib:bibliography>
- <bib:date calendar='Julian'>1999</bib:date>
- </bib:book>
-</bibliography>")
-
-(setf *uri-to-package* nil)
-(setf *uri-to-package*
- (acons (parse-uri "http://www.bibliography.org/XML/bib.ns")
- (make-package "bib") *uri-to-package*))
-(setf *uri-to-package*
- (acons (parse-uri "urn:com:books-r-us")
- (make-package "royal") *uri-to-package*))
-(setf *uri-to-package*
- (acons (parse-uri "http://www.franz.com/XML/bib.ns")
- (make-package "franz-ns") *uri-to-package*))
-(pprint (multiple-value-list
-         (parse-xml *xml-example-string4*
-                    :uri-to-package *uri-to-package*)))
-
--->
-((((bibliography |xmlns:bib| "http://www.bibliography.org/XML/bib.ns"
- xmlns "urn:com:books-r-us")
- "
- "
- ((bib::book royal::owner "Smith") "
- " (bib::title "A Tale of Two Cities") "
- "
- ((bib::bibliography royal::|xmlns:bib|
- "http://www.franz.com/XML/bib.ns" royal::xmlns
- "urn:com:books-r-us")
- "
- " ((franz-ns::library royal::branch "Main") "UK Library") "
- " ((franz-ns::date royal::calendar "Julian") "1999") "
- ")
- "
- " ((bib::date royal::calendar "Julian") "1999") "
- ")
- "
- "))
-((#<uri http://www.franz.com/XML/bib.ns> . #<The franz-ns package>)
- (#<uri urn:com:books-r-us> . #<The royal package>)
- (#<uri http://www.bibliography.org/XML/bib.ns> . #<The bib package>)))
-
-
-(defun file-callback (uri-object token &optional public)
-  ;; The uri-object is an ACL URI object created from
-  ;; the XML input. In this example, this function
-  ;; assumes that all URIs will be file specifications.
-  ;;
-  ;; The token argument identifies what token is associated
-  ;; with the external parse (for example :DOCTYPE for the
-  ;; external DTD subset).
-  ;;
-  ;; The public argument contains the associated PUBLIC string,
-  ;; when present.
-  ;;
-  (declare (ignorable token public))
-  ;; An open stream is returned on success;
-  ;; a nil return value indicates that the external
-  ;; parse should not occur.
-  ;; Note that parse-xml will close the open stream before exiting.
-  (ignore-errors (open (uri-path uri-object))))
-
-The general-entities argument is an association list containing general entity symbol
-and replacement text pairs. The entity symbols should be in the keyword package.
-Note that this option may be useful in generating desirable parse results in
-situations where you do not wish to parse external entities or the external DTD subset.
-
-The parameter-entities argument is an association list containing parameter entity symbol
-and replacement text pairs. The entity symbols should be in the keyword package.
-Note that this option may be useful in generating desirable parse results in
-situations where you do not wish to parse external entities or the external DTD subset.
-
-The uri-to-package argument is an association list containing uri objects and package
-objects. Typically, the uri objects correspond to XML Namespace attribute values, and
-the package objects correspond to the desired package for interning symbols associated
-with the uri namespace. If the parser encounters a uri object not contained in this list,
-it will generate a new package. The first generated package will be named
-net.xml.namespace.0, the second will be named net.xml.namespace.1, and so on.
-
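
The lookup-or-generate behavior described for uri-to-package can be illustrated with a small stand-alone sketch. It is not the parser's implementation: the function `package-for-uri` and the counter `*generated-package-count*` are hypothetical names, and plain strings stand in for the uri objects so that the code runs without the ACL URI support.

```lisp
;; Illustration only: mimics the documented naming scheme
;; (net.xml.namespace.0, net.xml.namespace.1, ...) for namespaces
;; that are missing from the association list.
;; ASSUMPTION: strings are used as keys here instead of uri objects.

(defvar *generated-package-count* 0
  "Hypothetical counter for the next generated package name.")

(defun package-for-uri (uri alist)
  "Return two values: the package associated with URI, and the
association list, extended with a new entry when URI was not present."
  (let ((entry (assoc uri alist :test #'equal)))
    (if entry
        (values (cdr entry) alist)
        (let* ((name (format nil "net.xml.namespace.~d"
                             *generated-package-count*))
               ;; Reuse the package if a previous run already created it.
               (pkg (or (find-package name)
                        (make-package name :use nil))))
          (incf *generated-package-count*)
          (values pkg (acons uri pkg alist))))))
```

Calling `(package-for-uri "urn:com:books-r-us" nil)` with a fresh counter yields a package named net.xml.namespace.0 together with a one-entry association list; passing that list back in with the same string returns the same package without creating another.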
-(parse-xml (p stream) &key
-           external-callback content-only
-           general-entities
-           parameter-entities
-           uri-to-package)
-
-(parse-xml (str string) &key
-           external-callback content-only
-           general-entities
-           parameter-entities
-           uri-to-package)
-
-An easy way to parse a file containing XML input:
-
-(with-open-file (p "example.xml")
-  (parse-xml p :content-only p))
-
-*debug-xml*
-
-When true, parse-xml generates XML lexical state and intermediary
-parse result debugging output.
-
-*debug-dtd*
-
-When true, parse-xml generates DTD lexical state and intermediary
-parse result debugging output.
-
-