X-Git-Url: http://git.kpe.io/?p=puri.git;a=blobdiff_plain;f=uri.html;fp=uri.html;h=0000000000000000000000000000000000000000;hp=2cd92ff212149a647f7023c0e30d7f8d853c792a;hb=836c717429785924929dc650171faa391489fee1;hpb=789e972d75dfe1e8432a11a8fa1b13b1b5ecb469 diff --git a/uri.html b/uri.html deleted file mode 100644 index 2cd92ff..0000000 --- a/uri.html +++ /dev/null @@ -1,406 +0,0 @@ - - - -URI support in Allegro CL - - - - -

URI support in Allegro CL

- -

This document contains the following sections:

-

1.0 Introduction
-2.0 The URI API definition
-3.0 Parsing, escape decoding/encoding and the path
-4.0 Interning URIs
-5.0 Allegro CL implementation notes
-6.0 Examples
-

- -

This version of the Allegro CL URI support documentation is for distribution with the Open Source version of the URI code. Links to Allegro CL documentation other than URI-specific files have been suppressed. To see Allegro CL documentation, see http://www.franz.com/support/documentation/, which is the Allegro CL documentation page of the Franz Inc. website. Links to Allegro CL documentation can be found on that page.

- -
- -
- -

1.0 Introduction

- -

URI stands for Uniform Resource Identifier. For a description of URIs, see RFC2396, which can be found in several places, including the IETF web site (http://www.ietf.org/rfc/rfc2396.txt) and the UCI/ICS web site (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt). We prefer the UCI/ICS one as it has more examples.

- -

URIs are a superset, in functionality and syntax, of URLs (Uniform Resource Locators) and URNs (Uniform Resource Names). That is, RFC2396 updates and merges RFC1738 and RFC1808 into a single syntax, called the URI. It does exclude some portions of RFC1738 that define specific syntax of individual URL schemes.

- -

In URL slang, the scheme is usually called the `protocol', but it is called scheme in RFC1738. A URL `host' corresponds to the URI `authority.' The URL slang `bookmark' or `anchor' is `fragment' in URI lingo.

- -

The URI facility was available as a patch to Allegro CL 5.0.1 and is included with release 6.0. The URI facility might not be loaded in an Allegro CL image. Evaluate (require :uri) to ensure the facility is loaded (that form returns nil if the URI module is already loaded).

- -

Broadly, the URI facility creates a Lisp object that represents a URI, and provides setters and accessors to fields in the URI object. The URI object can also be interned, much like symbols in CL are. This document describes the facility and the related operators.

- -

Aside from the obvious slots which are called out in the RFC, URIs also have a property list. With interning, this is another similarity between URIs and CL symbols.

- -
- -
- -

2.0 The URI API definition

- -

Symbols naming objects (functions, variables, etc.) in the uri module are exported from the net.uri package.

- -

URIs are represented by CLOS objects. Their slots are:

- -
-scheme 
-host 
-port 
-path 
-query
-fragment 
-plist 
-
- -

The host and port slots together correspond to the authority (see RFC2396). There is an accessor-like function, uri-authority, that can be used to extract the authority from a URI. See the RFC2396 specifications pointed to at the beginning of the 1.0 Introduction for details of all the slots except plist. The plist slot contains a standard Common Lisp property list.
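As a rough illustration of how a host and a port combine into an authority string, here is a minimal sketch in plain Common Lisp. The function sketch-authority is hypothetical and is not part of the net.uri package; it only mirrors the host[:port] shape described in RFC2396 and ignores any user-information part.

```lisp
;; Hypothetical helper, NOT part of net.uri: combines values like those
;; held in the host and port slots into an authority-style string.
(defun sketch-authority (host port)
  "Return HOST, followed by \":PORT\" when PORT is non-nil; NIL when HOST is NIL."
  (when host
    ;; ~@[...~] emits its body only when the next argument is non-nil.
    (format nil "~a~@[:~a~]" host port)))

(sketch-authority "www.franz.com" nil)   ; host only
(sketch-authority "www.franz.com" 8080)  ; host and explicit port
```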

- -

All symbols are external in the net.uri package, unless otherwise noted. Brief descriptions are given in this document, with complete descriptions in the individual pages.

- -
- -
- -

3.0 Parsing, escape decoding/encoding and the path

- -

The method uri-path returns the path portion of the URI, in string form. The method uri-parsed-path returns the path portion of the URI, in list form. This list form is discussed below, after a discussion of decoding/encoding.

- -

RFC2396 lays out a method for inserting reserved characters into URIs: you escape the character. An escaped character is defined like this:

- -
-escaped = "%" hex hex 
-
-hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f" 
-
- -

In addition, the RFC defines excluded characters:

- -
-"<" | ">" | "#" | "%" | <"> | "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`" 
-
- -

The set of reserved characters is:

- -
-";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | "," 
-
- -

with the following exceptions: - -

- -

From the RFC, there are two important rules about escaping and unescaping (encoding and decoding):

- -

The implication of this is that to decode the URI, it must be in a parsed state. That is, you can't convert %2f (the escaped form of "/") until the path has been parsed into its component parts. It is also important that the application viewing the component parts sees the decoded values of the components. For example, consider:

- -
-http://www.franz.com/calculator/3%2f2 
-
- -

This might be the implementation of a calculator, and how someone would execute 3/2. Clearly, the application that implements this would want to see path components of "calculator" and "3/2". "3%2f2" would not be useful to the calculator application.
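The ordering issue can be sketched in plain Common Lisp. The two functions below are hypothetical helpers written for this illustration (they are not the net.uri implementation): one splits a path at "/" and the other replaces %XX escapes. Splitting first and decoding each piece afterwards keeps "3/2" together as one component, while decoding first produces an extra component.

```lisp
;; Hypothetical helpers for illustration only; not part of net.uri.
(defun sketch-split-on-slash (string)
  "Split STRING at each slash and return the list of pieces."
  (loop with start = 0
        for pos = (position #\/ string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun sketch-percent-decode (string)
  "Replace each %XX escape in STRING with the character whose code is hex XX."
  (with-output-to-string (out)
    (let ((i 0)
          (n (length string)))
      (loop while (< i n)
            do (let ((ch (char string i)))
                 (if (and (char= ch #\%)
                          (<= (+ i 3) n)
                          (digit-char-p (char string (+ i 1)) 16)
                          (digit-char-p (char string (+ i 2)) 16))
                     (progn
                       ;; Two hex digits follow the %: emit the decoded character.
                       (write-char (code-char (parse-integer string
                                                             :start (+ i 1)
                                                             :end (+ i 3)
                                                             :radix 16))
                                   out)
                       (incf i 3))
                     (progn
                       ;; Ordinary character: copy it through unchanged.
                       (write-char ch out)
                       (incf i))))))))

;; Split first, then decode each component: two components.
(mapcar #'sketch-percent-decode (sketch-split-on-slash "calculator/3%2f2"))

;; Decode first, then split: the escaped slash becomes a separator.
(sketch-split-on-slash (sketch-percent-decode "calculator/3%2f2"))
```

The first form yields the components "calculator" and "3/2"; the second yields three components, which is the outcome the parse-before-decode rule is meant to avoid.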

- -

For the reasons given above, a parsed version of the path is available and has the following form:

- -
-([:absolute | :relative] component1 [component2...]) 
-
- -

where components are:

- -
-element | (element param1 [param2 ...]) 
-
- -

and element is a path element, and the params are path element parameters. For example, the result of

- -
-(uri-parsed-path (parse-uri "foo;10/bar:x;y;z/baz.htm")) 
-
- -

is

- -
-(:relative ("foo" "10") ("bar:x" "y" "z") "baz.htm") 
-
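To make the list form concrete, the following is a minimal sketch in plain Common Lisp that builds the same shape from a path string. The names sketch-split and sketch-parsed-path are hypothetical, and the sketch makes simplifying assumptions: it does no escape decoding and applies none of the canonicalization that the real parser performs.

```lisp
;; Hypothetical helpers for illustration only; not the net.uri parser.
(defun sketch-split (string delimiter)
  "Split STRING at each occurrence of the character DELIMITER."
  (loop with start = 0
        for pos = (position delimiter string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun sketch-parsed-path (path)
  "Return ([:absolute | :relative] component...) for the string PATH.
A component is a string, or a list (element param...) when the
segment carries parameters introduced by a semicolon."
  (let* ((absolute (and (plusp (length path))
                        (char= (char path 0) #\/)))
         (body (if absolute (subseq path 1) path)))
    (cons (if absolute :absolute :relative)
          (mapcar (lambda (segment)
                    (let ((parts (sketch-split segment #\;)))
                      ;; Keep a bare string when there are no parameters.
                      (if (rest parts) parts (first parts))))
                  (sketch-split body #\/)))))

(sketch-parsed-path "foo;10/bar:x;y;z/baz.htm")
```

For the path string used above, the sketch returns the same list as the documented result.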
- -

There is a certain amount of canonicalization that occurs when parsing:

- -
- -
- -

4.0 Interning URIs

- -

This section describes how to intern URIs. Interning is not mandatory. URIs can be used perfectly well without interning them.

- -

Interned URIs in Allegro are like symbols. That is, a string representing a URI, when parsed and interned, will always yield an eq object. For example:

- -
-(eq (intern-uri "http://www.franz.com") 
-    (intern-uri "http://www.franz.com")) 
-
- -

is always true. (Note that two strings with identical contents may or may not be eq in Common Lisp.)
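The general idea behind interning can be sketched with a hash table in plain Common Lisp. Everything here is hypothetical and simplified: the structure interned-thing, the table *sketch-intern-table* and the function sketch-intern are invented for the illustration, and the table is keyed on the raw string, whereas the real facility interns parsed URIs.

```lisp
;; Hypothetical illustration of interning; not the net.uri implementation.
(defstruct interned-thing
  text)

(defvar *sketch-intern-table* (make-hash-table :test #'equal)
  "Maps a string to the one object created for it.")

(defun sketch-intern (string)
  "Return the unique INTERNED-THING for STRING, creating it on first use."
  (or (gethash string *sketch-intern-table*)
      (setf (gethash string *sketch-intern-table*)
            (make-interned-thing :text string))))

;; The same object comes back each time, even for a fresh copy of the string.
(eq (sketch-intern "http://www.franz.com")
    (sketch-intern (copy-seq "http://www.franz.com")))

;; Without interning, two separately created objects are distinct.
(eq (make-interned-thing :text "http://www.franz.com")
    (make-interned-thing :text "http://www.franz.com"))
```

The first eq form is true and the second is false, which is the difference between interned and uninterned objects.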

- -

The functions associated with interning are:

- -
- -
- -

5.0 Allegro CL implementation notes

- -
  1. The following are true:
    - (uri= (parse-uri "http://www.franz.com/")
    -       (parse-uri "http://www.franz.com"))
    - (eq (intern-uri "http://www.franz.com/")
    -     (intern-uri "http://www.franz.com"))
  2. The following is true:
    - (eq (intern-uri "http://www.franz.com:80/foo/bar.htm")
    -     (intern-uri "http://www.franz.com/foo/bar.htm"))
    That is, specifying the default port is the same as specifying no port at all. This is specified in RFC2396.
  3. The scheme and authority are case-insensitive. In Allegro CL, the scheme is a keyword that appears in the normal case for the Lisp in which you are executing.
  4. #u"..." is shorthand for (parse-uri "..."), but if a #u dispatch macro definition already exists, it will not be overridden.
  5. Setting the scheme, host, port, path, query, or fragment slots of a URI object that has been interned will have very bad and unpredictable results.
  6. The printable representation of URIs is cached, for efficiency. The cache is cleared whenever one of the above slots is changed. That is, when you create a URI, the printed representation is cached; when you change one of the above-mentioned slots, the cached representation is cleared and recalculated when the URI is next printed. For example:
- -
-user(10): (setq u #u"http://foo.bar.com/foo/bar") 
-#<uri http://foo.bar.com/foo/bar> 
-user(11): (setf (net.uri:uri-host u) "foo.com") 
-"foo.com" 
-user(12): u 
-#<uri http://foo.com/foo/bar> 
-user(13): 
-
- -

This allows URI behavior to follow the principle of least surprise.
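One way such a cache could be arranged in CLOS is sketched below. This is an assumption-laden illustration, not the actual net.uri code: the class cached-uri, its accessors and the function sketch-render are hypothetical, only the host slot is wired up for invalidation, and the scheme is hard-coded to keep the example short.

```lisp
;; Hypothetical illustration of a cached printed form; not net.uri code.
(defclass cached-uri ()
  ((host :initarg :host :accessor sketch-host)
   (path :initarg :path :accessor sketch-path)
   ;; NIL means "not computed yet" or "invalidated by a slot change".
   (cached-string :initform nil :accessor sketch-cached-string)))

(defmethod (setf sketch-host) :after (new-value (uri cached-uri))
  "Clear the cached string whenever the host is changed."
  (declare (ignore new-value))
  (setf (sketch-cached-string uri) nil))

(defun sketch-render (uri)
  "Return the string form of URI, computing and caching it when needed."
  (or (sketch-cached-string uri)
      (setf (sketch-cached-string uri)
            (format nil "http://~a~a" (sketch-host uri) (sketch-path uri)))))

(let ((u (make-instance 'cached-uri :host "foo.bar.com" :path "/foo/bar")))
  (sketch-render u)                 ; computes and caches the string
  (setf (sketch-host u) "foo.com")  ; the :after method clears the cache
  (sketch-render u))                ; recomputed with the new host
```

Clearing the cache in an :after method on the setter means callers never have to invalidate it by hand, which is what keeps the printed form in step with the slots.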

- -
- -
- -

6.0 Examples

- -
-uri(10): (use-package :net.uri)
-t
-uri(11): (parse-uri "foo")
-#<uri foo>
-uri(12): #u"foo"
-#<uri foo>
-uri(13): (setq base (intern-uri "http://www.franz.com/foo/bar/"))
-#<uri http://www.franz.com/foo/bar/>
-uri(14): (merge-uris (parse-uri "foo.htm") base)
-#<uri http://www.franz.com/foo/bar/foo.htm>
-uri(15): (merge-uris (parse-uri "?foo") base)
-#<uri http://www.franz.com/foo/bar/?foo>
-uri(16): (setq base (intern-uri "http://www.franz.com/foo/bar/baz.htm"))
-#<uri http://www.franz.com/foo/bar/baz.htm>
-uri(17): (merge-uris (parse-uri "foo.htm") base)
-#<uri http://www.franz.com/foo/bar/foo.htm>
-uri(18): (merge-uris #u"?foo" base)
-#<uri http://www.franz.com/foo/bar/?foo>
-uri(19): (describe #u"http://www.franz.com")
-#<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
- The following slots have :instance allocation:
-  scheme        :http
-  host          "www.franz.com"
-  port          nil
-  path          nil
-  query         nil
-  fragment      nil
-  plist         nil
-  escaped       nil
-  string        "http://www.franz.com"
-  parsed-path   nil
-  hashcode      nil
-uri(20): (describe #u"http://www.franz.com/")
-#<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
- The following slots have :instance allocation:
-  scheme        :http
-  host          "www.franz.com"
-  port          nil
-  path          nil
-  query         nil
-  fragment      nil
-  plist         nil
-  escaped       nil
-  string        "http://www.franz.com"
-  parsed-path   nil
-  hashcode      nil
-uri(21): #u"foobar#baz%23xxx"
-#<uri foobar#baz#xxx>
-
- -

Copyright (c) 1998-2001, Franz Inc. Berkeley, CA., USA. All rights reserved. Created 2001.8.16.

- -