This program defines the HTTP protocol scheme for REBOL 3.
In REBOL, a scheme is a set of definition and functions that handle the operations of a kind of port. This scheme handles the HTTP ports, allowing access to resources served via HTTP. Version 1.1 of the HTTP protocol is supported (although some features are not yet implemented at this point).
The scheme is defined by calling the make-scheme function. Our port spec adds the following words to the standard network port spec:
We also add the following words to the standard file info object:
When actor is defined as a block, make-scheme uses append to define it so that the function bodies are not bound to the actor context.
〈Overview〉 ≡
〈Support functions〉
make-scheme [
name: 'http
title: "HyperText Transport Protocol v1.1"
spec: make system/standard/port-spec-net [
path: %/
method: 'get
headers: [ ]
content: none
timeout: 15 ; used for sync ops only
]
info: make system/standard/file-info [
response-line:
headers: none
]
actor: [
〈Actor functions〉
]
]
HTTP ports can be accessed at different levels of abstraction. In the higher level, you mostly want to read the contents of a HTTP URL, synchronously, with the system handling things like redirects, authentication, cookies and so on automatically. (Note that cookies do not automatically persist across sections; see [not yet implemented] for more info.)
In the lower level, you can do multiple requests with a single connection, and work asynchronously. The system still handles most of the stuff for you, but you have much more freedom in the handling of the responses. This is also useful to implement HTTP-based protocols, like WebDav.
〈Actor functions〉 ≡
read is the highest level actor, when called on an url! or a port! without an awake function defined. In these cases, it is synchronous, and just does a GET request and returns the content.
〈Higher level actors〉 ≡
read: func [
port [port!]
; refinements not handled yet
] [
either any-function? :port/awake [
〈Asynchronous read〉
] [
sync-op port [ ]
]
]
write is still high level when called on an url! or a port that does not have an awake function (sync behavior). It allows specifying the method, headers and content of the HTTP request, then returns the content of the response. It is very similar to read.
value can be a block, where you can specify the HTTP method (as a word!) for the request (POST by default), the target (as a file! or url!, port/spec/path by default), any additional headers (as a block!, gets evaluated to an object) and the content for the request (any-string!); if you pass any other value, it gets form'ed (while any-string! values are sent as-is) and sent as the content of a POST request (with a Content-Type of "application/x-www-form-urlencoded").
〈Higher level actors〉 +≡
write: func [
port [port!]
value
] [
unless any [block? :value any-string? :value] [value: form :value]
unless block? value [value: reduce [[Content-Type: "application/x-www-form-urlencoded"] value]]
either any-function? :port/awake [
〈Asynchronous write〉
] [
sync-op port [parse-write-dialect port value]
]
]
This function performs a sync operation on the port (eg. read or write). It sets up an awake function for the HTTP port (that will take care of performing the actual request when the port is ready, and returning from wait when the response is complete), opens and sets up the port if necessary, does the body block, then just waits and returns the received data. See 〈The read-sync-awake function〉 for the read-sync-awake function.
〈The sync-op function〉 ≡
sync-op: func [port body] [
port/awake: :read-sync-awake
unless port/state [open port port/state/close?: yes]
do body
; if we're ready already, perform the request
if port/state/state = 'ready [do-request port]
unless port? wait [port/state/connection port/spec/timeout] [cause make error! "Timeout"]
if port/state/close? [close port]
port/awake: none ; reset so that next op will be sync too
copy port
]
The awake function for synchronous read is very simple, and just starts the request when the port is ready, and returs from wait when the response is complete. For do-request, see 〈Support functions〉.
〈The read-sync-awake function〉 ≡
read-sync-awake: func [event [event!] /local error] [
switch/default event/type [
; read and wrote can be used to show progress
connect ready [
; ready for a request
; users would call read port here - we optimize by calling do-request directly
do-request event/port
false
]
done [
; response is complete, data is ready
true
]
; do we need a close event??
error [
; users can get the error using query - we get it directly
error: event/port/state/error
event/port/state/error: none
cause error
]
; custom event can be used for custom response processing and
; handling of redirects
] [
false
]
]
The open actor is the one responsible of initializing the port and opening the TCP connection to the server. It is either called directly by users (on an url! or port! etc.), or indirectly when doing a read or write to an url! and so on.
Notice that open is always async, and only initiates the TCP connection; it returns immediately after setting up an awake function for the TCP port (see 〈The awake function for the TCP connection〉), so you have to wait on the port for the connection to actually happen.
〈Lower level actors〉 ≡
open: func [
port [port!]
; refinements not handled yet
/local conn
] [
; noop if already open - should throw error?
if port/state [return port]
port/state: context [
state: 'inited
connection:
error: none
close?: no
info: make port/scheme/info [ ]
]
port/state/connection: conn: make port! [
scheme: 'tcp
host: port/spec/host
port-id: port/spec/port-id
]
conn/awake: :http-awake
conn/locals: port ; locals is a bad name - need to be able to use custom word
open conn
port
]
read and write can also behave as async functions when used on an opened port. See 〈Asynchronous read〉 and 〈Asynchronous write〉.
The open? actor checks if the port is open.
〈Lower level actors〉 +≡
open?: func [
port [port!]
] [
; not sure here
found? all [port/state open? port/state/connection]
]
The close actor closes an open port.
〈Lower level actors〉 +≡
close: func [
port [port!]
] [
if port/state [
close port/state/connection
port/state/connection/awake: none
port/state: none
]
port
]
The copy actor returns the port data, unless the request method was HEAD, in which case the response header is returned. You can use query to obtain more info.
〈Lower level actors〉 +≡
copy: func [
port [port!]
] [
either all [port/spec/method = 'head port/state] [
reduce [port/state/info/response-line port/state/info/headers]
] [
if port/data [copy port/data]
]
]
query can be used at any time to obtain information from the port. In case there has been an error ('error event type), the actual error! value is returned; otherwise it just returns the port/state/info object.
Note that query also resets the error condition.
〈Lower level actors〉 +≡
query: func [
port [port!]
/local error
] [
either error? error: port/state/error [
port/state/error: none
error
] [
port/state/info
]
]
length? returns the number of bytes of content that have been read so far. This does not include the header, or the overhead for the chunked transfer, and so on.
〈Lower level actors〉 +≡
length?: func [
port [port!]
] [
either port/data [length? port/data] [0]
]
The http-awake function is set as the awake function of the TCP connection used by the HTTP port. It handles the events on the connection, reads the server's response and parses it, and sends events to the HTTP port's awake function.
〈The awake function for the TCP connection〉 ≡
http-awake: func [event /local port http-port d1 d2 h] [
port: event/port
http-port: port/locals
switch/default event/type [
read [
http-port/awake make event! [type: 'read port: http-port]
check-response http-port
]
wrote [
http-port/awake make event! [type: 'wrote port: http-port]
http-port/state/state: 'reading-headers
read port
false
]
lookup [open port false]
connect [
http-port/state/state: 'ready
http-port/awake make event! [type: 'connect port: http-port]
]
; close event does not seem to work...
] [true]
]
read can be used for async reads. It does only work when called on a port in ready state, so should be called from the HTTP port's awake function when you get the 'connect or 'ready events. It will inintiate the HTTP request; you'll get 'write events while the request is being written, 'wrote when it has been sent, and 'read events while data is being read (useful for progress indicators - you can query the port to obtain info such as the response line, and length? port to get the number of bytes of the content that have been read so far). When all the response data has been downloaded, you'll get a 'done event.
〈Asynchronous read〉 ≡
unless open? port [cause-error 'Access 'not-open port/spec/ref]
; should be something like "port is not ready"
if port/state/state <> 'ready [cause make error! "Port not ready"]
do-request port
port
write can be used for async writes. It's almost the same as read except that you can specify custom values for the request (see 〈Higher level actors〉 and 〈Support functions〉).
〈Asynchronous write〉 ≡
unless open? port [cause-error 'Access 'not-open port/spec/ref]
if port/state/state <> 'ready [cause make error! "Port not ready"]
parse-write-dialect port value
do-request port
port
This section contains all the support functions used by the scheme.
〈Support functions〉 ≡
〈The sync-op function〉
〈The read-sync-awake function〉
〈The awake function for the TCP connection〉
make-http-request is used to generate the actual request sent to the server. Note, that the whole request is being generated in memory, then sent to the server. In some cases it may be desirable to generate the request incrementally. We plan to support this in the near future.
〈Support functions〉 +≡
make-http-request: func [
"Create an HTTP request (returns string!)"
method [word! string!] "E.g. GET, HEAD, POST etc."
target [file! string!] {In case of string!, no escaping is performed (eg. useful to override escaping etc.). Careful!}
headers [block!] "Request headers (set-word! string! pairs)"
; maybe allow content to be a function that can insert directly into output,
; for maximum efficiency (no Content-Length in that case though)
content [any-string! none!] {Request contents (Content-Length is created automatically). Empty string not exactly like none.}
/local result
] [
result: rejoin [
uppercase form method #" "
; needs better escaping!!
either file? target [next mold target] [target]
" HTTP/1.1" CRLF
]
foreach [word string] headers [
repend result [mold word #" " string CRLF]
]
if content [
repend result ["Content-Length: " length? content CRLF]
]
append result CRLF
if content [append result content]
result
]
The do-request function makes the HTTP request to the server.
Need to change it so that it can write more than 32000 bytes per request.
〈Support functions〉 +≡
do-request: func [
"Perform an HTTP request"
port [port!]
] [
port/spec/headers: third make make object! [
Accept: "*/*"
Host: either port/spec/port-id <> 80 [
rejoin [form port/spec/host #":" port/spec/port-id]
] [
form port/spec/host
]
User-Agent: "REBOL"
] port/spec/headers
port/state/state: 'doing-request
port/state/info/headers: port/state/info/response-line: port/data: none
write port/state/connection
make-http-request port/spec/method to file! any [port/spec/path %/]
port/spec/headers port/spec/content
]
The parse-write-dialect function parses the block passed to write and extracts the request method, request target, request headers and request content. The values in the port object are set accordingly so that they can be used later on by the do-request function.
〈Support functions〉 +≡
parse-write-dialect: func [port block] [
parse block [
[set block word! (port/spec/method: block) | (port/spec/method: 'post)]
opt [set block [file! | url!] (port/spec/path: block)]
[set block block! (port/spec/headers: block) | (port/spec/headers: [ ])]
[set block any-string! (port/spec/content: block) | (port/spec/content: none)]
]
]
check-response is the function that checks the response header and handles server errors, redirects, and so on. Returns true if we are done (ie, port/data is the actual requested content).
〈Support functions〉 +≡
check-response: func [port /local conn res h line] [
conn: port/state/connection
h: port/state/info/headers
line: port/state/info/response-line
〈Check if header has been received, and parse it〉
〈If header has not been received completely yet, read more data〉
res: false
parse/all line [
"HTTP/1." [#"0" | #"1"] some #" " [
#"1" (〈Handle Informational response〉)
|
#"2" [〈Handle Success responses〉]
|
#"3" [〈Handle Redirect responses〉]
|
#"4" [〈Handle Client Error responses〉]
|
#"5" (〈Handle Server Error responses〉)
]
|
(
port/state/error: make error! "HTTP response version not supported"
res: port/awake make event! [type: 'error port: port]
close port
)
]
res
]
〈Check if header has been received, and parse it〉 ≡
if all [
; assuming HTTP 1.x response - need to handle legacy HTTP 0.9 too
not h
d1: find conn/data crlf
d2: find/tail d1 crlf2
] [
port/state/info/response-line: line: as-string copy/part conn/data d1
port/state/info/headers: h: construct/with d1 http-response-headers
if h/content-length [h/content-length: to integer! h/content-length]
remove/part conn/data d2
port/state/state: 'reading-data
]
〈If header has not been received completely yet, read more data〉 ≡
unless h [
read conn
return false
]
Informational (1xx) responses are just ignored for now.
〈Handle Informational response〉 ≡
port/state/info/headers: port/state/info/response-line: port/data: none
port/state/state: 'reading-headers
read conn
〈Handle Success responses〉 ≡
["04" | "05"] ( ; No Content / Reset Content
port/state/state: 'ready
〈Send 'done and 'ready events〉
)
|
( ; Response has content
either port/spec/method = 'head [
; no need to read content
port/state/state: 'ready
〈Send 'done and 'ready events〉
] [
res: check-data port
if all [not res port/state/state = 'ready] [〈Send 'done and 'ready events〉]
]
)
〈Handle Redirect responses〉 ≡
"03" ( ; See other
either port/spec/method = 'head [
port/state/state: 'ready
res: port/awake make event! [type: 'custom port: port code: 0]
] [
res: check-data port
]
if all [not res port/state/state = 'ready] [
either all [in h 'Location not res] [
port/spec/method: 'get
res: do-redirect port h/location
] [
port/state/error: make error! "Redirect requires manual intervention"
res: port/awake make event! [type: 'error port: port]
]
]
)
|
"04" (port/state/state: 'ready 〈Send 'done and 'ready events〉) ; Not modified
|
"05" ( ; Use proxy
port/state/state: 'ready
port/state/error: make error! "Proxies not supported yet"
res: port/awake make event! [type: 'error port: port]
)
|
( ; Any other redir
either port/spec/method = 'head [
port/state/state: 'ready
res: port/awake make event! [type: 'custom port: port code: 0]
] [
res: check-data port
]
if all [not res port/state/state = 'ready] [
either all [
; automatic redir only for GET or HEAD
find [get head] port/spec/method
in h 'Location
] [
res: do-redirect port h/location
] [
port/state/error: make error! "Redirect requires manual intervention"
res: port/awake make event! [type: 'error port: port]
]
]
)
〈Handle Client Error responses〉 ≡
( ; in all cases...
either port/spec/method = 'head [
port/state/state: 'ready
] [
check-data port
]
)
"01" ( ; Unauthorized
port/state/error: make error! "Authentication not supported yet"
res: port/awake make event! [type: 'error port: port]
)
|
"07" ( ; Proxy auth required
port/state/error: make error! "Authentication and proxies not supported yet"
res: port/awake make event! [type: 'error port: port]
)
|
; Any other case
(
port/state/error: make error! join "Server error: " line
res: port/awake make event! [type: 'error port: port]
)
〈Handle Server Error responses〉 ≡
either port/spec/method = 'head [
port/state/state: 'ready
] [
check-data port
]
port/state/error: make error! join "Server error: " line
res: port/awake make event! [type: 'error port: port]
〈Send 'done and 'ready events〉 ≡
res: port/awake make event! [type: 'done port: port]
unless res [res: port/awake make event! [type: 'ready port: port]]
The above function needs crlf2 and the http-response-headers prototype object.
〈Support functions〉 +≡
crlf2: {^M
^M
}
http-response-headers: context [
Content-Length:
Transfer-Encoding: none
]
do-redirect handles redirections.
〈Support functions〉 +≡
do-redirect: func [port [port!] new-uri [url! string! file!]] [
new-uri: construct/with decode-url new-uri port/scheme/spec
if new-uri/scheme <> 'http [
port/state/error: make error! {Redirect to a protocol different from HTTP not supported}
return port/awake make event! [type: 'error port: port]
]
either all [
; we should probably check that the IP matches,
; not just the host name, however that requires an extra lookup...
new-uri/host = port/spec/host
new-uri/port-id = port/spec/port-id
] [
; redirect to same host
port/spec/path: new-uri/path
do-request port
false
] [
port/state/error: make error! "Redirect to other host - requires custom handling"
port/awake make event! [type: 'error port: port]
]
]
check-data checks that the response content has been transferred completely.
〈Support functions〉 +≡
check-data: func [port /local h res data out chunk-size mk1 mk2 trailer] [
h: port/state/info/headers
res: false
case [
; transfer encoding needs to have precedence over content length
h/transfer-encoding = "chunked" [
〈Handle chunked transfer encoding〉
]
integer? h/content-length [
port/data: port/state/connection/data
either h/content-length <= length? port/data [
port/state/state: 'ready
port/state/connection/data: make binary! 32000
res: port/awake make event! [type: 'custom port: port code: 0]
] [
read port/state/connection
]
]
true [
; it is assumed that the server will close the connection
port/data: port/state/connection/data
read port/state/connection
]
]
res
]
〈Handle chunked transfer encoding〉 ≡
data: port/state/connection/data
out: port/data: make binary! length? data
until [
either parse/all data [
copy chunk-size some hex-digits thru crlf mk1: to end
] [
chunk-size: to integer! to issue! chunk-size
either chunk-size = 0 [
if parse/all mk1 [
crlf (trailer: "") to end | copy trailer to crlf2 to end
] [
trailer: construct trailer
append h third trailer
port/state/state: 'ready
res: port/awake make event! [type: 'custom port: port code: 0]
clear data
]
true
] [
either parse/all mk1 [
chunk-size skip mk2: crlf to end
] [
insert/part tail out mk1 mk2
remove/part data as-binary skip mk2 2
empty? data
] [
true
]
]
] [
true
]
]
unless port/state/state = 'ready [read port/state/connection]
Charset used by check-data:
〈Support functions〉 +≡
hex-digits: charset "1234567890abcdefABCDEF"