Sample Janus Sockets programs

From m204wiki

Latest revision as of 15:14, 26 April 2013

Retrieving web pages: HTTP, HTML

The examples in this section show techniques for composing HTTP requests, sending them to a web server, and retrieving the response.

The current specification for HTTP (HTTP/1.1) defines, in simplified form, the following exchange:

  1. Client ("browser") sends a request, which contains:
    1. The request line, ended by CRLF (x'0D0A'), and containing:
      1. The method, for example, GET to get a web page
      2. The ID, or "URL"
      3. The protocol version, for example, HTTP/1.1
    2. Additional header lines, each ended by CRLF
    3. An empty line, that is, an immediate CRLF
    4. The body
  2. Server interprets the request and sends a response, which contains:
    1. The response status line, ended by CRLF, and containing:
      1. The HTTP version ID (for example, HTTP/1.1), followed by spaces
      2. The status code number, followed by spaces
      3. The status code message
    2. Additional header lines, each ended by CRLF
    3. An empty line, that is, an immediate CRLF
    4. The body

Header lines contain a case-insensitive field name, followed by a colon (:), and spaces followed by the field value.
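The structure above can be sketched outside of User Language. The following Python fragment is only an illustration: the helper names `build_request` and `parse_response_head`, the host `www.example.com`, and the canned response are assumptions made for the example, not part of Janus Sockets.

```python
CRLF = "\r\n"

def build_request(method, path, version="HTTP/1.0", headers=None, body=""):
    """Compose the request line, header lines, empty line, and body."""
    lines = [f"{method} {path} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; an immediate CRLF ends the header section.
    return (CRLF.join(lines) + CRLF + CRLF + body).encode("ascii")

def parse_response_head(raw):
    """Split a response into (version, status code, message, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split(CRLF)
    version, code, message = status_line.split(None, 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), message, headers, body

# Hypothetical host and canned response, used only for illustration.
request = build_request("GET", "/main.html", headers={"Host": "www.example.com"})
response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hi</html>"
version, code, message, headers, body = parse_response_head(response)
print(request)
print(version, code, message, headers["content-type"], body)
```

The sketch assumes a status line that always carries a message and a response that arrives in one piece; a real client would have to handle neither being true.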

The ASCII string of carriage return followed by linefeed (X'0D0A') is used in the HTTP protocol to separate header lines and to separate the header from the body. The response body is the web "page," which may contain a variety of data. The page frequently contains HTML, which is usually divided into "lines"; the line separators, however, may be any of 0D0A, 0D, or 0A. Although these are ambiguous parse tokens, as discussed in "Ambiguous PRSTOK strings", the pitfalls that affect general request-response protocols do not arise in a single request-response exchange when the only ambiguity is in the response stream.

Simple echo of HTTP/HTML

This example shows a complete program that communicates with a remote web server and that parses and prints the returned HTML stream.

Note that the third argument to $Sock_RecvPrs is -1, which means that there is no limit to the length of each line parsed, hence no limit to the number of bytes discarded at the end of the line. The User Language program examines only the first 78 bytes (the size of %s) of each line of HTML. "Echo of HTTP/HTML with continuation lines" has an example that examines complete lines, printing continuation lines as needed.

    JANUS DEFINE MYBROWSE * CLSOCK 60 REMOTE * 80 LINEND 0D0A
    JANUS START MYBROWSE

    Begin
    %socket Is Float
    %rc     Is Float
    %host   Is String Len 255
    %page   Is String Len 255
    %s      Is String Len 78

    %host = 'www.sirius-software.com'
    %page = 'main.html'

    * Connect using the port defined above:
    %socket = $sock_conn('MYBROWSE', %host)
    If %socket Lt 0 Then
       Stop
    End If

    * Request line: method, URL, protocol version
    %rc = $sock_send(%socket, 'GET /')
    %rc = $sock_send(%socket, %page)
    %rc = $sock_sendln(%socket, ' HTTP/1.0')
    * Null line indicates end of request:
    %rc = $sock_sendln(%socket, '')

    %s = $sock_set(%socket, 'PRSTOK', 'AMBIG|0D0A|0A|0D')
    Repeat
       %rc = $sock_recvprs(%socket, %s, -1)
       Print %s
       If %rc Le 0 Then
          Loop End
       End If
    End Repeat
    %rc = $sock_close(%socket)
    End

Echo of HTTP/HTML with continuation lines

This example shows a complete program that communicates with a remote web server and that parses and prints the returned HTML stream. This program differs from that in "Simple echo of HTTP/HTML" because it examines complete lines, printing continuation lines as needed.

The fourth argument to $Sock_RecvPrs (%x in the example below) is the index of the separator string found in the received data. When this value is 0, it means that, within the data that has been received upon return from $Sock_RecvPrs, no separator has been found. This indicates one of the following:

  • All data has been received (or all data within RECVLIM, which is not used in this example)
  • The limit set by its third argument (here defaulting to 0, which means the size of the second argument) caused $Sock_RecvPrs to return before it might have found a separator.

This information (%x Eq 0) works adequately for the example.
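The continuation-line technique can be modeled in a few lines of Python. This is a simplified sketch of the behavior described above, not a description of how $Sock_RecvPrs is implemented: `recv_pieces` and `echo` are hypothetical names, the whole stream is assumed to be in memory, and the limit of 8 bytes is chosen only to keep the sample short.

```python
def recv_pieces(data, seps, limit):
    """Yield (piece, sep_index); sep_index is 0 when no separator was found."""
    pos = 0
    while pos < len(data):
        found = None
        # Look for a separator that starts within `limit` bytes of pos.
        for i in range(pos, min(len(data), pos + limit + 1)):
            matches = [s for s in seps if data.startswith(s, i)]
            if matches:
                found = (i, max(matches, key=len))  # longest separator wins
                break
        if found is not None:
            i, sep = found
            yield data[pos:i], seps.index(sep) + 1
            pos = i + len(sep)
        else:
            # Limit reached (or data exhausted) before any separator.
            yield data[pos:pos + limit], 0
            pos += limit

def echo(data, seps, limit):
    """Prefix each piece with '->' when it continues the previous piece."""
    out, prefix = [], "  "
    for piece, sep_index in recv_pieces(data, seps, limit):
        out.append(prefix + piece.decode("ascii"))
        prefix = "->" if sep_index == 0 else "  "
    return out

# Separator order mirrors AMBIG|0D0A|0A|0D.
SEPS = [b"\r\n", b"\n", b"\r"]
for text in echo(b"short\r\nabcdefghij\nend", SEPS, 8):
    print(text)
```

With the sample input, the ten-byte line is split into an eight-byte piece and a two-byte continuation marked with `->`.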

    JANUS DEFINE MYBROWSE * CLSOCK 60 REMOTE * 80 LINEND 0D0A
    JANUS START MYBROWSE

    Begin
    %socket Is Float
    %rc     Is Float
    %x      Is Float
    %host   Is String Len 255
    %page   Is String Len 255
    %s      Is String Len 75
    %pre    Is String Len 2

    %host = 'www.sirius-software.com'
    %page = 'main.html'

    * Connect using the port defined above:
    %socket = $sock_conn('MYBROWSE', %host)
    If %socket Lt 0 Then
       Stop
    End If

    %rc = $sock_send(%socket, 'GET /')
    %rc = $sock_send(%socket, %page)
    %rc = $sock_sendln(%socket, ' HTTP/1.0')
    * Null line indicates end of request:
    %rc = $sock_sendln(%socket, '')

    %s = $sock_set(%socket, 'PRSTOK', 'AMBIG|0D0A|0A|0D')
    %pre = '  '
    Repeat
       %rc = $sock_recvprs(%socket, %s, , %x)
       Print %pre And %s
       If %rc Le 0 Then
          Loop End
       End If
       * No separator found: the next piece continues this line
       %pre = '  '
       If %x Eq 0 Then
          %pre = '->'
       End If
    End Repeat
    %rc = $sock_close(%socket)
    End

Request-response protocol

The first parts of this section exhibit a very simple protocol which can be used between any two sockets platforms. This example is shown using Janus Sockets for both the client and server, but either the client or the server portion can be easily implemented on any platform with a sockets interface.

This protocol makes use of the $Sock_RecvPrs function to receive the client request and server response. The request consists of multiple strings, each separated by an "end of line" sequence. This is a natural sequence for various platforms; for example, you might want to copy the contents of a text file as some of the data in the request.

In some environments, end of line is X'0D' and in others X'0D0A'. Since one line-end alternative is a prefix of another, it is necessary to use the AMBIG specification on the PRSTOK parameter. The last part of this section ("Ambiguous PRSTOK strings") discusses some considerations for ambiguous PRSTOK parameters.

Single request client code

This client sends a single request and receives a response. A request contains an action, followed by the data for the request, followed by a distinct line for the end of the request. After sending the request, it receives the response from the server.

This example could easily be written for another sockets platform, and the data could be conveniently obtained from a text file.

As discussed in "Ambiguous PRSTOK strings", it is important to use an unambiguous separator string at the end of the request, one that is not a prefix of any other separator. In our protocol, we are using a single byte of hex 07 (the ASCII "Bell" character) as this separator to "turn around" the connection from both the client and server perspective.
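As a rough illustration of the framing just described, the following Python sketch builds a request whose final GO line ends with 07 and reads a response terminated the same way. The names `frame_request` and `read_status` are hypothetical, and the choice of 0D as the ordinary line end is an assumption made for the example.

```python
BELL = b"\x07"  # unambiguous "turn around" separator

def frame_request(action, data_lines, line_end=b"\r"):
    """Action line, data lines, then a GO line ended by 07."""
    out = action.encode("ascii") + line_end
    for line in data_lines:
        out += line.encode("ascii") + line_end
    return out + b"GO" + BELL

def read_status(stream):
    """Return the status text that precedes the first 07 byte."""
    status, sep, _ = stream.partition(BELL)
    if not sep:
        raise ValueError("response not terminated by 07")
    return status.decode("ascii")

message = frame_request("ACTION1", ["first", "second"])
print(message)
print(read_status(b"GOOD\x07"))
```

Because 07 is not a prefix of any other separator, the receiver can act on it as soon as it arrives.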

    JANUS DEFINE SOCKA * CLSOCK 10 REMOTE SOCKA_SRV 7733
    JANUS START SOCKA

    Begin
    %x Float
    %r Float
    %status String Len 10

    %x = $sock_conn('SOCKA')
    If %x Lt 0 Then
       Stop
    End If

    * Send the action, then one line per data record:
    %r = $sock_sendln(%x, 'ACTION1')
    Fr Where RECTYPE = 'MYDATA'
       %r = $sock_sendln(%x, DATA)
    End For

    * The GO line ends with the unambiguous separator X'07':
    %r = $sock_sendln(%x, 'GO', 'LINEND 07')
    * The options string is the fifth argument; the fourth
    * (separator index) is omitted here:
    %r = $sock_recvprs(%x, %status, , , 'PRSTOK 07')
    If %status Ne 'GOOD' Then
       Stop
    End If
    %r = $sock_close(%x)
    End

Single request server code

The server receives a single request and sends a response. See details about the protocol in "Single request client code".

Note: As always, the connection initiating SRVSOCK processing is accessible as socket number 1.

    JANUS DEFINE SOCKS 7733 SRVSOCK 10 CMD 'INCLUDE SINGLE_REQ'
    JANUS START SOCKS

    PROCEDURE SINGLE_REQ
    Begin
    %r    Float
    %sepx Float
    %data String Len 255

    %r = $sock_set(1, 'PRSTOK', 'AMBIG|0D0A|0D|0A|07')
    %r = $sock_set(1, 'LINEND', '07')

    * The first line of the request is the action:
    %r = $sock_recvprs(1, %data)
    If %r Le 0 Then
       Stop
    End If

    If %data Eq 'ACTION1' Then
       Repeat
          * %sepx (fourth argument) receives the separator index:
          %r = $sock_recvprs(1, %data, , %sepx)
          If %r Le 0 Then
             Stop
          * Separator 4 is X'07', which ends the request:
          Elseif %sepx Eq 4 Then
             If %data Ne 'GO' Then
                %r = $sock_sendln(1, 'BAD')
             Else
                %r = $sock_sendln(1, 'GOOD')
             End If
             Stop
          End If
          ... processing a line of data
       End Repeat
    Elseif %data Eq ... other operation
       Repeat
          %r = $sock_recvprs(1, %data, , %sepx)
          ... (same structure as above)
       End Repeat
    Elseif %data Eq ... other operations
       ...
    End If
    %r = $sock_close(1)
    End
    END PROCEDURE

Multiple request client code

This client sends a series of requests and receives a response for each one. The protocol is the same as shown in "Single request client code", except that at the end of the series of requests it sends a special action to indicate the end of the conversation.

In this example, the use of the unambiguous separator at the end of the response is more important than in the previous example, because the server is going to receive after sending the response. With a single request-response protocol, as long as you are sure that the server will close the connection after sending the response, the client will receive the complete response even if it is terminated by an ambiguous separator.

It is good practice, however, to use unambiguous separators to end each request and response, since, for example, the server may be poorly designed and the connection could stay open for an extended time.

    JANUS DEFINE SOCKA * CLSOCK 10 REMOTE SOCKA_SRV 7733
    JANUS START SOCKA

    Begin
    %x Float
    %r Float
    %status String Len 10

    %x = $sock_conn('SOCKA')
    If %x Lt 0 Then
       Stop
    End If
    %r = $sock_set(%x, 'PRSTOK', '07')

    Fr Where RECTYPE = 'REQUEST'
       %r = $sock_sendln(%x, REQ_TYPE)
       D: FEO DATA_LINE
          %r = $sock_sendln(%x, VALUE IN D)
       End For
       %r = $sock_sendln(%x, 'GO', 'LINEND 07')
       %r = $sock_recvprs(%x, %status)
       If %status Ne 'GOOD' Then
          Stop
       End If
    End For

    %r = $sock_sendln(%x, 'DONE')
    %r = $sock_close(%x)
    End

Multiple request server code

The server receives a series of requests and sends a response for each one. The protocol is the same as shown in "Single request client code", except that at the end of the series of requests the client sends a special action to indicate the end of the conversation.
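The shape of the multiple-request conversation can be sketched in Python against an in-memory byte stream. This is only a model under stated assumptions: `serve` is a hypothetical name, the whole conversation is assumed to be available at once, and the regular expression lists the longest alternative first so that 0D0A is preferred over 0D.

```python
import re

# Longest alternative first so that 0D0A wins over 0D.
SEPARATOR = re.compile(rb"\r\n|\r|\n|\x07")

def serve(stream):
    """Return one response per request; stop at the DONE action."""
    responses, lines = [], []
    pos = 0
    for match in SEPARATOR.finditer(stream):
        line = stream[pos:match.start()]
        pos = match.end()
        if not lines and line == b"DONE":
            break  # special action: end of conversation
        lines.append(line)
        if match.group() == b"\x07":
            # 07 turns the connection around: the request is complete.
            ok = lines[-1] == b"GO"
            responses.append(b"GOOD\x07" if ok else b"BAD\x07")
            lines = []
    return responses

conversation = b"ACTION1\rdata a\rGO\x07ACTION1\r\ndata b\r\nGO\x07DONE\r"
print(serve(conversation))
```

Each response is itself terminated by 07, so a client can safely send its next request once the response has been parsed.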

Note: As always, the connection initiating SRVSOCK processing is accessible as socket number 1.

    JANUS DEFINE SOCKS 7733 SRVSOCK 10 CMD 'INCLUDE MULTIPLE_REQ'
    JANUS START SOCKS

    PROCEDURE MULTIPLE_REQ
    Begin
    %r    Float
    %sepx Float
    %data String Len 255

    %r = $sock_set(1, 'PRSTOK', 'AMBIG|0D0A|0D|0A|07')
    %r = $sock_set(1, 'LINEND', '07')

    Repeat
       * The first line of each request is the action:
       %r = $sock_recvprs(1, %data)
       If %r Le 0 Then
          Loop End
       End If
       If %data Eq 'DONE' Then
          Loop End
       Elseif %data Eq 'ACTION1' Then
          Repeat
             * %sepx (fourth argument) receives the separator index:
             %r = $sock_recvprs(1, %data, , %sepx)
             If %r Le 0 Then
                Stop
             * Separator 4 is X'07', which ends the request:
             Elseif %sepx Eq 4 Then
                If %data Eq 'DONE' Then
                   %r = $sock_sendln(1, 'BAD')
                   Loop End
                End If
                %r = $sock_sendln(1, 'GOOD')
                Loop End
             End If
             ... processing a line of data
          End Repeat
       Elseif %data Eq ... another operation
          Repeat
             %r = $sock_recvprs(1, %data, , %sepx)
             ... (same structure as above)
          End Repeat
       Elseif %data Eq ... other operations
          ...
       End If
    End Repeat
    %r = $sock_close(1)
    End
    END PROCEDURE

Ambiguous PRSTOK strings

In the example code shown in the preceding sections, the lines of data are sent with either CRLF, CR, or LF (0D0A|0D|0A) separating them, but the line that signals the end of a request is terminated with hex 07. The separators 0D0A and 0D are ambiguous because 0D is a prefix of 0D0A. When a 0D is received, Janus Sockets waits until it receives an additional byte to distinguish which of the two separators was sent. The unambiguous 07 is used in the example in "Multiple request client code" to end the GO line, because using an ambiguous separator when going from send to receive mode can cause both sides to wait for data that is not forthcoming.

In a request-response protocol, when one side finishes sending (the client finishes a request, or the server finishes a response), it then does a receive (the client to get the response, the server to get the next request). When either side does the receive, it must be ensured that the other side does not continue to wait for data on its own receive. Otherwise, both sides are receiving and nothing is being sent for them to receive. In this case, they wait indefinitely: until the TIMEOUT value expires at one side or the other, or until (using Model 204) one side is bumped.

If the server accepts 0D0A|0D|0A (or even just 0D0A|0D) as the terminator of the GO line, and the client sends 0D as its LINEND and then does a receive, the server will wait to receive a byte after the 0D to determine whether the separator was 0D0A or just 0D. This situation where both sides wait forever is avoided by using an unambiguous separator (07 in this case) as the terminator of the GO line. Similarly, the multiple request server uses the unambiguous 07 to terminate the response, so that the client can send the next request.
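The stall can be demonstrated with a small incremental parser in Python. This is a sketch of the general prefix problem rather than of Janus Sockets internals; `try_parse` is a hypothetical name, and returning `None` stands for "must wait for more bytes".

```python
def try_parse(buffer, seps):
    """Return (line, sep, rest), or None if more bytes are needed to decide."""
    for i in range(len(buffer)):
        tail = buffer[i:]
        # A longer separator might still be completed by bytes not yet received.
        pending = [s for s in seps if len(s) > len(tail) and s.startswith(tail)]
        if pending:
            return None
        matches = [s for s in seps if buffer.startswith(s, i)]
        if matches:
            sep = max(matches, key=len)  # longest separator wins
            return buffer[:i], sep, buffer[i + len(sep):]
    return None

SEPS = [b"\r\n", b"\r", b"\n", b"\x07"]

print(try_parse(b"GO\r", SEPS))     # cannot tell 0D from 0D0A yet
print(try_parse(b"GO\x07", SEPS))   # decided as soon as 07 arrives
print(try_parse(b"GO\r\n", SEPS))   # decided: separator is 0D0A
```

If the sender has already switched to receiving after the lone 0D, the byte that would settle the first case never arrives, which is the situation the 07 terminator avoids.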

If possible, use unambiguous separators when designing your protocols. If that is very inconvenient, as when copying text files that use the ambiguous line-end strings 0D0A|0D|0A, you must use the AMBIG| specification at the start of your PRSTOK string, and you should take special care when composing the terminator of a line that is the last one before a switch to do a receive. The AMBIG| indicator highlights the necessity to ensure that your protocol, that is, the sockets application processing logic, avoids the pitfalls introduced by ambiguous separators.

Since 0A and 0D0A are not prefixes of another separator, either could be used as the GO terminator, if the line GO is guaranteed not to occur as a data line. The server code (for example, in "Single request server code") uses the extra protection of a unique line and a unique separator to double-check that the client is not violating the protocol.

Again, the "wait forever" problem that ambiguous separators may produce only applies to a separator that occurs at a send/receive boundary. Thus, as mentioned in "Retrieving web pages: HTTP, HTML", a single request-response protocol presents no problems if the only ambiguity is in the response stream.

Other than "wait forever" problems, the other question raised by ambiguous separators concerns null "lines." The simple rule is that the longest separator possible is chosen. So, consider the following somewhat silly client code that "hand codes" its separators:

    %r = $sock_send(%sok, 'Line 1')
    %r = $sock_send(%sok, $x2c('0D'), 'BINARY')
    %line2 = ''
    %r = $sock_send(%sok, %line2)
    %r = $sock_send(%sok, $x2c('0A'), 'BINARY')
    %r = $sock_send(%sok, 'Line 3')
    %r = $sock_send(%sok, $x2c('0D07'), 'BINARY')

If the corresponding server code is:

    %r = $sock_set(1, 'PRSTOK', 'AMBIG|0D0A|0D|0A|07')
    Repeat
       %r = $sock_recvprs(1, %targ)
       If %r Le 0 Then
          Loop End
       End If
       Print 'Length ' $Len(%targ) ': ' %targ
    End Repeat

The results will be:

    Length 6: Line 1
    Length 6: Line 3
    Length 0:

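What might appear to be the sending of three lines by the client, with the second line null, is actually seen by the server as two lines, because the hexadecimal string 0D0A is seen as one separator, not two. The final zero-length line arises because the 07 that follows the last 0D is itself a separator, so it terminates an empty line.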
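The longest-separator rule is easy to check with a short Python model. The function name `split_longest` is hypothetical, and the sketch assumes the complete byte stream is available, so the waiting behavior discussed earlier does not come into play.

```python
def split_longest(data, seps):
    """Split data on separators, choosing the longest separator at each position."""
    lines, start, i = [], 0, 0
    while i < len(data):
        matches = [s for s in seps if data.startswith(s, i)]
        if matches:
            sep = max(matches, key=len)
            lines.append(data[start:i])
            i += len(sep)
            start = i
        else:
            i += 1
    return lines

# The bytes sent by the client: 'Line 1', 0D, a null string, 0A, 'Line 3', 0D07.
stream = b"Line 1" + b"\r" + b"" + b"\n" + b"Line 3" + b"\r\x07"
for line in split_longest(stream, [b"\r\n", b"\r", b"\n", b"\x07"]):
    print(f"Length {len(line)}: {line.decode('ascii')}")
```

The model yields two six-byte lines followed by one empty line, matching the results shown for the server code.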

Print capturing example

This example shows the use of print capturing. This application uses WRITE IMAGE ON TERMINAL for output, which might be appropriate, for example, in a BATCH2 input stream.

Note that the final Print statement goes to the terminal, since there are no sockets with print captured at that point.

For more information about print capturing, see "Print capturing hierarchy and other considerations".

    Begin
    IMAGE LIN
       S IS STRING LEN 255
    END IMAGE
    OPEN TERMINAL FOR OUTPUT

    %sock = $sock_conn('SOCKEM')
    If %sock Lt 0 Then
       Stop
    End If

    * From here on, printed output is captured by the socket:
    %s = $sock_capture(%sock, 'ON')
    Prepare Image LIN
    Print 'Hello, world!'
    %lin:S = 'Look at me - I sent a line!'
    Write Image LIN On Terminal
    Print 'Goodbye, world!'
    %x = $sock_close(%sock)

    * No socket is capturing now, so this goes to the terminal:
    Print 'Mission accomplished'
    End

See also