Longstrings: Difference between revisions

m (Created page with "<!-- Longstrings --> The <tt>Longstring</tt> datatype was introduced in ''Sirius Mods'' version 6.2. Longstrings appear as a native ''Model 204'' datatype and are defined in the ...")
 
m (typo)
 
(17 intermediate revisions by 5 users not shown)
<!-- Longstrings -->
<!-- <var>Longstrings</var> -->
The <tt>Longstring</tt> datatype was introduced in ''Sirius Mods'' version 6.2.
As of <var class="product">Model 204</var> version 7.5, <var>Longstrings</var> appear as a native <var class="product">Model 204</var> datatype and are defined in the same way as other variable datatypes:
Longstrings appear as a native ''Model 204'' datatype and are defined
<p class="code">%name  is longstring
in the same way as other variable datatypes:
</p>
<pre style="xmp">
<var>Longstring</var> variables are largely interchangeable with <var>String</var> variables, with the exception that a <var>Longstring</var> can have a length up to 2**31-1 bytes, while <var>String</var> variables have a maximum length of 255 bytes. The <var>Variables Are</var> statement and the <var>VTYPE</var> parameter do not allow <var>Longstring</var> to be set as a default type, so all <var>Longstring</var> variables must be explicitly declared as such. <var>Longstring</var> variables can be defined as <var>Common</var> and as subroutine parameters, but there is currently no support for <var>Static</var> <var>Longstring</var> variables. <var>Longstrings</var> may be specified in an <var>Initial</var> clause.
    %name  is longstring
</pre>
Longstrings are largely interchangeable with String variables,
with the exception that Longstrings can have a length up to
2**31-1 bytes while String variables have a maximum length of
255 bytes.
The <tt>Variables Are</tt> statement and the <tt>VTYPE</tt> parameter do
not allow <tt>Longstring</tt> to be set as a default type, so all Longstring
variables must be explicitly declared as such.
Longstring variables can be defined as Common and as subroutine
parameters, but there is currently no support for Static longstring
variables.
''Sirius Mods'' version 7.2 introduced support for the Initial clause for longstrings.
   
   
Like other %variables, Longstrings cannot be declared as <tt>Global</tt>
Like other %variables, a <var>Longstring</var> cannot be declared as <var>Global</var> on its declaration. However, a <var>Longstring</var> %variable can be dynamically bound to a global <var>Longstring</var> with the <var>[[$Lstr_Global_and_$Lstr_Session|$Lstr_global]]</var> function, and it can be dynamically bound to a session global <var>Longstring</var> with the <var>[[$Lstr_Global_and_$Lstr_Session|$Lstr_session]]</var> function.
on their declarations.
However, as of ''Sirius Mods'' version 6.3, a Longstring %variable can be dynamically
bound to a global longstring with the $lstr_global function, and
it can be dynamically bound to a
session global Longstring with the $lstr_session
function.
   
   
The value of a global or session Longstring can also be retrieved
The value of a global or session <var>Longstring</var> can also be retrieved with <var>[[$Lstr_Global_Get_and_$Lstr_Session_Get|$Lstr_global_get]]</var> or <var>[[$Lstr_Global_Get_and_$Lstr_Session_Get|$Lstr_session_get]]</var>, and it can be updated with <var>[[$Lstr_Global_Set_and_$Lstr_Session_Set|$Lstr_global_set]]</var> or <var>[[$Lstr_Global_Set_and_$Lstr_Session_Set|$Lstr_session_set]]</var>.
with $lstr_global_get and $lstr_session_get, and it can be updated with
$lstr_global_set and $lstr_session_set.
   
   
Longstrings can also be declared as arrays:
<var>Longstrings</var> can also be declared as arrays:
<pre style="xmp">
<p class="code">%heaps  is longstring array(10)
    %heaps  is longstring array(10)
</p>
</pre>
   
   
The Longstring datatype is not supported inside images.
The <var>Longstring</var> datatype is not supported inside images. However, image items with length greater than 255 are now supported:
However, ''Sirius Mods'' version 7.2 introduced support for image items with
<p class="code">image foo
length greater than 255:
  bar  is string len 300
<pre style="xmp">
end image
    image foo
</p>
      bar  is string len 300
    end image
</pre>
   
   
While such image items can't have arbitrary lengths up to 2**31-1
While such image items can't have arbitrary lengths up to 2**31-1 like other <var>Longstring</var> variables, they exhibit the same behavior as other <var>Longstring</var> variables in request cancellation in the case of truncation, and in upgrading <var>With</var> operations to <var>Longstring</var> <var>With</var> operations.
like other longstring variables, they exhibit the same behavior as
other longstring variables in request cancellation in the case of
While it might be tempting to redefine many or all <code>String Len 255</code> variables as <var>Longstring</var>, there are a few subtle issues discussed in this chapter that might result in problems should this be done.  This is not to say that many such variables shouldn't be converted to <var>Longstring</var>, but it might not be as simple as a one-line editing change.
truncation and in upgrading With operations to longstring With
operations.
   
   
While it might be tempting to redefine many or all String Len 255
variables as Longstring, there are a few subtle issues discussed
in this chapter that might result in problems should this be done.
This is not to say that many such variables shouldn't be converted
to Longstring, but it might not be as simple as a one-line editing change.
==Truncation==
==Truncation==
One key difference between Longstrings and regular Strings
One key difference between a <var>Longstring</var> and a regular <var>String</var> is the default behavior of <var>Longstring</var> truncation: <b>any truncation on assignment from a Longstring, Longstring $function, or Longstring With operation causes request cancellation</b>. Two examples of the application of this rule follow:
is the default behavior of Longstring truncation:
any truncation on assignment from a Longstring, Longstring $function,
or longstring With operation causes request cancellation.
Two examples of the application of this rule follow:
<ul>
<ul>
<li>An assignment to a String variable from a Longstring
<li>An assignment to a <var>String</var> variable from a <var>Longstring</var> results in request cancellation if the value of the <var>Longstring</var> exceeds the declared <var>String</var> length. This cancellation can happen even if the <var>Longstring</var> is less than 255 bytes long. If, say, variable <code>%short</code> were defined as <code>String Len 55</code>, and a <var>Longstring</var> variable called <code>%long</code> contained 60 bytes of data, an assignment like the following results in request cancellation:
results in request cancellation if the value of the Longstring
<p class="code">%short = %long
exceeds the declared String length.
</p>
This cancellation can happen even if the Longstring
Yet, you can successfully use an intermediate assignment to a <code>String Len 255</code> variable (called <code>%medium</code> in the following example) followed by the assignment of that variable to <code>%short</code>:
is less than 255 bytes long:
<p class="code">%medium = %long
If, say, variable %short were defined as String Len 55,
%short  = %medium
and a Longstring variable called %long contained 60 bytes of data,
</p>
an assignment like the following results in request cancellation:
As a result, the last five bytes of the value originally held in <code>%long</code> are silently truncated and assigned to <code>%short</code>.
<pre style="xmp">
    %short = %long
</pre>
Yet, you can successfully use an intermediate
assignment to a String Len 255 variable (called %medium in the following
example) followed by the assignment of that variable to %short:
<pre style="xmp">
    %medium = %long
    %short  = %medium
</pre>
As a result,
the last five bytes of the value originally held in %long are
silently truncated and assigned to %short.
   
   
Of course, since regular Strings can never be longer than 255 bytes,
Of course, since a regular <var>String</var> can never be longer than 255 bytes, any assignment from a <var>Longstring</var> longer than 255 bytes to a regular <var>String</var> will result in request cancellation. There are several ways around this problem, but the simplest is to use the <var>[[$Str|$Str]]</var> function to silently truncate a <var>Longstring</var> at 255 bytes or whatever is required for assignment to its target. Effectively, the <var>$str</var> function tells <var class="product">Model 204</var> to treat the <var>Longstring</var> as it would a regular <var>String</var> for truncation purposes, and the assignment succeeds:
any assignment from a Longstring longer than 255 bytes to a regular
<p class="code">%short = $str(%long)
String will result in request cancellation.
</p>
There are several ways around this &ldquo;problem&rdquo;, but the simplest
<li>Although the <var>Longstring</var> datatype is not supported inside images, you can assign from a <var>Longstring</var> to an image item. However, assigning to an image item a <var>Longstring</var> variable that has a value that
is to use the $str function
ends with one or more of the target image item's <var>Pad</var> character (which defaults to the space character) where the target image item is not <var>NoStrip</var> results in an implicit truncation &mdash; the trailing pad characters are effectively removed. Since implicit truncation of a <var>Longstring</var> value on assignment is not allowed, this results in request cancellation.
to silently truncate a Longstring at 255 bytes
or whatever is required for assignment to its target.
Effectively, the $str function tells ''Model 204'' to treat the Longstring as it would
a regular String for truncation purposes, and the assignment succeeds:
<pre style="xmp">
    %short = $str(%long)
</pre>
<li>Although the Longstring datatype is not supported inside images,
you can assign from a Longstring to an image item.
However, assigning to an image item a Longstring variable that has a value that
ends with one or more of the target image item's Pad character (which defaults
to the space character) where the target image item is not NoStrip results
in an implicit truncation &mdash; the trailing pad characters are effectively
removed.
Since implicit truncation of a Longstring value on assignment is not
allowed, this results in request cancellation.
   
   
For example, the following request, which prints the result
For example, the following request, which prints the result <code>They're different</code>, shows the image item truncation for an assignment from a <var>String</var>:
<tt>They're different</tt>, shows the image item truncation
<p class="code">begin
for an assignment from a String:
  %str    is string len 8
<pre style="xmp">
    b
    %str    is string len 8
   
   
    image foo
  image foo
      x  is string len 8
      x  is string len 8
    end image
  end image
   
   
    prepare image foo
  prepare image foo
    %str = 'Blank '
  %str = 'Blank '
    %foo:x = %str
  %foo:x = %str
   
   
    if %foo:x ne %str then
  if %foo:x ne %str then
      print 'They''re different'
      print 'They<nowiki>''</nowiki>re different'
    end if
  end if
    end
end
</pre>
</p>
   
   
If <tt>%str</tt> is declared as a Longstring above, however, the request
If <code>%str</code> is declared as a <var>Longstring</var> above, however, the request is cancelled by a <var>Longstring</var> truncation error. But if <code>%str</code> is declared as a <var>Longstring</var>, and if <code>%foo:x = %str</code> is replaced by <code>%foo:x = $str(%str)</code>, the request succeeds.
is cancelled by a Longstring truncation error.
But if <tt>%str</tt> is declared as a Longstring, and
if <tt>%foo:x = %str</tt> is replaced by <tt>%foo:x = $str(%str)</tt>,
the request succeeds.
</ul>
</ul>
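
The assignment rules in the list above can be collected into one short request. This is only a sketch: the variable names and the 12-byte test value are illustrative assumptions, and the direct assignment is left commented out because, by the rule above, it would cancel the request. If the rules hold as described, both <code>Print</code> statements should show the first five bytes of <code>%long</code>.
<p class="code">begin
  %long   is longstring
  %medium is string len 255
  %short  is string len 5

  %long = 'Hello, world'

  <nowiki>*</nowiki> Silent truncation by way of an intermediate String:
  %medium = %long
  %short  = %medium
  print %short

  <nowiki>*</nowiki> Explicit truncation with $str:
  %short = $str(%long)
  print %short

  <nowiki>*</nowiki> Direct assignment from the Longstring would cancel the request:
  <nowiki>*</nowiki> %short = %long
end
</p>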
   
   
Using $str to correct for this Longstring truncation behavior is not
Using <var>$str</var> to correct for this <var>Longstring</var> truncation behavior is not always appropriate, though. The use of <var>$str</var> might be viewed as a continuation of the dubious <var class="product">Model 204</var> programming practice of truncation by assignment, so it might be avoided or at least used as a last resort as a matter of policy. In fact, converting many <var>String</var> variables to <var>Longstring</var> might be viewed as a way of detecting possible unintentional truncation in existing applications, although there are some subtle issues one should be aware of before embarking on such an enterprise.
always appropriate, though.
The use of $str might be viewed as a continuation of the
For additional discussion of these truncation issues, see [[Longstrings#Changing_Longstring_truncation_behavior|Changing Longstring truncation behavior]].
dubious ''Model 204'' programming practice of truncation by assignment, so it
might be avoided or at least used as a last resort as a matter of policy.
In fact, converting many String variables to Longstring might be viewed as
a way of detecting possible unintentional truncation in existing applications,
although there are some subtle issues one should be aware of before
embarking on such an enterprise.
   
   
For additional discussion of these truncation issues, see [[#Changing Longstring truncation behavior|Changing Longstring truncation behavior]].
==Longstrings in expressions==
==Longstrings in expressions==
Like Strings, a Longstring variable can be used in User Language expressions,
Like <var>Strings</var>, a <var>Longstring</var> variable can be used in <var class="product">SOUL</var> expressions, as operands or as input to $functions. <var>Longstring</var> variables can also be used as input to intrinsic methods (as can any other string or numeric datatype).
as operands or as input to $functions.
 
In ''Sirius Mods'' version 7.2 and later, Longstring variables can also be used as
One important point to keep in mind is that <var class="product">Model 204</var>'s expression processing behavior is not changed at all unless <var>Longstring</var> variables or $functions are used, and then only changed in the statements where they are actually used. So the effect of any use of <var>Longstring</var> variables or $functions is limited to the statements that use them.
input to intrinsic methods (as can any other string or numeric datatype).
 
   
===Concatenation: the With operator===
User Language expressions can have embedded sub-expressions or simply expressions.
 
For example, in
<var class="product">SOUL</var> expressions can have embedded sub-expressions or simply expressions. For example, in
<pre style="xmp">
<p class="code">%x = %a with %b with %c
    %x = %a with %b with %c
</p>
</pre>
the expression <code>%a with %b</code> is evaluated and an intermediate result is produced. This intermediate result is then used as the first operand in a <var>With</var> operation <code>with %c</code>. With no <var>Longstrings</var> involved, string expressions are silently truncated at 255 bytes, including when producing an intermediate result. So, in the above example, if <code>%a</code> and <code>%b</code> were each 200 bytes long, the intermediate result of <code>%a with %b</code> would be truncated at the 55th byte of <code>%b</code>, and the <code>with %c</code> would simply drop <code>%c</code>, since the intermediate result that was the first operand of the <code>with %c</code> would already be 255 bytes long. In this case, <code>%x</code> would end up containing all of <code>%a</code>, the first 55 bytes of <code>%b</code>, and none of <code>%c</code>. Fortunately, the results would be the same even if the expression were written as follows:
the expression <tt>%a with %b</tt> is evaluated and an intermediate result is produced.
<p class="code">%x = %a with (%b with %c)
This intermediate result is then used as the first operand in a With operation
</p>
with %c.
It is still worth working this out mentally to develop a good feel for how intermediate expression results are processed in <var class="product">SOUL</var>.
With no Longstrings involved, string expressions are silently truncated at
255 bytes, including when producing an intermediate result.
So, in the above example, if %a and %b were each 200 bytes long, the intermediate
result of <tt>%a with %b</tt> would be truncated at the 55th byte of %b and the
<tt>with %c</tt> would simply drop %c since the intermediate result that was the
first operand of the <tt>with %c</tt> would already be 255 bytes long.
In this case, %x would end up containing all of %a, the first 55 bytes of %b
and none of %c.
Fortunately, the results would be the same even if the expression were written
<pre style="xmp">
    %x = %a with (%b with %c)
</pre>
though it's worth working this out mentally to develop a good feel for how
intermediate expression results are processed in User Language.
In any case, the WITH operation behaves differently in the presence of
Longstrings.
Specifically, if either operand of a With operation is a Longstring,
the intermediate result of the operation is also a Longstring.
If, in the above example, %a was a Longstring and %b and %c were regular
Strings, the result of <tt>%a with %b</tt> would be a 400-byte Longstring.
When this 400-byte intermediate result Longstring is then concatenated
using the With operation on %c, the result will be a Longstring of length
400 plus the length of %c.
If the target of this expression, %x, was a regular String, this would cause
a request-cancelling truncation error.
In addition, if the target of a With operation is a Longstring, the With
operation produces a Longstring result, even if none of the operands are
themselves Longstrings.
For example, if
%x is a Longstring, and %a and %b are String Len 255, each with 200 bytes of
data:
<pre style="xmp">
    %x = %a with %b
</pre>
%x will be 400 bytes long, containing all of %a concatenated
with all of %b.
If either of the operands of such a With clause is itself an expression, that
expression is treated as if its target were also a Longstring.
For example, if in
<pre style="xmp">
    %x = %a with (%b with %c)
</pre>
%x is a Longstring, and %a, %b, and %c are String Len 255, each with 200 bytes of
data, %x will end up being 600 bytes long, containing all of %a concatenated
with all of %b with all of %c.
This works the same way if the assignment is written as either of the following:
<pre style="xmp">
    %x = (%a with %b) with %c
    %x = %a with %b with %c
</pre>
Expression processing is the same for string literals, so if %x is a Longstring,
and %a is a String Len 255 with 255 bytes of data,
the following assigns 258 bytes to %x:
<pre style="xmp">
    %x = %a with '...'
</pre>
   
   
Another way of looking at this is that in the presence of Longstring variables,
In any case, the <var>With</var> operation behaves differently in the presence of <var>Longstrings</var>. Specifically, if either operand of a <var>With</var> operation is a <var>Longstring</var>, the intermediate result of the operation is also a <var>Longstring</var>. If, in the above example, <code>%a</code> is a <var>Longstring</var> and <code>%b</code> and <code>%c</code> are regular <var>Strings</var>, the result of <code>%a with %b</code> is a 400-byte <var>Longstring</var>. When this 400-byte intermediate result <var>Longstring</var> is then concatenated using the <var>With</var> operation on <code>%c</code>, the result is a <var>Longstring</var> of length 400 plus the length of <code>%c</code>. If the target of this expression, <code>%x</code>, is a regular <var>String</var>, this causes a request-cancelling truncation error.
whether as the target or as one of the operands, all concatenation operations are
&ldquo;upgraded&rdquo; to be Longstring concatenations.
One side-effect of this is that if an operand of a concatenation is a Longstring,
Longstring truncation rules apply to the ultimate target of the assignment.
For example, %long is a Longstring containing <tt>'Testing...'</tt>, and
%short is a String Len 12:
<pre style="xmp">
    %short = (%long with '123') with '456'
</pre>
The result is a request-cancelling truncation error, because the result
of all the concatenation operations is treated as a Longstring, albeit
one with less than 255 bytes of data.
The cancellation can be avoided with the use of the $str function,
as in the following:
<pre style="xmp">
    %short = $str((%long with '123') with '456')
</pre>
Though, again, this is simply carrying on the dubious User Language programming
practice of truncation by assignment.
   
   
Note that the &ldquo;upgrading&rdquo; of With operations to longstring With operations
In addition, if the target of a <var>With</var> operation is a <var>Longstring</var>, the <var>With</var> operation produces a <var>Longstring</var> result, even if none of the operands are themselves <var>Longstrings</var>. For example, if <code>%x</code> is a <var>Longstring</var>, and <code>%a</code> and <code>%b</code> are <code>String Len 255</code>, each with 200 bytes of data:
is not induced by a Longstring variable or expression inside a $function call.
<p class="code">%x = %a with %b
For example, %long is a Longstring with 30 bytes of data, and %short is
</p>
String Len 10:
<code>%x</code> will be 400 bytes long, containing all of <code>%a</code> concatenated with all of <code>%b</code>. If either of the operands of such a <var>With</var> clause is itself an expression, that expression is treated as if its target were also a <var>Longstring</var>. For example, if in
<pre style="xmp">
<p class="code">%x = %a with (%b with %c)
    %short = '*' with $substr(%long, 1, 20)
</p>
</pre>
<code>%x</code> is a <var>Longstring</var>, and <code>%a</code>, <code>%b</code>, and <code>%c</code> are <code>String Len 255</code>, each with 200 bytes of data, <code>%x</code> will end up being 600 bytes long, containing all of <code>%a</code> concatenated with all of <code>%b</code> with all of <code>%c</code>. This works the same way if the assignment is written as either of the following:
%short ends up containing an asterisk followed by the first 9 bytes of
<p class="code">%x = (%a with %b) with %c
%long.
%x = %a with %b with %c
The assignment is made with silent truncation,
</p>
because the result of a non-longstring-capable $function is always
treated as a regular String for the purposes of assignment and With processing.
   
   
In a context where a Longstring is automatically converted to a numeric
Expression processing is the same for string literals, so if <code>%x</code> is a <var>Longstring</var>, and <code>%a</code> is a <code>String Len 255</code> with 255 bytes of data, the following assigns 258 bytes to <code>%x</code>:
datatype, a request-cancelling truncation error occurs if the Longstring
<p class="code">%x = %a with '...'
variable is longer than 255 bytes, even if most or all of these bytes are leading
</p>
zeros.
For example, %long is a Longstring with 300 zeros followed by a one:
<pre style="xmp">
    %a = %long + 1
</pre>
The result is a request-cancelling truncation error.
Fortunately, one is unlikely to encounter numbers with more
than 255 digits in them.
Longstring data used in a numeric context will undergo the dubious automatic
conversion of invalid numeric data into a zero in the same way as String data.
'''Note:'''
One case of automatic conversion to numeric where
String and Longstring behaviors differ
is index loop control variables.
For example, as of ''Sirius Mods'' version 6.7, the following loop is valid
if <tt>%s</tt> is a String,
but it results in a compilation error if <tt>%s</tt> is a Longstring:
<pre style="xmp">
    for %s from 1 to 2
      print %s
    end for
</pre>
   
   
If the result of a numeric operation on a Longstring is then used in a With
Another way of looking at this is that in the presence of <var>Longstring</var> variables, whether as the target or as one of the operands, all concatenation operations are "upgraded" to be <var>Longstring</var> concatenations.  One side-effect of this is that if an operand of a concatenation is a <var>Longstring</var>, <var>Longstring</var> truncation rules apply to the ultimate target of the assignment. For example, <code>%long</code> is a <var>Longstring</var> containing <code>'Testing...'</code>, and <code>%short</code> is a <code>String Len 12</code>:
operation, the With operation is not upgraded to a longstring With operation,
<p class="code">%short = (%long with '123') with '456'
because the intermediate result of the numeric operation is not a Longstring
</p>
but a numeric, which is then automatically converted to a String
The result is a request-cancelling truncation error, because the result of all the concatenation operations is treated as a <var>Longstring</var>, albeit one with less than 255 bytes of data. The cancellation can be avoided with the use of the <var>$Str</var> function, as in the following:
intermediate result.
<p class="code">%short = $str((%long with '123') with '456')
For example, %long is a Longstring containing <tt>99</tt> and %short
</p>
is String Len 2:
Though, again, this is simply carrying on the dubious <var class="product">SOUL</var> programming practice of truncation by assignment.
<pre style="xmp">
    %short = %long + 1
</pre>
The result is not a request cancellation; instead, a
<tt>M204.0552: VARIABLE TOO SMALL FOR RESULT</tt> message is issued,
and an asterisk ( * ) is assigned to %short.
Similarly, with these definitions and values
<pre style="xmp">
    %short = (%long + 1) with '*'
</pre>
results in a <tt>10</tt> being assigned to %short with no warnings, exactly
the behavior if %long were a String Len 255.
   
   
Comparison operations such as Eq, Lt, Le, >, <, etc. will perform longstring
Note that the "upgrading" of <var>With</var> operations to <var>Longstring</var> <var>With</var> operations is not induced by a <var>Longstring</var> variable or expression inside a $function call.  For example, <code>%long</code> is a <var>Longstring</var> with 30 bytes of data, and <code>%short</code> is <code>String Len 10</code>:
comparisons if either of the operands is a Longstring, that is, comparison
<p class="code">%short = '*' with $substr(%long, 1, 20)
operations involving Longstring operands behave pretty much &ldquo;as expected.&rdquo;
</p>
<code>%short</code> ends up containing an asterisk followed by the first 9 bytes of <code>%long</code>. The assignment is made with silent truncation, because the result of a non-longstring-capable $function is always treated as a regular <var>String</var> for the purposes of assignment and <var>With</var> processing.
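
The effect of the assignment target on <var>With</var> processing can be sketched with a small request. The variable names, the loop used to build 200-byte test values, and the use of <var>$Lstr_Len</var> to report the <var>Longstring</var> length are assumptions made for illustration. If the rules above hold, the first length printed should be 255 (silent truncation into a <var>String</var> target) and the second should be 400 (the concatenation is upgraded because the target is a <var>Longstring</var>).
<p class="code">begin
  %i is float
  %a is string len 255
  %b is string len 255
  %s is string len 255
  %x is longstring

  <nowiki>*</nowiki> Build two 200-byte String values, 10 bytes at a time:
  for %i from 1 to 20
    %a = %a with '0123456789'
    %b = %b with 'abcdefghij'
  end for

  <nowiki>*</nowiki> String target: the result is silently truncated at 255 bytes.
  %s = %a with %b
  print $len(%s)

  <nowiki>*</nowiki> Longstring target: the With operation is a Longstring operation.
  %x = %a with %b
  print $lstr_len(%x)
end
</p>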
 
===Numeric conversion===
In a context where a <var>Longstring</var> is automatically converted to a numeric datatype, a request-cancelling truncation error occurs if the <var>Longstring</var> variable is longer than 255 bytes, even if most or all of these bytes are leading zeros.  For example, <code>%long</code> is a <var>Longstring</var> with 300 zeros followed by a one:
<p class="code">%a = %long + 1
</p>
The result is a request-cancelling truncation error. Fortunately, one is unlikely to encounter numbers with more than 255 digits in them. <var>Longstring</var> data used in a numeric context will undergo the dubious automatic conversion of invalid numeric data into a zero in the same way as <var>String</var> data.
   
   
One important point to keep in mind is that ''Model 204'''s expression processing
If the result of a numeric operation on a <var>Longstring</var> is then used in a <var>With</var> operation, the <var>With</var> operation is not upgraded to a <var>Longstring</var> <var>With</var> operation, because the intermediate result of the numeric operation is not a <var>Longstring</var> but a numeric, which is then automatically converted to a <var>String</var> intermediate result.  For example, <code>%long</code> is a <var>Longstring</var> containing <code>99</code>, and <code>%short</code> is <code>String Len 2</code>:
behavior is not changed at all unless Longstring variables or $functions are
<p class="code">%short = %long + 1
used, and then only changed in the statements where they are actually used.
</p>
So there is no backward compatibility problem in installing ''Sirius Mods'' version 6.2
The result is not a request cancellation; instead, a <code>M204.0552: VARIABLE TOO SMALL FOR RESULT</code> message is issued, and an asterisk ( * ) is assigned to <code>%short</code>. Similarly, with these definitions and values:
or later, and the effect of any use of Longstring variables or $functions will
<p class="code">%short = (%long + 1) with '*'
be limited to the statements that use them.
</p>
The result is a <code>10</code> being assigned to <code>%short</code> with no warnings, exactly the behavior if <code>%long</code> were a <code>String Len 255</code>.
 
<div id="longstrInvIndex"></div>
====Longstrings not allowed as index %variable in For statement====
<!--Caution: <div> above-->
 
One case of automatic conversion to numeric where <var>String</var> and <var>Longstring</var> behaviors differ is index loop control variables.  For example, the following loop is valid if <code>%s</code> is a <var>String</var>, but it results in a compilation error if <code>%s</code> is a <var>Longstring</var>:
<p class="code">for %s from 1 to 2
  print %s
end for
</p>
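
One possible way around this restriction, offered as a sketch rather than a recommended pattern, is to drive the loop with a numeric index and assign its value to the <var>Longstring</var> inside the loop. The names <code>%i</code> and <code>%s</code> are illustrative:
<p class="code">begin
  %i is float
  %s is longstring

  for %i from 1 to 2
    <nowiki>*</nowiki> The index is numeric; the Longstring receives its value.
    %s = %i
    print %s
  end for
end
</p>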
 
===Comparisons===
Comparison operations such as <var>Eq</var>, <var>Lt</var>, <var>Le</var>, <var>></var>, <var><</var>, etc. will perform <var>Longstring</var> comparisons if either of the operands is a <var>Longstring</var>, that is, comparison operations involving <var>Longstring</var> operands behave pretty much as expected.
 
==Longstrings and $functions==
==Longstrings and $functions==
Longstrings can be used as inputs to $functions.
<var>Longstrings</var> can be used as inputs to $functions. As mentioned before, if a <var>Longstring</var> expression is assigned to a regular <var>String</var>, a request-cancelling truncation error will occur if the target <var>String</var> variable is not big enough to hold the source <var>Longstring</var>. Request-cancelling truncation errors also occur if a <var>Longstring</var> that is longer than 255 bytes is passed to a non-<var>Longstring</var>-capable $function. For example:
As mentioned before, if a Longstring expression is assigned
<p class="code">print $substr(%long, 1, 50)
to a regular String, a request-cancelling truncation error will occur if the
</p>
target String variable is not big enough to hold the source Longstring.
would result in request cancellation if <code>%long</code> were longer than 255 bytes. One way around this would be to use the <var>$str</var> function to tell <var class="product">SOUL</var> to treat the <var>Longstring</var> as a <var>String</var> in this case, as in:
Request-cancelling truncation errors also occur if a Longstring that is longer
<p class="code">print $substr($str(%long), 1, 50)
than 255 bytes is passed to a non-Longstring-capable $function.
</p>
For example,
though a better approach in this case would be to use the <var>Longstring</var>-capable sub-stringing function, <var>[[$Lstr_Substr]]</var>, as in:
<pre style="xmp">
<p class="code">print $lstr_substr(%long, 1, 50)
    print $substr(%long, 1, 50)
</p>
</pre>
would result in request cancellation if %long was longer than 255 bytes.
One way around this would be to use the $str function to tell User Language
to treat the Longstring as a String in this case as in
<pre style="xmp">
    print $substr($str(%long), 1, 50)
</pre>
though a better approach in this case would be to use the Longstring-capable
sub-stringing function, $lstr_substr, as in:
<pre style="xmp">
    print $lstr_substr(%long, 1, 50)
</pre>
   
   
The Longstring-capable $functions in this manual typically start with
The <var>Longstring</var>-capable $functions in this manual typically start with "$lstr", end in "_lstr" (such as <var>[[$ListInf_Lstr]]</var>), or belong to a family of $functions (such as the $Regex family) that are completely <var>Longstring</var>-capable. <var>Longstring</var>-capable $functions specific to other Sirius products (like the [[Janus Web Server]] and [[Janus Sockets]] $functions) typically do not use an "lstr" prefix or suffix, but they are identified in their documentation as <var>Longstring</var>-capable.
&ldquo;$lstr&rdquo;, end in &ldquo;_lstr&rdquo; (such as $listinf_lstr), or belong
to a family of $functions (such as
the $regex family) that are completely longstring-capable.
Longstring-capable $functions specific to other Sirius products (like the [[Janus Web Server]]
and [[Janus Sockets]] $functions) typically do not use an &ldquo;lstr&rdquo; prefix or suffix,
but they are identified in their documentation as longstring-capable.
   
   
In addition to their ability to process more than 255-byte long strings,
In addition to their ability to process more than 255-byte long strings, <var>Longstring</var>-capable $functions have some special characteristics pertaining to expression handling:
longstring-capable $functions have some special characteristics pertaining to
expression handling:
<ul>
<li>A <var>Longstring</var>-capable $function that returns a string result (as opposed to one that returns a numeric result such as <var>[[$Lstr_Index]]</var>) is treated as a <var>Longstring</var> expression for the purposes of truncation and for the upgrading of <var>With</var> operations to <var>Longstring</var> <var>With</var> operations. For example, if <code>%short</code> is <code>String Len 5</code> and <code>%junk</code> contains <code>Some text</code>:
<p class="code">%short = $lstr_substr(%junk, 1, 7)
</p>
would result in a request-cancelling truncation error. This is true whether <code>%junk</code> was a <var>Longstring</var> or a regular <var>String</var>, though the latter illustrates the point that regular <var>String</var> variables (or expressions) can be used as input to <var>Longstring</var>-capable $functions. If <code>%junk</code> contained 300 bytes of data:
<p class="code">%out = $lstr_substr(%junk, 1, 255) with '*'
</p>
would result in a request-cancelling truncation error if <code>%out</code> were a regular <var>String</var> variable, and would result in 256 bytes, the last byte being an asterisk, being assigned to <code>%out</code> if <code>%out</code> were a <var>Longstring</var>.
<li>All string arguments to <var>Longstring</var>-capable $functions are treated as <var>Longstring</var> targets for the purpose of upgrading <var>With</var> operations to <var>Longstring</var> <var>With</var> operations. For example, since <var>[[$Lstr_Right]]</var> is <var>Longstring</var>-capable, <var>With</var> in its string argument is upgraded to <var>Longstring</var>. So, if <code>%medium</code> is a string containing 252 or more characters, then:
<p class="code">$lstr_right(%medium with '****', 256)
</p>
returns the right-most 252 bytes of <code>%medium</code>, concatenated with four asterisks.

<p class="note"><b>Note:</b> This behavior does not imply that <var>Longstring</var>-capable $functions will always accept strings longer than 255 bytes as their arguments. For example, <var>$Lstr_Index</var> will not accept strings longer than 255 bytes as its second argument (the string being searched for), and <var>$Lstr_Right</var> and <var>[[$Lstr_Left]]</var> won't accept any strings longer than a single byte for their third argument (the pad character). This $function-specific behavior does not affect the treatment of the $function results or arguments as <var>Longstring</var> data for expression handling purposes. </p>
</ul>
==Longstrings and complex subroutines==
Complex subroutine parameters, both <var>Input</var> and <var>Output</var> (or <var>InOut</var>, which means the same thing as <var>Output</var>) can be defined as <var>Longstring</var>, as in either of the following:
<p class="code">subroutine chop(%x is longstring input)

subroutine chop(%x is longstring output)
</p>
   
In addition, <var>Longstring</var> variables and expressions can be passed as parameters to complex subroutines. For Output parameters, <var>Longstring</var> issues are fairly straightforward. There are two restrictions:
<ul>
<li>You '''cannot''' pass a <var>Longstring</var> as a parameter to a subroutine that defines the parameter as <var>String Output</var>.
<li>You '''cannot''' pass a regular <var>String</var> as a parameter to a subroutine that defines the parameter as <var>Longstring Output</var>.
</ul>
   
For Input parameters, things are somewhat more complex, because:
<ul>
<li>Mismatches in <var>String</var> and <var>Longstring</var> datatypes are allowed between passed value and declared parameter.
<li>Input parameters can actually receive the results of expressions as their inputs.
</ul>
While for Input parameters, <var>Strings</var> and <var>Longstrings</var> may be passed interchangeably as <var>Longstring</var> and <var>String</var> parameters, subroutine declaration statements (<var>Declare Subroutine</var>) must exactly match the parameter types on the actual subroutine definitions. That is, given a declaration like this:
<p class="code">declare subroutine tender(longstring)
</p>
one cannot later specify the subroutine as:
<p class="code">subroutine tender(%mercy is string len 255)
</p>
If a <var>Longstring</var> parameter is passed to a subroutine with the parameter defined as <var>String Input</var>, the request is cancelled if the <var>Longstring</var> value is longer than the length of the <var>String Input</var> parameter (as always, this will happen even if the <var>Longstring</var> value is shorter than 255 bytes).  This mimics the behavior of an assignment of a <var>Longstring</var> variable to a regular <var>String</var> variable.
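As a minimal sketch of this rule (the names <code>show5</code>, <code>%s</code>, and <code>%long</code> are hypothetical, and the fragment has not been run against a live Online), the following call would be expected to cancel the request, because the 10-byte value does not fit the <code>String Len 5</code> parameter:
<p class="code">begin
subroutine show5(%s is string len 5 input)
   print %s
end subroutine
%long is longstring
%long = 'abcdefghij'
* %long holds 10 bytes; %s is String Len 5, so this call cancels the request
call show5(%long)
end
</p>
Passing <code>$lstr_left(%long, 5)</code> instead would make the truncation explicit, so the value would fit the parameter.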
   
   
If a <var>Longstring</var> array is passed to a subroutine with the parameter defined as a <var>String Array</var>, the request is cancelled if '''any''' element of the <var>Longstring</var> array is longer than 255 bytes, whether or not that element is ever referenced in the complex subroutine.  Outside the functionality issues raised by this limitation, it also suggests an inefficiency in passing a <var>Longstring</var> array to a <var>String</var> parameter: the inefficiency of scanning the array for values longer than 255 bytes. Because of both the functionality and efficiency issues, it is probably best to avoid passing a <var>Longstring</var> array to a <var>String</var> array parameter if at all possible.
   
Because a <var>String</var> variable or a literal can always fit into a <var>Longstring</var> parameter, there are no truncation or other issues associated with passing <var>String</var> variables and literals as parameters defined as <var>Longstring</var>.
   
If a call to a complex subroutine contains a <var>With</var> operation for a <var>Longstring</var> parameter, that <var>With</var> operation is &ldquo;upgraded&rdquo; to a <var>Longstring</var> <var>With</var> operation, whether or not any of the operands are themselves <var>Longstrings</var>, exactly as if the target of a <var>With</var> operation were a <var>Longstring</var> variable.  As everywhere else, a <var>With</var> operation involving a <var>Longstring</var> in a subroutine call will also be upgraded to a <var>Longstring</var> <var>With</var> operation, meaning that no truncation will occur at 255 bytes, and that if the result is longer than the length of the target <var>String</var> parameter, the request will be cancelled.
   
==Changing Longstring truncation behavior==
While it is sometimes convenient that <var class="product">Model 204</var> silently truncates string data on assignment to a variable or intermediate result, it has also been the source of a vast number of incorrect <var class="product">[[User Language]]</var> programs. Because of this history and the higher chance of unintentional truncation from a <var>Longstring</var> source, the default behavior for <var>Longstrings</var> is that any truncation on assignment from a <var>Longstring</var>, <var>Longstring</var> $function, or <var>Longstring</var> <var>With</var> operation causes request cancellation. This behavior should facilitate "cleaner" and more robust code &mdash; where truncation is intended, it is explicitly indicated (for example, with <var>[[$Lstr_Substr]]</var>, <var>[[$Lstr_Left]]</var>, or <var>[[$Str]]</var>).
Nevertheless, since this cancellation-on-truncation behavior is inconsistent with <var class="product">Model 204</var>'s behavior for strings, it might be viewed as undesirable. If you want to prevent request cancellation on truncation of a <var>Longstring</var> source in an Online, you can <var>MSGCTL</var> the error message for <var>Longstring</var> truncation to <var>NOCAN</var>.
   
The three messages that you might need to MSGCTL are MSIR.0680, MSIR.0681, and MSIR.0682.
<ul><li>MSIR.0680 is issued if the <var>[[SIRFACT_parameter|SIRFACT]]</var> system parameter X'01' bit is set, or if the <var class="product">Model 204</var> <var>DEBUGUL</var> user parameter is set to a non-zero value.
<li>MSIR.0681 is issued for requests entered at command level rather than run from a procedure.
<li>MSIR.0682 is issued otherwise.</ul>
   
Issuing <var>MSGCTL</var> for these messages to <var>NOCAN</var> might prevent request cancellation from the occasional <var>Longstring</var> truncation, but if silent truncation of <var>Longstrings</var> is heavily used as a programming &ldquo;technique&rdquo; inside a request, the user running the request will quickly be restarted with a &ldquo;TOO MANY ERRORS&rdquo; message. To prevent this, <var>MSGCTL</var> the indicated messages to <var>NOCOUNT</var>.
   
Even then, a large number of these messages might be viewed as being annoying, at best, if the intent is to simply ignore silent truncation of <var>Longstrings</var>.  In that case, <var>MSGCTL</var> the indicated messages to <var>NOTERM</var> and maybe even <var>NOAUDIT</var> (if this latter is available).  Even then, there will be a little <var class="product">Model 204</var> processing overhead in producing the messages that are everywhere suppressed, so it would still generally be more efficient to truncate <var>Longstrings</var> explicitly using <var>[[$Str]]</var>, <var>[[$Lstr_Substr]]</var> or <var>[[$Lstr_Left]]</var>.
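For illustration, the settings discussed above might be issued as commands like the following. This is only a sketch: it assumes the usual <var>MSGCTL</var> command form of a message identifier followed by one or more options, so check the <var>MSGCTL</var> command documentation for the options available at your site:
<p class="code">MSGCTL MSIR.0680 NOCAN NOCOUNT
MSGCTL MSIR.0681 NOCAN NOCOUNT
MSGCTL MSIR.0682 NOCAN NOCOUNT
</p>
<var>NOTERM</var> (and <var>NOAUDIT</var>, where available) could be appended in the same way if the messages are also to be suppressed.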
   
If you use the default <var>Longstring</var> behavior, at least in the development and test environments, you should find it will rapidly catch potential problems and so produce more bug-free code.  The request cancellation due to <var>Longstring</var> truncation should therefore be a benefit. In those places that &ldquo;truncation by assignment&rdquo; is used in the code, if you change any of the types in the source expression and discover request cancellation, you will probably decide it is better to use an explicit truncation construct, rather than to retain this dubious coding practice.
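The explicit truncation constructs named earlier in this section can be sketched as follows. The variable names are hypothetical and the fragment is illustrative rather than tested; each assignment keeps at most the first five bytes of <code>%long</code>:
<p class="code">%long is longstring
%short is string len 5
%long = 'Some text'
* Each of these states the truncation explicitly
%short = $lstr_left(%long, 5)
%short = $lstr_substr(%long, 1, 5)
* $str treats the value as a regular String, so normal String truncation applies
%short = $str(%long)
</p>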
   
If there is concern about request cancellation in a production region, you can <var>MSGCTL</var> the indicated messages to <var>NOCAN</var> in production. However, such a switch allows a production request to continue after an unanticipated <var>Longstring</var> truncation, so it could result in data corruption or a more subtle error later in the request that will cause request cancellation anyway, but be more difficult to diagnose.
   
==Longstrings and the Print, Html, and Text statements==
Using <var>Longstring</var> expressions in the <var>Print</var> statement works largely "as expected": within the constraints of <var>LOBUFF</var>, <var>OUTCCC</var>, <var>OUTMRL</var>, and other output-target-specific parameters, the values of <var>Longstrings</var> are simply displayed to the output target. One minor exception to this is that the <var>To</var> clause on the <var>Print</var> statement is not supported for <var>Longstrings</var>.
   
It should also be kept in mind that the <var>With</var> keyword in <var>Print</var> statements is not the <var>With</var> concatenation operator, although the result is usually the same as if it were. Specifically, the <var>With</var> keyword results in the part before the <var>With</var> being printed, followed by the part after. This means that if two regular <var>String</var> variables, each with 255 bytes of data in them, are printed as follows:
<p class="code">print %a with %b
</p>
510 bytes of data would be printed, which is different from the <var>With</var> operator in an assignment like the following, which will result in <code>%c</code> simply containing the contents of <code>%a</code>, because the <var>With</var> operation results in truncation at 255 bytes:
<p class="code">%c = %a with %b
</p>
This difference between the <var>With</var> keyword in the <var>Print</var> statement and the <var>With</var> operator in expressions predates <var>Longstrings</var> and is, in fact, more significant with regular <var>Strings</var> than with <var>Longstrings</var>.
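By contrast, when the assignment target is a <var>Longstring</var>, the <var>With</var> operation is upgraded to a <var>Longstring</var> <var>With</var> operation, so no truncation at 255 bytes occurs. A sketch, assuming <code>%a</code> and <code>%b</code> each hold 255 bytes and <code>%d</code> is a hypothetical variable:
<p class="code">%d is longstring
* The Longstring target upgrades With, so %d receives all 510 bytes
%d = %a with %b
</p>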
   
The <var>[[Text_and_Html_statements#The_HTML_or_TEXT_statement|HTML]]</var> and <var>[[Text_and_Html_statements#The_HTML_or_TEXT_statement|Text]]</var> statements allow variable values or expression results to be embedded inside the expression start and end characters (defaults: { and }). As with the <var>Print</var> statement, this works pretty much &ldquo;as expected&rdquo; for <var>Longstrings</var>: the contents of the <var>Longstring</var> variable or the result of a <var>Longstring</var> expression will be displayed in their entirety within display parameter constraints. The only <var>Longstring</var> related issue for <var>[[Text_and_Html_statements#The_HTML_or_TEXT_statement|HTML]]</var> statement expressions is that if an expression is not a <var>Longstring</var> variable, a <var>Longstring</var> $function, or a <var>With</var> operation involving one or more of these, the expression is assumed to be a regular <var>String</var> expression that undergoes silent truncation at 255 bytes. For example, if <code>%a</code> and <code>%b</code> were regular <var>String</var> variables both containing 200 bytes of data, the following would truncate the concatenation of <code>%a</code> and <code>%b</code> at 255 bytes:
<p class="code">text data The result is {%a with %b}
</p>
   
To get around this, one can force the <var>With</var> operation to be upgraded to a <var>Longstring</var> <var>With</var> operation, using:
<p class="code">text data The result is {$lstr(%a with %b)}
</p>
However, the use of <var>With</var> operations in <var>Html</var> statements is generally silly, since the same result can be obtained by simply entering each operand in the With expression as a separate expression as in:
<p class="code">text data The result is {%a}{%b}
</p>

==Longstrings and methods==
In addition to their use as local variables and as inputs to or outputs from $functions and complex subroutines, <var>Longstrings</var> can, of course, also be used in the object-oriented constructs made available by <var class="product">SOUL</var>. These uses include:
<ul>
<li>As structure or class members.
<li>As input parameters to both user-defined and system methods.
<li>As the result (output value) of <var class="product">SOUL</var> and system methods.
</ul>
In fact, '''all''' system methods are <var>Longstring</var>-capable, so they behave, for the purposes of truncation and upgrading of <var>With</var> operations, the same as <var>Longstring</var>-capable $functions.  Therefore any <var>With</var> operation whose result is an input to a system method causes the <var>With</var> operation to be upgraded to a <var>Longstring</var> <var>With</var> operation. Similarly, any implicit truncation of the result of a system method results in request cancellation.
   
User-defined methods, on the other hand, can declare their inputs and output as <var>Longstrings</var> or <var>Strings</var> of a specific length. <var>Longstring</var> inputs and results exhibit the same truncation and <var>With</var> operation behavior as string inputs to system methods. For example, consider the following function declaration in some class:
<p class="code">function encode(%in is longstring) is longstring
</p>
If the method is invoked as follows:
<p class="code">%x = %foo:encode(%a with %b)
</p>
the <code>%a with %b</code> is upgraded to a <var>Longstring</var> <var>With</var> operation, because its target (the <code>%in</code> parameter in the <code>Encode</code> function) is a <var>Longstring</var>. Similarly, if <code>%x</code> is a standard <var>String</var> variable with some specific length, the request will be cancelled if the result of the <code>Encode</code> method is longer than <code>%x</code>'s declared length.
   
<var>String</var> inputs and output, on the other hand, will behave like standard <var>String</var> variables for the purposes of truncation and <var>With</var> operation behavior. For example, consider the following function declaration in some class:
<p class="code">function stubby(%in is string len 4) is string len 2
</p>
If the method is invoked as follows:
<p class="code">%x = %foo:stubby(%a with %b)
</p>
the <code>%a with %b</code> is not upgraded to a <var>Longstring</var> <var>With</var> operation because its target (the <code>%in</code> parameter in the <code>Stubby</code> function) is not a <var>Longstring</var>. Of course, if either <code>%a</code> or <code>%b</code> is a <var>Longstring</var>, then the <var>With</var> operation will be a <var>Longstring</var> <var>With</var> operation, anyway. If neither <code>%a</code> nor <code>%b</code> is a <var>Longstring</var> and <code>%a</code> contains <code>foo</code> and <code>%b</code> contains <code>bar</code>, the result of the <var>With</var> operation would be <code>foobar</code>, which would be silently truncated to <code>foob</code> when assigned to the input parameter <code>%in</code>.
Similarly, if <code>%x</code> is a <code>String Len 1</code>, and the <code>Stubby</code> method returns <code>ok</code>, the <code>ok</code> would be silently truncated to <code>o</code> when assigned to <code>%x</code>. In fact, if the <code>Stubby</code> method had the following statement:
<p class="code">return 'Not OK'
</p>
the return value would be silently truncated to <code>No</code> before being assigned to the target variable, even if the target variable for the <code>Stubby</code> invocation was longer than two bytes. On the other hand, if the <code>Stubby</code> method had the following statement:
<p class="code">return %schooner
</p>
and <code>%schooner</code> was a <var>Longstring</var> with a value longer than two bytes, the request would be cancelled because of <var>Longstring</var> truncation, even if the target variable for the <code>Stubby</code> invocation was, itself, a <var>Longstring</var>.
   
   
Finally, support for [[Intrinsic classes|intrinsic]] methods was introduced. As for all other system method inputs, intrinsic <var>String</var> system methods all behave as if their method string was a <var>Longstring</var>. For example, in:
<p class="code">%x = (%a with %b):right(40, pad='*')
</p>
the <code>%a with %b</code> would be upgraded to a <var>Longstring</var> <var>With</var>, even if neither <code>%a</code> nor <code>%b</code> were a <var>Longstring</var>.
   
   
For intrinsic and other methods, the fact that all string inputs are treated as <var>Longstrings</var> does not mean that the method will necessarily accept arbitrarily long values. In fact, it's quite possible for a parameter to be restricted to being a single character. For example, the intrinsic <var>String</var> <var>[[Right_(String_function)|Right]]</var> method has a named parameter called <var>Pad</var> that cannot be longer than one byte:
<p class="code">%y = %x:right(50, pad=%pad)
</p>
In this example, if <code>%pad</code> had a value longer than a single byte, the request would be cancelled. This is so even though the parameter behaves like a <var>Longstring</var> parameter.
==Longstring performance==
The first 255 bytes of <var>Longstrings</var> are always kept in STBL, so the code path for manipulating a <var>Longstring</var> variable with a value that is shorter than 256 bytes is usually identical to or only slightly greater than the code path for manipulating a regular <var>String</var> variable. A <var>Longstring</var> variable always has 257 bytes of STBL allocated for it at compile time, and it requires somewhat more VTBL space than a regular <var>String</var> variable. <var>Longstring</var> arrays require 257 bytes of STBL per element and some VTBL space per element. This is unlike regular <var>String</var> variables, which require no per-element VTBL space.
   
   
Yet because of the minor code path issues and the table space issues just mentioned, it is probably not a good idea to use <var>Longstring</var> variables in contexts where the values are never expected to exceed 255 bytes, unless performance is not a major concern, or unless the extra error detection for <var>Longstring</var> truncation is desired.
   
   
Of course, variables that need to hold more than 255 bytes of data must be declared as <var>Longstrings</var>, and any data beyond 255 bytes gets stored in CCATEMP. This means manipulation of very long <var>Longstring</var> variables could result in significant logical and even physical CCATEMP I/O and higher CCATEMP utilization. In addition, very long <var>Longstring</var> values mean large quantities of data need to be scanned or copied, which in itself could be a source of CPU overhead. This is not to say that long values should not be used in <var>Longstrings</var> in applications; quite the contrary. <var>Longstrings</var> are designed for applications that require long values, and the performance of <var>Longstring</var> manipulation, even for very long values, will generally be pretty good.
   
   
Nevertheless, it is a good idea to avoid unnecessary operations on very long <var>Longstrings</var>: unnecessary either because the application does not require them, or because the operation has already been performed once. Regarding the latter, if a very long <var>Longstring</var> operation occurs in a loop, it would be better to move the operation outside the loop if possible, or to do it only conditionally, when it's really required and hasn't already been performed in a previous iteration of the loop.
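As a rough illustration of the loop advice above, consider the following sketch. The variable names are hypothetical, and <code>%big</code> is assumed to already hold a very long value. In this form, the same long concatenation is repeated on every iteration:
<p class="code">%big is longstring
%wrapped is longstring
%i is float
for %i from 1 to 100
   %wrapped = '[' with %big with ']'
   print $lstr_substr(%wrapped, %i, 1)
end for
</p>
Because <code>%big</code> does not change inside the loop, the concatenation can instead be done once, before the loop starts (same declarations as above):
<p class="code">%wrapped = '[' with %big with ']'
for %i from 1 to 100
   print $lstr_substr(%wrapped, %i, 1)
end for
</p>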
   
   
There is relatively little space overhead for the part of a <var>Longstring</var> that resides in CCATEMP: 6124 of the 6144 bytes on each CCATEMP page actually hold data. So the first 255 bytes of a 60,000 byte long <var>Longstring</var> value are stored in STBL, and the remaining <code>60,000-255</code> bytes are stored on <code>(60,000-255)/6124</code> (about 9.76, rounded up to 10) CCATEMP pages. Intermediate results will also use some CCATEMP space, though this usage will typically be short-lived, with the space being released as soon as the statement completes. So, for example, if <code>%a</code> and <code>%b</code> are <var>Longstring</var> variables each with 90,000 bytes of data, and <code>%c</code> is a <var>Longstring</var> variable, the following statement will temporarily require an extra 120,000 bytes of space (255 of them in STBL) to hold the result of the <code>%a with %b</code> operation:
<p class="code">%c = $lstr_substr(%a with %b, 60000, 60000)
</p>
   
   
Because concatenation of one string to another is such a common operation, assignment of the concatenation of a <var>Longstring</var> variable and another string to the first <var>Longstring</var> variable is highly optimized. For example, if <code>%long</code> is a <var>Longstring</var> with 50,000 bytes of data:
<p class="code">%long = %long with '!'
</p>
would simply tack an exclamation mark on the end of <code>%long</code> rather than copying all of <code>%long</code> and an exclamation mark and then assigning that string to <code>%long</code>. Note, however, that this optimization is only performed if a single string is being concatenated with the current value of the target variable. That is, in the following:
<p class="code">%long = %long with '+' with $time
</p>
an intermediate <var>Longstring</var> containing the concatenation of <code>%long</code> and a plus sign will be created. That intermediate <var>Longstring</var> will then be concatenated with the current time (as returned by <code>$time</code>) and then assigned to <code>%long</code>. This means that the current contents of <code>%long</code> end up being copied twice in such a case. If <code>%long</code> contains 50,000 bytes, that is 100,000 bytes worth of data movement, which is quite expensive by any standard. Fortunately, it is easy to &ldquo;help out&rdquo; the compiler to make this operation more efficient:
<p class="code">%long = %long with ('+' with $time)
</p>
In this case, the concatenation of the plus sign and the current time is assigned to an intermediate work <var>Longstring</var>. Then, because this intermediate value is simply being concatenated with <code>%long</code> and then assigned back to <code>%long</code>, the concatenation optimization results in the intermediate work <var>Longstring</var> simply being tacked on to the end of <code>%long</code>, requiring almost no data movement at all. Even in cases where the concatenation can't be optimized to an append operation, it is usually a good idea to isolate concatenations involving relatively small values from a preceding one involving a (potentially) very long one.
   
   
For example, if a <var>Longstring</var> with a potentially large value is being bracketed by the date and time, using a greater-than and less-than symbol as separators, the following:
<p class="code">%long = $date with '>' with %long with ('<' with $time)
</p>
will be more efficient than
<p class="code">%long = $date with '>' with %long with '<' with $time
</p>
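The append optimization described in this section matters most when a <var>Longstring</var> is built up piece by piece. The following is a minimal sketch with hypothetical variable names, not a measured benchmark. It groups the short pieces in parentheses so that each iteration concatenates a single string to the current value of the target:
<p class="code">%log is longstring
%i is float
for %i from 1 to 1000
   %log = %log with ('Step ' with %i with ';')
end for
</p>
Written as <code>%log = %log with 'Step ' with %i with ';'</code>, the same loop would, by the rules above, build intermediate copies of the growing <code>%log</code> value on every pass.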
 
[[Category:SOUL]]
[[Category:Overviews]]
[[Category:User Language syntax enhancements]]

Latest revision as of 20:42, 27 March 2015

As of Model 204 version 7.5, Longstrings appear as a native Model 204 datatype and are defined in the same way as other variable datatypes:

%name is longstring

Longstring variables are largely interchangeable with String variables, with the exception that a Longstring can have a length up to 2**31-1 bytes, while String variables have a maximum length of 255 bytes. The Variables Are statement and the VTYPE parameter do not allow Longstring to be set as a default type, so all Longstring variables must be explicitly declared as such. Longstring variables can be defined as Common and as subroutine parameters, but there is currently no support for Static Longstring variables. Longstrings may be specified in an Initial clause.

Like other %variables, a Longstring cannot be declared as Global on its declaration. However, a Longstring %variable can be dynamically bound to a global Longstring with the $Lstr_global function, and it can be dynamically bound to a session global Longstring with the $Lstr_session function.

The value of a global or session Longstring can also be retrieved with $Lstr_global_get or $Lstr_session_get, and it can be updated with $Lstr_global_set or $Lstr_session_set.

Longstrings can also be declared as arrays:

%heaps is longstring array(10)

The Longstring datatype is not supported inside images. However, image items with length greater than 255 are now supported:

image foo
   bar is string len 300
end image

While such image items can't have arbitrary lengths up to 2**31-1 like other Longstring variables, they exhibit the same behavior as other Longstring variables in request cancellation in the case of truncation, and in upgrading With operations to Longstring With operations.

While it might be tempting to redefine many or all String Len 255 variables as Longstring, there are a few subtle issues discussed in this chapter that might result in problems should this be done. This is not to say that many such variables shouldn't be converted to Longstring, but it might not be as simple as a one-line editing change.

Truncation

One key difference between a Longstring and a regular String is the default behavior of Longstring truncation: any truncation on assignment from a Longstring, Longstring $function, or Longstring With operation causes request cancellation. Two examples of the application of this rule follow:

  • An assignment to a String variable from a Longstring results in request cancellation if the value of the Longstring exceeds the declared String length. This cancellation can happen even if the Longstring is less than 255 bytes long. If, say, variable %short were defined as String Len 55, and a Longstring variable called %long contained 60 bytes of data, an assignment like the following results in request cancellation:

    %short = %long

    Yet, you can successfully use an intermediate assignment to a String Len 255 variable (called %medium in the following example) followed by the assignment of that variable to %short:

    %medium = %long
    %short = %medium

    As a result, the last five bytes of the value originally held in %long are silently dropped, and the first 55 bytes are assigned to %short.

    Of course, since a regular String can never be longer than 255 bytes, any assignment from a Longstring longer than 255 bytes to a regular String will result in request cancellation. There are several ways around this problem, but the simplest is to use the $Str function to silently truncate a Longstring at 255 bytes or whatever is required for assignment to its target. Effectively, the $str function tells Model 204 to treat the Longstring as it would a regular String for truncation purposes, and the assignment succeeds:

    %short = $str(%long)

  • Although the Longstring datatype is not supported inside images, you can assign from a Longstring to an image item. However, assigning to an image item a Longstring variable that has a value that ends with one or more of the target image item's Pad character (which defaults to the space character) where the target image item is not NoStrip results in an implicit truncation — the trailing pad characters are effectively removed. Since implicit truncation of a Longstring value on assignment is not allowed, this results in request cancellation. For example, the following request, which prints the result They're different, shows the image item truncation for an assignment from a String:

    begin
    %str is string len 8
    image foo
       x is string len 8
    end image
    prepare image foo
    %str = 'Blank '
    %foo:x = %str
    if %foo:x ne %str then
       print 'They''re different'
    end if
    end

    If %str is declared as a Longstring above, however, the request is cancelled by a Longstring truncation error. But if %str is declared as a Longstring, and if %foo:x = %str is replaced by %foo:x = $str(%str), the request succeeds.

Using $str to correct for this Longstring truncation behavior is not always appropriate, though. The use of $str might be viewed as a continuation of the dubious Model 204 programming practice of truncation by assignment, so it might be avoided or at least used as a last resort as a matter of policy. In fact, converting many String variables to Longstring might be viewed as a way of detecting possible unintentional truncation in existing applications, although there are some subtle issues one should be aware of before embarking on such an enterprise.
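
One alternative to $str, sketched here with an arbitrary sample value, is to make the truncation explicit with $Lstr_Substr. Since the extracted value is never longer than the declared length of the target, no truncation should occur on the assignment itself:

%long is longstring
%short is string len 55
* Sample value only; any value, of any length, could be in %long
%long = 'A value that might be longer than 55 bytes'
* Keep at most the first 55 bytes, stated explicitly
%short = $lstr_substr(%long, 1, 55)

This keeps the intent visible in the code, at the cost of repeating the target length in the $function call.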

For additional discussion of these truncation issues, see Changing Longstring truncation behavior.

Longstrings in expressions

Like Strings, a Longstring variable can be used in SOUL expressions, as operands or as input to $functions. Longstring variables can also be used as input to intrinsic methods (as can any other string or numeric datatype).

One important point to keep in mind is that Model 204's expression processing behavior is not changed at all unless Longstring variables or $functions are used, and then only changed in the statements where they are actually used. So the effect of any use of Longstring variables or $functions is limited to the statements that use them.

Concatenation: the With operator

SOUL expressions can have embedded sub-expressions or simply expressions. For example, in

%x = %a with %b with %c

the expression %a with %b is evaluated and an intermediate result is produced. This intermediate result is then used as the first operand in a With operation with %c. With no Longstrings involved, string expressions are silently truncated at 255 bytes, including when producing an intermediate result. So, in the above example, if %a and %b were each 200 bytes long, the intermediate result of %a with %b would be truncated at the 55th byte of %b, and the with %c would simply drop %c, since the intermediate result that was the first operand of the with %c would already be 255 bytes long. In this case, %x would end up containing all of %a, the first 55 bytes of %b, and none of %c. Fortunately, the results would be the same even if the expression were written as follows:

%x = %a with (%b with %c)

It is still worth working this out mentally to develop a good feel for how intermediate expression results are processed in SOUL.

In any case, the With operation behaves differently in the presence of Longstrings. Specifically, if either operand of a With operation is a Longstring, the intermediate result of the operation is also a Longstring. If, in the above example, %a is a Longstring and %b and %c are regular Strings, the result of %a with %b is a 400-byte Longstring. When this 400-byte intermediate result Longstring is then concatenated using the With operation on %c, the result is a Longstring of length 400 plus the length of %c. If the target of this expression, %x, is a regular String, this causes a request-cancelling truncation error.

In addition, if the target of a With operation is a Longstring, the With operation produces a Longstring result, even if none of the operands are themselves Longstrings. For example, if %x is a Longstring, and %a and %b are String Len 255, each with 200 bytes of data:

%x = %a with %b

%x will be 400 bytes long, containing all of %a concatenated with all of %b. If either of the operands of such a With clause is itself an expression, that expression is treated as if its target were also a Longstring. For example, if in

%x = %a with (%b with %c)

%x is a Longstring, and %a, %b, and %c are String Len 255, each with 200 bytes of data, %x will end up being 600 bytes long, containing all of %a concatenated with all of %b with all of %c. This works the same way if the assignment is written as either of the following:

%x = (%a with %b) with %c
%x = %a with %b with %c

Expression processing is the same for string literals, so if %x is a Longstring, and %a is a String Len 255 with 255 bytes of data, the following assigns 258 bytes to %x:

%x = %a with '...'

Another way of looking at this is that in the presence of Longstring variables, whether as the target or as one of the operands, all concatenation operations are "upgraded" to be Longstring concatenations. One side-effect of this is that if an operand of a concatenation is a Longstring, Longstring truncation rules apply to the ultimate target of the assignment. For example, %long is a Longstring containing 'Testing...', and %short is a String Len 12:

%short = (%long with '123') with '456'

The result is a request-cancelling truncation error, because the result of all the concatenation operations is treated as a Longstring, albeit one with less than 255 bytes of data. The cancellation can be avoided with the use of the $Str function, as in the following:

%short = $str((%long with '123') with '456')

Though, again, this is simply carrying on the dubious SOUL programming practice of truncation by assignment.

Note that the "upgrading" of With operations to Longstring With operations is not induced by a Longstring variable or expression inside a $function call. For example, %long is a Longstring with 30 bytes of data, and %short is String Len 10:

%short = '*' with $substr(%long, 1, 20)

%short ends up containing an asterisk followed by the first 9 bytes of %long. The assignment is made with silent truncation, because the result of a non-longstring-capable $function is always treated as a regular String for the purposes of assignment and With processing.

Numeric conversion

In a context where a Longstring is automatically converted to a numeric datatype, a request-cancelling truncation error occurs if the Longstring variable is longer than 255 bytes, even if most or all of these bytes are leading zeros. For example, %long is a Longstring with 300 zeros followed by a one:

%a = %long + 1

The result is a request-cancelling truncation error. Fortunately, one is unlikely to encounter numbers with more than 255 digits in them. Longstring data used in a numeric context will undergo the dubious automatic conversion of invalid numeric data into a zero in the same way as String data.

If the result of a numeric operation on a Longstring is then used in a With operation, the With operation is not upgraded to a Longstring With operation, because the intermediate result of the numeric operation is not a Longstring but a numeric, which is then automatically converted to a String intermediate result. For example, %long is a Longstring containing 99, and %short is String Len 2:

%short = %long + 1

The result is not a request cancellation; instead, a M204.0552: VARIABLE TOO SMALL FOR RESULT message is issued, and an asterisk ( * ) is assigned to %short. Similarly, with these definitions and values:

%short = (%long + 1) with '*'

The result is a 10 being assigned to %short with no warnings, exactly the behavior if %long were a String Len 255.

Longstrings not allowed as index %variable in For statement

One case of automatic conversion to numeric where String and Longstring behaviors differ is index loop control variables. For example, the following loop is valid if %s is a String, but it results in a compilation error if %s is a Longstring:

for %s from 1 to 2
   print %s
end for
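
A simple way around this restriction, sketched here under the assumption that the loop index is only needed as a number, is to loop on a numeric %variable (the hypothetical %i below) and assign it to the Longstring inside the loop:

%s is longstring
%i is float
for %i from 1 to 2
   %s = %i
   print %s
end for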

Comparisons

Comparison operations such as Eq, Lt, Le, >, <, etc. perform Longstring comparisons if either of the operands is a Longstring. In other words, comparison operations involving Longstring operands behave as one would expect: the full values are compared.

Longstrings and $functions

Longstrings can be used as inputs to $functions. As mentioned before, if a Longstring expression is assigned to a regular String, a request-cancelling truncation error will occur if the target String variable is not big enough to hold the source Longstring. Request-cancelling truncation errors also occur if a Longstring that is longer than 255 bytes is passed to a non-Longstring-capable $function. For example:

print $substr(%long, 1, 50)

would result in request cancellation if %long was longer than 255 bytes. One way around this would be to use the $str function to tell SOUL to treat the Longstring as a String in this case as in:

print $substr($str(%long), 1, 50)

though a better approach in this case would be to use the Longstring-capable sub-stringing function, $Lstr_Substr, as in:

print $lstr_substr(%long, 1, 50)

The Longstring-capable $functions in this manual typically start with "$lstr", end in "_lstr" (such as $ListInf_Lstr), or belong to a family of $functions (such as the $Regex family) that are completely Longstring-capable. Longstring-capable $functions specific to other Sirius products (like the Janus Web Server and Janus Sockets $functions) typically do not use an "lstr" prefix or suffix, but they are identified in their documentation as Longstring-capable.

In addition to their ability to process more than 255-byte long strings, Longstring-capable $functions have some special characteristics pertaining to expression handling:

  • A Longstring-capable $function that returns a string result (as opposed to one that returns a numeric result such as $Lstr_Index) is treated as a Longstring expression for the purposes of truncation and for the upgrading of With operations to Longstring With operations. For example, if %short is String Len 5 and %junk contains Some text:

    %short = $lstr_substr(%junk, 1, 7)

    would result in a request-cancelling truncation error. This is true whether %junk was a Longstring or a regular String, though the latter illustrates the point that regular String variables (or expressions) can be used as input to Longstring-capable $functions. If %junk contained 300 bytes of data:

    %out = $lstr_substr(%junk, 1, 255) with '*'

    would result in a request-cancelling truncation error if %out were a regular String variable, and would result in 256 bytes, the last byte being an asterisk, being assigned to %out if %out were a Longstring.

  • All string arguments to Longstring-capable $functions are treated as Longstring targets for the purpose of upgrading With operations to Longstring With operations. For example, since $Lstr_Right is Longstring-capable, With in its string argument is upgraded to Longstring. So, if %medium is a string containing 252 or more characters, then:

    $lstr_right(%medium with '****', 256)

    returns the right-most 252 bytes of %medium, concatenated with four asterisks.

    Note: This behavior does not imply that Longstring-capable $functions will always accept strings longer than 255 bytes as their arguments. For example, $Lstr_Index will not accept strings longer than 255 bytes as its second argument (the string being searched for), and $Lstr_Right and $Lstr_Left won't accept any strings longer than a single byte for their third argument (the pad character). This $function-specific behavior does not affect the treatment of the $function results or arguments as Longstring data for expression handling purposes.

Longstrings and complex subroutines

Complex subroutine parameters, both Input and Output (or InOut, which means the same thing as Output) can be defined as Longstring, as in either of the following:

subroutine chop(%x is longstring input)
subroutine chop(%x is longstring output)

In addition, Longstring variables and expressions can be passed as parameters to complex subroutines. For Output parameters, Longstring issues are fairly straightforward. There are two restrictions:

  • You cannot pass a Longstring as a parameter to a subroutine that defines the parameter as String Output.
  • You cannot pass a regular String as a parameter to a subroutine that defines the parameter as Longstring Output.

For Input parameters, things are somewhat more complex, because:

  • Mismatches in String and Longstring datatypes are allowed between passed value and declared parameter.
  • Input parameters can actually receive the results of expressions as their inputs.

While for Input parameters, Strings and Longstrings may be passed interchangeably as Longstring and String parameters, subroutine declaration statements (Declare Subroutine) must exactly match the parameter types on the actual subroutine definitions. That is, given a declaration like this:

declare subroutine tender(longstring)

One cannot later specify the subroutine as

subroutine tender(%mercy is string len 255)

If a Longstring parameter is passed to a subroutine with the parameter defined as String Input, the request is cancelled if the Longstring value is longer than the length of the String Input parameter (as always, this will happen even if the Longstring value is shorter than 255 bytes). This mimics the behavior of an assignment of a Longstring variable to a regular String variable.
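As a minimal sketch of the String Input case just described (the subroutine name Show and the variable names are hypothetical, chosen only for illustration, and the request has not been run against any particular release):

```soul
begin
* Hypothetical names throughout; a sketch, not tested code.
%short is string len 20
%long  is longstring

subroutine show(%s is string len 20 input)
   print %s
end subroutine

%short = 'fits'
%long  = 'This value is longer than twenty bytes'

* The value fits in the 20-byte String Input parameter
call show(%short)

* Expected to cancel the request: the Longstring value is longer
* than 20 bytes, even though it is far shorter than 255 bytes
call show(%long)
end
```

If truncation to 20 bytes is what is wanted, passing an explicitly truncated value, for example $lstr_left(%long, 20), states that intent and should avoid the cancellation.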

If a Longstring array is passed to a subroutine with the parameter defined as a String Array, the request is cancelled if any element of the Longstring array is longer than 255 bytes, whether or not that element is ever referenced in the complex subroutine. Outside the functionality issues raised by this limitation, it also suggests an inefficiency in passing a Longstring array to a String parameter: the inefficiency of scanning the array for values longer than 255 bytes. Because of both the functionality and efficiency issues, it is probably best to avoid passing a Longstring array to a String array parameter if at all possible.

Because a String variable or a literal can always fit into a Longstring parameter, there are no truncation or other issues associated with passing String variables and literals as parameters defined as Longstring.

If a call to a complex subroutine contains a With operation for a Longstring parameter, that With operation is “upgraded” to a Longstring With operation, whether or not any of the operands are themselves Longstrings, exactly as if the target of a With operation were a Longstring variable. As everywhere else, a With operation involving a Longstring in a subroutine call will also be upgraded to a Longstring With operation, meaning that no truncation will occur at 255 bytes, and that if the result is longer than the length of the target String parameter, the request will be cancelled.

Changing Longstring truncation behavior

While it is sometimes convenient that Model 204 silently truncates string data on assignment to a variable or intermediate result, it has also been the source of a vast number of incorrect User Language programs. Because of this history and the higher chance of unintentional truncation from a Longstring source, the default behavior for Longstrings is that any truncation on assignment from a Longstring, Longstring $function, or Longstring With operation causes request cancellation. This behavior should facilitate "cleaner" and more robust code — where truncation is intended, it is explicitly indicated (for example, with $Lstr_Substr, $Lstr_Left, or $Str).

Nevertheless, since this cancellation-on-truncation behavior is inconsistent with Model 204's behavior for regular Strings, it might be viewed as undesirable. If you want to prevent request cancellation on truncation of a Longstring source in an Online, you can MSGCTL the error message for Longstring truncation to NOCAN.

The three messages that you might need to MSGCTL are MSIR.0680, MSIR.0681, and MSIR.0682.

  • MSIR.0680 is issued if the SIRFACT system parameter X'01' bit is set, or if the Model 204 DEBUGUL user parameter is set to a non-zero value.
  • MSIR.0681 is issued for requests entered at command level rather than run from a procedure.
  • MSIR.0682 is issued otherwise.

Issuing MSGCTL for these messages to NOCAN might prevent request cancellation from the occasional Longstring truncation, but if silent truncation of Longstrings is heavily used as a programming “technique” inside a request, the user running the request will quickly be restarted with a “TOO MANY ERRORS” message. To prevent this, MSGCTL the indicated messages to NOCOUNT.

Even then, a large number of these messages might be viewed as being annoying, at best, if the intent is to simply ignore silent truncation of Longstrings. In that case, MSGCTL the indicated messages to NOTERM and maybe even NOAUDIT (if this latter is available). Even then, there will be a little Model 204 processing overhead in producing the messages that are everywhere suppressed, so it would still generally be more efficient to truncate Longstrings explicitly using $Str, $Lstr_Substr or $Lstr_Left.
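As a sketch of what the settings discussed above might look like when issued as commands, assuming that the MSGCTL command accepts the MSIR prefix and several options on one line in your release (verify both against the MSGCTL command documentation before relying on this):

```text
* Sketch only: allow requests to continue after Longstring truncation,
* and do not count the messages toward the error limit
MSGCTL MSIR.0680 NOCAN NOCOUNT
MSGCTL MSIR.0681 NOCAN NOCOUNT
MSGCTL MSIR.0682 NOCAN NOCOUNT
```

NOTERM, and NOAUDIT where available, could be appended in the same way if the messages should also be suppressed from the terminal and the audit trail.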

If you use the default Longstring behavior, at least in the development and test environments, you should find it will rapidly catch potential problems and so produce more bug-free code. The request cancellation due to Longstring truncation should therefore be a benefit. In those places that “truncation by assignment” is used in the code, if you change any of the types in the source expression and discover request cancellation, you will probably decide it is better to use an explicit truncation construct, rather than to retain this dubious coding practice.

If there is concern about request cancellation in a production region, you can MSGCTL the indicated messages to NOCAN in production. However, such a switch allows a production request to continue after an unanticipated Longstring truncation, so it could result in data corruption or a more subtle error later in the request that will cause request cancellation anyway, but be more difficult to diagnose.

Longstrings and the Print, Html, and Text statements

Using Longstring expressions in the Print statement works largely "as expected": within the constraints of LOBUFF, OUTCCC, OUTMRL, and other output-target-specific parameters, the values of Longstrings are simply displayed to the output target. One minor exception to this is that the To clause on the Print statement is not supported for Longstrings.

It should also be kept in mind that the With keyword in Print statements is not the With concatenation operator, although the result is usually the same as if it were. Specifically, the With keyword results in the part before the With being printed, followed by the part after. This means that if two regular String variables, each with 255 bytes of data in them, are printed as follows:

print %a with %b

510 bytes of data would be printed, which is different from the With operator in an assignment like the following, which will result in %c simply containing the contents of %a, because the With operation results in truncation at 255 bytes:

%c = %a with %b

This difference between the With keyword in the Print statement and the With operator in expressions predates Longstrings and is, in fact, more significant with regular Strings than with Longstrings.

The Html and Text statements allow variable values or expression results to be embedded inside the expression start and end characters (defaults: { and }). As with the Print statement, this works pretty much “as expected” for Longstrings: the contents of the Longstring variable or the result of a Longstring expression will be displayed in their entirety within display parameter constraints. The only Longstring-related issue for Html and Text statement expressions is that if an expression is not a Longstring variable, a Longstring $function, or a With operation involving one or more of these, the expression is assumed to be a regular String expression that undergoes silent truncation at 255 bytes. For example, if %a and %b were regular String variables both containing 200 bytes of data, the following would truncate the concatenation of %a and %b at 255 bytes:

text data The result is {%a with %b}

To get around this, one can force the With operation to be upgraded to a Longstring With operation, using:

text data The result is {$lstr(%a with %b)}

However, the use of With operations in Html statements is generally silly, since the same result can be obtained by simply entering each operand in the With expression as a separate expression as in:

text data The result is {%a}{%b}

Longstrings and methods

In addition to their use as local variables and as inputs to or outputs from $functions and complex subroutines, Longstrings can, of course, also be used in the object-oriented constructs made available by SOUL. These uses include:

  • As structure or class members.
  • As input parameters to both user-defined and system methods.
  • As the result (output value) of SOUL and system methods.

In fact, all system methods are Longstring-capable, so they behave, for the purposes of truncation and upgrading of With operations, the same as Longstring-capable $functions. Therefore, any With operation whose result is an input to a system method is upgraded to a Longstring With operation. Similarly, any implicit truncation of the result of a system method results in request cancellation.

User-defined methods, on the other hand, can declare their inputs and output as Longstrings or Strings of a specific length. Longstring inputs and results exhibit the same truncation and With operation behavior as string inputs to system methods. For example, consider the following function declaration in some class:

function encode(%in is longstring) is longstring

If the method is invoked as follows:

%x = %foo:encode(%a with %b)

the %a with %b is upgraded to a Longstring With operation, because its target (the %in parameter in the Encode function) is a Longstring. Similarly, if %x is a standard String variable with some specific length, the request will be cancelled if the result of the Encode method is longer than %x's declared length.

String inputs and output, on the other hand, will behave like standard String variables for the purposes of truncation and With operation behavior. For example, consider the following function declaration in some class:

function stubby(%in is string len 4) is string len 2

If the method is invoked as follows:

%x = %foo:stubby(%a with %b)

the %a with %b is not upgraded to a Longstring With operation because its target (the %in parameter in the Stubby function) is not a Longstring. Of course, if either %a or %b is a Longstring, then the With operation will be a Longstring With operation, anyway. If neither %a nor %b is a Longstring and %a contains foo and %b contains bar, the result of the With operation would be foobar which would be silently truncated to foob when assigned to the input parameter %in.

Similarly, if %x is a String Len 1, and the Stubby method returns ok, the ok would be silently truncated to o when assigned to %x. In fact, if the Stubby method had the following statement:

return 'Not OK'

the return value would be silently truncated to No before being assigned to the target variable, even if the target variable for the Stubby invocation was longer than two bytes. On the other hand, if the Stubby method had the following statement:

return %schooner

and %schooner was a Longstring with a value longer than two bytes, the request would be cancelled because of Longstring truncation, even if the target variable for the Stubby invocation was, itself, a Longstring.

Finally, support for intrinsic methods was introduced. As for all other system method inputs, intrinsic String system methods all behave as if their method string was a Longstring. For example, in:

%x = (%a with %b):right(40, pad='*')

the %a with %b would be upgraded to a Longstring With, even if neither %a nor %b were a Longstring.

For intrinsic and other methods, the fact that all string inputs are treated as Longstrings does not mean that the method will necessarily accept arbitrarily long values. In fact, it's quite possible for a parameter to be restricted to being a single character. For example the intrinsic String Right method has a named parameter called Pad that cannot be longer than one byte:

%y = %x:right(50, pad=%pad)

In this example, if %pad had a value longer than a single byte, the request would be cancelled, in spite of the fact that the parameter behaves like a Longstring parameter.

Longstring performance

The first 255 bytes of Longstrings are always kept in STBL, so the code path for manipulating a Longstring variable with a value that is shorter than 256 bytes is usually identical to or only slightly greater than the code path for manipulating a regular String variable. A Longstring variable always has 257 bytes of STBL allocated for it at compile time, and it requires somewhat more VTBL space than a regular String variable. Longstring arrays require 257 bytes of STBL per element and some VTBL space per element. This is unlike regular String arrays, which require no per-element VTBL space.

Because of the minor code path issues and the table space issues just mentioned, it is probably not a good idea to use Longstring variables in contexts where the values are never expected to exceed 255 bytes, unless performance is not a major concern, or unless the extra error detection for Longstring truncation is desired.

Of course, variables that need to hold more than 255 bytes of data must be declared as Longstrings, and any data beyond 255 bytes gets stored in CCATEMP. This means manipulation of very long Longstring variables could result in significant logical and even physical CCATEMP I/O and higher CCATEMP utilization. In addition, very long Longstring values mean large quantities of data need to be scanned or copied, which in itself could be a source of CPU overhead. This is not to say that long values should not be used in Longstrings in applications; quite the contrary. Longstrings are designed for applications that require long values, and the performance of Longstring manipulation, even for very long values, will generally be pretty good.

Nevertheless, it is a good idea to avoid unnecessary, very long, Longstring operations — unnecessary because the application does not require it, or because the operation has already been performed once. Regarding the latter, if a very long Longstring operation occurs in a loop, it would be better to move the operation outside the loop if possible, or to only do it conditionally if it's really required and hasn't already been performed in a previous iteration of the loop.

There is relatively little space overhead for the part of a Longstring that resides in CCATEMP: 6124 of the 6144 bytes on each CCATEMP page actually hold data. So the first 255 bytes of a 60,000 byte long Longstring value are stored in STBL, and the remaining 60,000-255 bytes are stored on (60,000-255)/6124, rounded up, or 10 CCATEMP pages. Intermediate results will also use some CCATEMP space, though this usage will typically be short-lived, the space being released as soon as the statement completes. So, for example, if %a and %b are Longstring variables each with 60,000 bytes of data, and %c is a Longstring variable, the following statement will temporarily require an extra 120,000 bytes of space (255 of them in STBL) to hold the result of the %a with %b operation:

%c = $lstr_substr(%a with %b, 60000, 60000)
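The page counts involved come from rounding up to whole CCATEMP pages. As a worked check, the first line below is the 60,000-byte value discussed above. The second applies the same arithmetic to the 120,000-byte intermediate result. It assumes, which the text does not state outright, that the CCATEMP part of an intermediate result is also stored at 6124 data bytes per page:

```latex
\left\lceil \frac{60{,}000 - 255}{6124} \right\rceil
  = \left\lceil \frac{59{,}745}{6124} \right\rceil
  = \lceil 9.76 \rceil = 10 \text{ pages}
\\[4pt]
\left\lceil \frac{120{,}000 - 255}{6124} \right\rceil
  = \left\lceil \frac{119{,}745}{6124} \right\rceil
  = \lceil 19.55 \rceil = 20 \text{ pages}
```

Under that assumption, the statement would briefly hold roughly 20 extra CCATEMP pages until it completes.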

Because concatenation of one string to another is such a common operation, assignment of the concatenation of a Longstring variable and another string to the first Longstring variable is highly optimized. For example, if %long is a Longstring with 50,000 bytes of data:

%long = %long with '!'

would simply tack an exclamation mark on the end of %long rather than copying all of %long and an exclamation mark and then assigning that string to %long. Note, however, that this optimization is only performed if a single string is being concatenated with the current value of the target variable. That is, in the following:

%long = %long with '+' with $time

an intermediate Longstring containing the concatenation of %long and a plus sign will be created. That intermediate Longstring will then be concatenated with the current time (as returned by $time) and then assigned to %long. This means that the current contents of %long end up being copied twice in such a case which, if %long contains 50,000 bytes, is 100,000 bytes worth of data movement which will be quite expensive, by any standard. Fortunately, it is easy to “help out” the compiler to make this operation more efficient:

%long = %long with ('+' with $time)

In this case, the concatenation of the plus sign and the current time are assigned to an intermediate work Longstring. Then, because this intermediate value is simply being concatenated with %long and then assigned back to %long, the concatenation optimization results in the intermediate work Longstring simply being tacked on to the end of %long, requiring almost no data movement, at all. Even in cases where the concatenation can't be optimized to an append operation, it is usually a good idea to isolate concatenations involving relatively small values from a preceding one involving a (potentially) very long one.

For example, if a Longstring with a potentially large value is being bracketed by the date and time, using a greater-than and less-than symbol as separators, the following:

%long = $date with '>' with %long with ('<' with $time)

will be more efficient than

%long = $date with '>' with %long with '<' with $time
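The same grouping idea applies when a Longstring is built up piece by piece in a loop. The following is a minimal sketch, with hypothetical variable names (%log, %i) and not run against any particular release. The small operands are concatenated first, inside parentheses, so that each assignment has the "target = target with one string" shape that the append optimization described above applies to:

```soul
begin
* Hypothetical names; a sketch of the append-friendly form.
%log is longstring
%i   is float

for %i from 1 to 100
* Small pieces are joined first; the result is then appended
* to %log rather than %log being copied on every iteration
   %log = %log with ('<' with %i with '>')
end for

print %log
end
```

Without the parentheses, each iteration would first build an intermediate copy of %log with '<' appended, so the amount of data moved would grow with the length of %log.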