Longstrings
Revision as of 15:35, 9 September 2013
The Longstring datatype was introduced in Sirius Mods Version 6.2. Longstrings appear as a native Model 204 datatype and are defined in the same way as other variable datatypes:
%name is longstring
Longstring variables are largely interchangeable with String variables, except that a Longstring can have a length of up to 2**31-1 bytes, while String variables have a maximum length of 255 bytes. The Variables Are statement and the VTYPE parameter do not allow Longstring to be set as a default type, so all Longstring variables must be explicitly declared as such. Longstring variables can be defined as Common and as subroutine parameters, but there is currently no support for Static Longstring variables. Sirius Mods Version 7.2 introduced support for the Initial clause for Longstrings.
Like other %variables, a Longstring cannot be declared as Global on its declaration. However, a Longstring %variable can be dynamically bound to a global Longstring with the $Lstr_global function, and it can be dynamically bound to a session global Longstring with the $Lstr_session function.
The value of a global or session longstring can also be retrieved with $Lstr_global_get or $Lstr_session_get, and it can be updated with $Lstr_global_set or $Lstr_session_set.
Longstrings can also be declared as arrays:
%heaps is longstring array(10)
The Longstring datatype is not supported inside images. However, Sirius Mods Version 7.2 introduced support for image items with length greater than 255:
image foo
   bar is string len 300
end image
Although such image items can't have arbitrary lengths up to 2**31-1 as Longstring variables can, they behave like Longstring variables in two respects: truncation causes request cancellation, and With operations involving them are upgraded to Longstring With operations.
While it might be tempting to redefine many or all String Len 255 variables as Longstring, there are a few subtle issues discussed in this chapter that might result in problems should this be done. This is not to say that many such variables shouldn't be converted to Longstring, but it might not be as simple as a one-line editing change.
Truncation
One key difference between a Longstring and a regular String is the default behavior of Longstring truncation: any truncation on assignment from a Longstring, Longstring $function, or Longstring With operation causes request cancellation. Two examples of the application of this rule follow:
- An assignment to a String variable from a Longstring results in request cancellation if the value of the Longstring exceeds the declared String length. This cancellation can happen even if the Longstring is less than 255 bytes long. If, say, variable %short were defined as String Len 55, and a Longstring variable called %long contained 60 bytes of data, an assignment like the following results in request cancellation:
  %short = %long
  Yet you can successfully use an intermediate assignment to a String Len 255 variable (called %medium in the following example), followed by the assignment of that variable to %short:
  %medium = %long
  %short = %medium
  As a result, the last five bytes of the value originally held in %long are silently dropped, and the first 55 bytes are assigned to %short.
  Of course, since a regular String can never be longer than 255 bytes, any assignment from a Longstring longer than 255 bytes to a regular String results in request cancellation. There are several ways around this problem, but the simplest is to use the $Str function to silently truncate a Longstring at 255 bytes or at whatever length is required for assignment to its target. Effectively, the $Str function tells Model 204 to treat the Longstring as it would a regular String for truncation purposes, and the assignment succeeds:
  %short = $str(%long)
- Although the Longstring datatype is not supported inside images, you can assign from a Longstring to an image item. However, if the Longstring value ends with one or more of the target image item's Pad character (which defaults to the space character), and the target image item is not NoStrip, the assignment results in an implicit truncation: the trailing pad characters are effectively removed. Since implicit truncation of a Longstring value on assignment is not allowed, this results in request cancellation.
  For example, the following request, which prints the result They're different, shows the image item truncation for an assignment from a String:
  begin
  %str is string len 8
  image foo
     x is string len 8
  end image
  prepare image foo
  %str = 'Blank '
  %foo:x = %str
  if %foo:x ne %str then
     print 'They''re different'
  end if
  end
  If %str is declared as a Longstring above, however, the request is cancelled by a Longstring truncation error. But if %str is declared as a Longstring, and if %foo:x = %str is replaced by %foo:x = $str(%str), the request succeeds.
Using $str to correct for this Longstring truncation behavior is not always appropriate, though. The use of $str might be viewed as a continuation of the dubious Model 204 programming practice of truncation by assignment, so it might be avoided, or at least used only as a last resort, as a matter of policy. In fact, converting many String variables to Longstring might be viewed as a way of detecting possible unintentional truncation in existing applications, although there are some subtle issues one should be aware of before embarking on such an enterprise.
For additional discussion of these truncation issues, see Changing Longstring truncation behavior.
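As a minimal sketch of explicit truncation, the following request truncates first with $Str and then with $Lstr_Substr. The variable names and the value are illustrative only. Under the rules described above, both assignments should succeed, whereas a plain %short = %long would cancel the request.

```
begin
* Illustrative names and values only
%long is longstring
%short is string len 10
%long = 'Longstring value'
* A plain %short = %long would cancel the request here,
* because the 16-byte value does not fit in String Len 10.
* $str asks for regular String truncation rules instead:
%short = $str(%long)
print %short
* A Longstring-capable substring states the intended length explicitly:
%short = $lstr_substr(%long, 1, 10)
print %short
end
```

The second form makes the intended length visible at the point of assignment, which is the style the default cancellation behavior is meant to encourage.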
Longstrings in expressions
Like Strings, Longstring variables can be used in User Language expressions, as operands or as input to $functions. In Sirius Mods Version 7.2 and later, Longstring variables can also be used as input to intrinsic methods (as can any other string or numeric datatype).
User Language expressions can contain embedded sub-expressions. For example, in
%x = %a with %b with %c
the expression %a with %b is evaluated and an intermediate result is produced. This intermediate result is then used as the first operand in a With operation with %c. With no Longstrings involved, string expressions are silently truncated at 255 bytes, including when producing an intermediate result. So, in the above example, if %a and %b were each 200 bytes long, the intermediate result of %a with %b would be truncated at the 55th byte of %b, and the with %c would simply drop %c, since the intermediate result that was the first operand of the with %c would already be 255 bytes long. In this case, %x would end up containing all of %a, the first 55 bytes of %b, and none of %c. Fortunately, the results would be the same even if the expression were written
%x = %a with (%b with %c)
though it's worth working this out mentally to develop a good feel for how intermediate expression results are processed in User Language.
In any case, the With operation behaves differently in the presence of Longstrings. Specifically, if either operand of a With operation is a Longstring, the intermediate result of the operation is also a Longstring. If, in the above example, %a was a Longstring and %b and %c were regular Strings, the result of %a with %b would be a 400-byte Longstring. When this 400-byte intermediate result Longstring is then concatenated using the With operation on %c, the result will be a Longstring of length 400 plus the length of %c. If the target of this expression, %x, was a regular String, this would cause a request-cancelling truncation error.
In addition, if the target of a With operation is a Longstring, the With operation produces a Longstring result, even if none of the operands are themselves Longstrings. For example, if %x is a Longstring, and %a and %b are String Len 255, each with 200 bytes of data:
%x = %a with %b
%x will be 400 bytes long, containing all of %a concatenated with all of %b. If either of the operands of such a With clause is itself an expression, that expression is treated as if its target were also a Longstring. For example, if in
%x = %a with (%b with %c)
%x is a Longstring, and %a, %b, and %c are String Len 255, each with 200 bytes of data, %x will end up being 600 bytes long, containing all of %a concatenated with all of %b with all of %c. This works the same way if the assignment is written as either of the following:
%x = (%a with %b) with %c
%x = %a with %b with %c
Expression processing is the same for string literals, so if %x is a Longstring, and %a is a String Len 255 with 255 bytes of data, the following assigns 258 bytes to %x:
%x = %a with '...'
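The difference between a String target and a Longstring target can be sketched in a short request. This is an illustration under stated assumptions, not tested code: the variable names are made up, $Lstr_Left is assumed to pad its result to the requested length with its third argument, and $Lstr_Len is assumed to return the length of a Longstring. If the rules above hold, the first length printed would be 255 and the second 400.

```
begin
* Illustrative names; $lstr_left padding and $lstr_len are assumptions
%a is string len 255
%b is string len 255
%s is string len 255
%l is longstring
* Build two 200-byte values
%a = $lstr_left('a', 200, 'a')
%b = $lstr_left('b', 200, 'b')
* No Longstring involved: the result is silently truncated
%s = %a with %b
print $len(%s)
* Longstring target: the With operation is upgraded
%l = %a with %b
print $lstr_len(%l)
end
```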
Another way of looking at this is that in the presence of Longstring variables, whether as the target or as one of the operands, all concatenation operations are "upgraded" to be Longstring concatenations. One side-effect of this is that if an operand of a concatenation is a Longstring, Longstring truncation rules apply to the ultimate target of the assignment. For example, suppose %long is a Longstring containing 'Testing...', and %short is a String Len 12:
%short = (%long with '123') with '456'
The result is a request-cancelling truncation error, because the result of all the concatenation operations is treated as a Longstring, albeit one with less than 255 bytes of data. The cancellation can be avoided with the use of the $str function, as in the following:
%short = $str((%long with '123') with '456')
Though, again, this is simply carrying on the dubious User Language programming practice of truncation by assignment.
Note that the "upgrading" of With operations to Longstring With operations is not induced by a Longstring variable or expression inside a $function call. For example, %long is a Longstring with 30 bytes of data, and %short is String Len 10:
%short = '*' with $substr(%long, 1, 20)
%short ends up containing an asterisk followed by the first 9 bytes of %long. The assignment is made with silent truncation, because the result of a non-longstring-capable $function is always treated as a regular String for the purposes of assignment and With processing.
In a context where a Longstring is automatically converted to a numeric datatype, a request-cancelling truncation error occurs if the Longstring value is longer than 255 bytes, even if most or all of these bytes are leading zeros. For example, %long is a Longstring with 300 zeros followed by a one:
%a = %long + 1
The result is a request-cancelling truncation error. Fortunately, one is unlikely to encounter numbers with more than 255 digits. Longstring data used in a numeric context undergoes the same dubious automatic conversion of invalid numeric data to zero as String data does.
Note: One case of automatic conversion to numeric where String and Longstring behaviors differ is index loop control variables. For example, as of Sirius Mods Version 6.7, the following loop is valid if %s is a String, but it results in a compilation error if %s is a Longstring:
for %s from 1 to 2
   print %s
end for
If the result of a numeric operation on a Longstring is then used in a With operation, the With operation is not upgraded to a Longstring With operation, because the intermediate result of the numeric operation is not a Longstring but a numeric, which is then automatically converted to a String intermediate result. For example, %long is a Longstring containing 99 and %short is String Len 2:
%short = %long + 1
The result is not a request cancellation; instead, a M204.0552: VARIABLE TOO SMALL FOR RESULT message is issued, and an asterisk ( * ) is assigned to %short. Similarly, with these definitions and values
%short = (%long + 1) with '*'
results in a 10 being assigned to %short with no warnings, exactly the behavior if %long were a String Len 255.
Comparison operations such as Eq, Lt, Le, >, <, etc. will perform longstring comparisons if either of the operands is a Longstring; that is, comparison operations involving Longstring operands behave pretty much as expected.
One important point to keep in mind is that Model 204's expression processing behavior is not changed at all unless Longstring variables or $functions are used, and then only changed in the statements where they are actually used. So the effect of any use of Longstring variables or $functions is limited to the statements that use them.
Longstrings and $functions
Longstrings can be used as inputs to $functions. As mentioned before, if a Longstring expression is assigned to a regular String, a request-cancelling truncation error will occur if the target String variable is not big enough to hold the source Longstring. Request-cancelling truncation errors also occur if a Longstring that is longer than 255 bytes is passed to a non-Longstring-capable $function. For example:
print $substr(%long, 1, 50)
would result in request cancellation if %long was longer than 255 bytes. One way around this would be to use the $str function to tell User Language to treat the Longstring as a String in this case, as in:
print $substr($str(%long), 1, 50)
though a better approach in this case would be to use the Longstring-capable sub-stringing function, $Lstr_Substr, as in:
print $lstr_substr(%long, 1, 50)
The Longstring-capable $functions in this manual typically start with "$lstr", end in "_lstr" (such as $ListInf_Lstr), or belong to a family of $functions (such as the $Regex family) that are completely Longstring-capable. Longstring-capable $functions specific to other Sirius products (like the Janus Web Server and Janus Sockets $functions) typically do not use an "lstr" prefix or suffix, but they are identified in their documentation as Longstring-capable.
In addition to their ability to process more than 255-byte long strings, Longstring-capable $functions have some special characteristics pertaining to expression handling:
- A Longstring-capable $function that returns a string result (as opposed to one that returns a numeric result such as $Lstr_Index) is treated as a Longstring expression for the purposes of truncation and for the upgrading of With operations to Longstring With operations. For example, if %short is String Len 5 and %junk contains Some text:
  %short = $lstr_substr(%junk, 1, 7)
  would result in a request-cancelling truncation error. This is true whether %junk was a Longstring or a regular String, though the latter illustrates the point that regular String variables (or expressions) can be used as input to Longstring-capable $functions. If %junk contained 300 bytes of data:
  %out = $lstr_substr(%junk, 1, 255) with '*'
  would result in a request-cancelling truncation error if %out were a regular String variable, and would result in 256 bytes, the last byte being an asterisk, being assigned to %out if %out were a Longstring.
- All string arguments to Longstring-capable $functions are treated as Longstring targets for the purpose of upgrading With operations to Longstring With operations. For example, since $Lstr_Right is Longstring-capable, With in its string argument is upgraded to Longstring. So, if %medium is a string containing 252 or more characters, then:
  $lstr_right(%medium with '****', 256)
  returns the right-most 252 bytes of %medium, concatenated with four asterisks.
  Note: This behavior does not imply that Longstring-capable $functions will always accept strings longer than 255 bytes as their arguments. For example, $Lstr_Index will not accept strings longer than 255 bytes as its second argument (the string being searched for), and $Lstr_Right and $Lstr_Left won't accept any strings longer than a single byte for their third argument (the pad character). This $function-specific behavior does not affect the treatment of the $function results or arguments as Longstring data for expression handling purposes.
Longstrings and complex subroutines
Complex subroutine parameters, both Input and Output (or Inout, which means the same thing as Output), can be defined as Longstring, as in either of the following:
subroutine chop(%x is longstring input)
subroutine chop(%x is longstring output)
In addition, Longstring variables and expressions can be passed as parameters to complex subroutines. For Output parameters, longstring issues are fairly straightforward. There are two restrictions:
- You cannot pass a Longstring as a parameter to a subroutine that defines the parameter as String Output.
- You cannot pass a regular String as a parameter to a subroutine that defines the parameter as Longstring Output.
For Input parameters, things are somewhat more complex, because:
- Mismatches in String and Longstring datatypes are allowed between passed value and declared parameter.
- Input parameters can actually receive the results of expressions as their inputs.
Although Strings and Longstrings may be passed interchangeably for Input parameters declared as either String or Longstring, subroutine declaration statements (Declare Subroutine) must exactly match the parameter types on the actual subroutine definitions. That is, given a declaration like this:
declare subroutine tender(longstring)
One cannot later specify the subroutine as
subroutine tender(%mercy is string len 255)
If a Longstring parameter is passed to a subroutine with the parameter defined as String Input, the request is cancelled if the Longstring value is longer than the length of the String Input parameter (as always, this will happen even if the Longstring value is shorter than 255 bytes). This mimics the behavior of an assignment of a Longstring variable to a regular String variable.
If a Longstring array is passed to a subroutine with the parameter defined as a String Array, the request is cancelled if any element of the Longstring array is longer than 255 bytes, whether or not that element is ever referenced in the complex subroutine. Beyond the functionality issues raised by this limitation, it also implies an inefficiency in passing a Longstring array to a String parameter: the array must be scanned for values longer than 255 bytes. Because of both the functionality and efficiency issues, it is probably best to avoid passing a Longstring array to a String array parameter if at all possible.
Because a String variable or a literal can always fit into a Longstring parameter, there are no truncation or other issues associated with passing String variables and literals as parameters defined as Longstring.
If a call to a complex subroutine contains a With operation for a Longstring parameter, that With operation is “upgraded” to a longstring With operation, whether or not any of the operands are themselves Longstrings, exactly as if the target of a With operation were a Longstring variable. As everywhere else, a With operation involving a Longstring in a subroutine call will also be upgraded to a Longstring With operation, meaning that no truncation will occur at 255 bytes, and that if the result is longer than the length of the target String parameter, the request will be cancelled.
Changing Longstring truncation behavior
While it is sometimes convenient that Model 204 silently truncates string data on assignment to a variable or intermediate result, it has also been the source of a vast number of incorrect User Language programs. Because of this history and the higher chance of unintentional truncation from a Longstring source, the default behavior for Longstrings is that any truncation on assignment from a Longstring, longstring $function, or longstring With operation causes request cancellation. This behavior should facilitate “cleaner” and more robust code: where truncation is intended, it is explicitly indicated (for example, with $Lstr_Substr, $Lstr_Left, or $Str).
Nevertheless, since this cancellation-on-truncation behavior is inconsistent with Model 204's behavior for Strings, it might be viewed as undesirable. If you want to prevent request cancellation on truncation of a Longstring source in an Online, you can MSGCTL the error message for Longstring truncation to NOCAN.
The three messages that you might need to MSGCTL are MSIR.0680, MSIR.0681, and MSIR.0682.
- MSIR.0680 is issued if the SIRFACT system parameter X'01' bit is set, or if the Model 204 DEBUGUL user parameter is set to a non-zero value.
- MSIR.0681 is issued for requests entered at command level rather than run from a procedure.
- MSIR.0682 is issued otherwise.
Issuing MSGCTL for these messages to NOCAN might prevent request cancellation from the occasional Longstring truncation, but if silent truncation of Longstrings is heavily used as a programming “technique” inside a request, the user running the request will quickly be restarted with a “TOO MANY ERRORS” message. To prevent this, MSGCTL the indicated messages to NOCOUNT.
Even then, a large number of these messages might be viewed as being annoying, at best, if the intent is to simply ignore silent truncation of Longstrings. In that case, MSGCTL the indicated messages to NOTERM and maybe even NOAUDIT (if this latter is available). Even then, there will be a little Model 204 processing overhead in producing the messages that are everywhere suppressed, so it would still generally be more efficient to truncate Longstrings explicitly using $Str, $Lstr_Substr or $Lstr_Left.
If you use the default Longstring behavior, at least in the development and test environments, you should find it will rapidly catch potential problems and so produce more bug-free code. The request cancellation due to longstring truncation should therefore be a benefit. In those places that “truncation by assignment” is used in the code, if you change any of the types in the source expression and discover request cancellation, you will probably decide it is better to use an explicit truncation construct, rather than to retain this dubious coding practice.
If there is concern about request cancellation in a production region, you can MSGCTL the indicated messages to NOCAN in production. However, such a switch allows a production request to continue after an unanticipated longstring truncation, so it could result in data corruption or a more subtle error later in the request that will cause request cancellation anyway, but be more difficult to diagnose.
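As a sketch, suppressing cancellation for the three truncation messages named in this section might look like the following commands. The one-option-per-command form is an assumption chosen for caution; consult the MSGCTL command documentation for the exact syntax and the privileges required at your site.

```
MSGCTL MSIR.0680 NOCAN
MSGCTL MSIR.0681 NOCAN
MSGCTL MSIR.0682 NOCAN
```

The NOCOUNT, NOTERM, and NOAUDIT options discussed above would presumably be specified in the same way.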
Longstrings and the Print, Html, and Text statements
Using Longstring expressions in the Print statement works largely “as expected”: given the constraints of LOBUFF, OUTCCC, and OUTMRL, and other output target specific parameters, the values of Longstrings are simply displayed to the output target. One minor exception to this is that the To and At clauses on the Print statement are not supported for Longstrings.
It should also be kept in mind that the With keyword in Print statements is not the With concatenation operator, although the result is usually the same as if it were. Specifically, the With keyword results in the part before the With being printed, followed by the part after. This means that if two regular String variables, each with 255 bytes of data in them, are printed as follows:
print %a with %b
510 bytes of data would be printed, which is different from the With operator in an assignment like the following, which will result in %c simply containing the contents of %a, because the With operation results in truncation at 255 bytes:
%c = %a with %b
This difference between the With keyword in the Print statement and the With operator in expressions predates Longstrings and is, in fact, more significant with regular Strings than with Longstrings.
The HTML and Text statements allow variable values or expression results to be embedded inside the expression start and end characters (defaults: { and }). As with the Print statement, this works pretty much “as expected” for Longstrings: the contents of the Longstring variable or the result of a Longstring expression will be displayed in their entirety within display parameter constraints. The only Longstring related issue for HTML statement expressions is that if an expression is not a Longstring variable, a longstring $function, or a With operation involving one or more of these, the expression is assumed to be a regular String expression that undergoes silent truncation at 255 bytes. For example, if %a and %b were regular String variables both containing 200 bytes of data, the following would truncate the concatenation of %a and %b at 255 bytes:
text data
The result is {%a with %b}
To get around this, one can force the With operation to be upgraded to a Longstring With operation, using:
text data
The result is {$lstr(%a with %b)}
However, the use of With operations in Html statements is generally silly, since the same result can be obtained by simply entering each operand in the With expression as a separate expression, as in:
text data
The result is {%a}{%b}
Longstrings and methods
In addition to their use as local variables and as inputs to or outputs from $functions and complex subroutines, Longstrings can, of course, also be used in the object-oriented constructs made available by the Janus SOAP User Language Interface. These uses include:
- As structure or class members.
- As input parameters to both User Language and system methods.
- As the result (output value) of User Language and system methods.
In fact, all system methods are Longstring-capable, so they behave, for the purposes of truncation and upgrading of With operations, the same as Longstring-capable $functions. Therefore any With operation whose result is an input to a system method causes the With operation to be upgraded to a Longstring With operation. Similarly, any implicit truncation of the result of a system method results in request cancellation.
User Language methods, on the other hand, can declare their inputs and output as Longstrings or Strings of a specific length. Longstring inputs and results exhibit the same truncation and With operation behavior as string inputs to system methods. For example, consider the following function declaration in some class:
function encode(%in is longstring) is longstring
If the method is invoked as follows:
%x = %foo:encode(%a with %b)
the %a with %b is upgraded to a Longstring With operation, because its target (the %in parameter in the Encode function) is a Longstring. Similarly, if %x is a standard String variable with some specific length, the request will be cancelled if the result of the Encode method is longer than %x's declared length.
String inputs and output, on the other hand, will behave like standard String variables for the purposes of truncation and With operation behavior. For example, consider the following function declaration in some class:
function stubby(%in is string len 4) is string len 2
If the method is invoked as follows:
%x = %foo:stubby(%a with %b)
the %a with %b is not upgraded to a Longstring With operation because its target (the %in parameter in the Stubby function) is not a Longstring. Of course, if either %a or %b is a Longstring, then the With operation will be a Longstring With operation, anyway. If neither %a nor %b is a Longstring and %a contains foo and %b contains bar, the result of the With operation would be foobar, which would be silently truncated to foob when assigned to the input parameter %in.
Similarly, if %x is a String Len 1, and the Stubby method returns ok, the ok would be silently truncated to o when assigned to %x. In fact, if the Stubby method had the following statement:
return 'Not OK'
the return value would be silently truncated to No before being assigned to the target variable, even if the target variable for the Stubby invocation was longer than two bytes. On the other hand, if the Stubby method had the following statement:
return %schooner
and %schooner was a Longstring with a value longer than two bytes, the request would be cancelled because of longstring truncation, even if the target variable for the Stubby invocation was, itself, a longstring.
Finally, Sirius Mods Version 7.2 introduced support for intrinsic methods. As for all other system method inputs, intrinsic String system methods all behave as if their method string was a Longstring. For example, in:
%x = (%a with %b):right(40, pad='*')
the %a with %b would be upgraded to a Longstring With, even if neither %a nor %b were a Longstring.
For intrinsic and other methods, the fact that all string inputs are treated as Longstrings does not mean that the method will necessarily accept arbitrarily long values. In fact, it's quite possible for a parameter to be restricted to being a single character. For example the intrinsic String Right method has a named parameter called Pad that cannot be longer than one byte:
%y = %x:right(50, pad=%pad)
In this example, if %pad had a value longer than a single byte, the request would be cancelled, in spite of the fact that the parameter behaves like a Longstring parameter.
Longstring performance
The first 255 bytes of Longstrings are always kept in STBL, so the code path for manipulating a Longstring variable with a value that is shorter than 256 bytes is usually identical to or only slightly greater than the code path for manipulating a regular String variable. A Longstring variable always has 257 bytes of STBL allocated for it at compile time, and it requires somewhat more VTBL space than a regular String variable. Longstring arrays require 257 bytes of STBL per element and some VTBL space per element. This is unlike regular String variables, which require no per-element VTBL space.
Because of the minor code path issues and the table space issues just mentioned, it is probably not a good idea to use Longstring variables in contexts where the values are never expected to exceed 255 bytes, unless performance is not a major concern, or unless the extra error detection for Longstring truncation is desired.
Of course, variables that need to hold more than 255 bytes of data must be declared as Longstrings, and any data beyond 255 bytes gets stored in CCATEMP. This means manipulation of very long Longstring variables could result in significant logical and even physical CCATEMP I/O and higher CCATEMP utilization. In addition, very long Longstring values means large quantities of data need to be scanned or copied, which in itself could be a source of CPU overhead. This is not to say that long values should not be used in Longstrings in applications; quite the contrary. Longstrings are designed for applications that require long values, and the performance of Longstring manipulation, even for very long values, will generally be pretty good.
Nevertheless, it is a good idea to avoid unnecessary, very long, Longstring operations - unnecessary because the application does not require it, or because the operation has already been performed once. Regarding the latter, if a very long Longstring operation occurs in a loop, it would be better to move the operation outside the loop if possible, or to only do it conditionally if it's really required and hasn't already been performed in a previous iteration of the loop.
There is relatively little space overhead for the part of a Longstring that resides in CCATEMP: 6124 of the 6144 bytes on each CCATEMP page actually hold data. So the first 255 bytes of a 60,000-byte Longstring value are stored in STBL, and the remaining 60,000-255 bytes are stored on (60,000-255)/6124 rounded up, or 10, CCATEMP pages. Intermediate results will also use some CCATEMP space, though this usage will typically be short-lived, the space being released as soon as the statement completes. So, for example, if %a and %b are Longstring variables each with 60,000 bytes of data, and %c is a Longstring variable, the following statement will temporarily require an extra 120,000 bytes of space (255 of them in STBL) to hold the result of the %a with %b operation:
%c = $lstr_substr(%a with %b, 60000, 60000)
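The CCATEMP page count quoted above for a single 60,000-byte value can be worked through explicitly, using the figures given in this section (255 bytes held in STBL, 6124 data bytes per CCATEMP page) and rounding up to whole pages:

```latex
\[
\left\lceil \frac{60000 - 255}{6124} \right\rceil
= \left\lceil \frac{59745}{6124} \right\rceil
= \lceil 9.756\ldots \rceil
= 10 \text{ pages}
\]
```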
Because concatenation of one string to another is such a common operation, assignment of the concatenation of a Longstring variable and another string to the first Longstring variable is highly optimized. For example, if %long is a longstring with 50,000 bytes of data:
%long = %long with '!'
would simply tack an exclamation mark on the end of %long rather than copying all of %long and an exclamation mark and then assigning that string to %long. Note, however, that this optimization is only performed if a single string is being concatenated with the current value of the target variable. That is, in the following:
%long = %long with '+' with $time
an intermediate Longstring containing the concatenation of %long and a plus sign will be created. That intermediate Longstring will then be concatenated with the current time (as returned by $time) and then assigned to %long. This means that the current contents of %long end up being copied twice in such a case which, if %long contains 50,000 bytes, is 100,000 bytes worth of data movement, which will be quite expensive by any standard. Fortunately, it is easy to “help out” the compiler to make this operation more efficient:
%long = %long with ('+' with $time)
In this case, the concatenation of the plus sign and the current time are assigned to an intermediate work Longstring. Then, because this intermediate value is simply being concatenated with %long and then assigned back to %long, the concatenation optimization results in the intermediate work Longstring simply being tacked on to the end of %long, requiring almost no data movement at all. Even in cases where the concatenation can't be optimized to an append operation, it is usually a good idea to isolate concatenations involving relatively small values from a preceding one involving a (potentially) very long one.
For example, if a longstring with a potentially large value is being bracketed by the date and time, using a greater-than and less-than symbol as separators, the following:
%long = $date with '>' with %long with ('<' with $time)
will be more efficient than
%long = $date with '>' with %long with '<' with $time
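The append optimization matters most when a Longstring is built up inside a loop. The following request is a minimal sketch with illustrative names and an arbitrary iteration count. It groups the small pieces in parentheses so that each iteration concatenates a single string onto the target, which is the form described above as eligible for the optimization.

```
begin
* Illustrative names; the loop bound is arbitrary
%long is longstring
%i is float
for %i from 1 to 1000
* The parentheses combine the small values first, so only one
* string is concatenated to the current value of %long
   %long = %long with ('+' with %i)
end for
end
```

Written without the parentheses, each iteration would first build an intermediate copy of %long with the plus sign, so the cost of the loop would grow with the accumulated length of %long.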