Intrinsic classes: Difference between revisions
You may want to go directly to:
*[[List of Float methods]]
*[[List of String methods]]
*[[List of Unicode methods]]
*[[List of Intrinsic methods]]
===Definition of Intrinsic classes===
In “pure” [http://en.wikipedia.org/wiki/Object-oriented_programming object-oriented languages], all datatypes are objects, that is, all datatypes are
extension classes of some base object class. This means that even things that one
might not immediately think of as objects, such as character strings or numbers, are, in
fact, considered to be objects.

There are two aspects to the assertion that strings or numbers are objects:
*They are internally managed in exactly the same way as any other objects. That is, a string or numeric variable is actually a reference to an object that contains the string or numeric value, rather than itself directly containing the string or numeric value.
*They are syntactically identical to other objects. That is, the syntax for manipulating strings or numbers is the same as the syntax for manipulating other objects. More specifically, methods are applied to strings or numbers using the exact same syntax as when methods are applied to other objects.

Since the first aspect is concerned with the internal management of strings or numbers,
it is largely irrelevant from the perspective of a programmer using a “pure” object-oriented
language. In fact, many languages that profess to be “pure” actually “cheat,”
internally, and they special-case string and numeric data. They do so because string
and numeric data is so important, and so heavily used in all programming languages,
that insisting on not treating it specially (internally) is pedantry that has a cost in
performance. To see why, consider a statement that adds two numbers together (the
language for the example is, of course, <var class="product">SOUL</var>):
<p class="code">%x is float
%y is float
...
%x = %y + 13 </p>
Insisting that numbers are no different from any other object means that this statement
must get the value referenced by <code>%y</code>, add it to the number referenced by the literal <code>13</code>,
create a new <var>Float</var> instance and set <code>%x</code> to reference that new instance. Clearly, this
would be less efficient than taking the number in <code>%y</code>, adding 13 to it, and setting <code>%x</code> to
the result. Fortunately, the way this is actually done is purely an under-the-covers
implementation issue, so one can always pretend that “pure” object-oriented processing
is being used.

This is so because numbers and strings are an example of a special class of objects
called '''immutable''' objects. Immutable objects, as the name suggests, cannot be
changed once they are created.

To illustrate the difference between immutable objects
and mutable objects, consider two variables:
<p class="code">%count is float
%account is object bankAccount </p>
If <code>%account</code> were set to a new object instance, it would not be surprising if, after it's set,
the object changed. For example, it wouldn't be surprising if the account balance for
<code>%account</code> changed, even if the change was done via a different variable:
<p class="code">print %account:balance
%myAccount = %account
%myAccount:addToBalance(23)
print %account:balance </p>
In this example, it is expected that the balance displayed in the second <code>Print</code> statement
will be 23 greater than that displayed by the first.

On the other hand, given the following:
<p class="code">print %count
%myCount = %count
%myCount:addToValue(23)
print %count </p>
It would be surprising for some operation on <code>%myCount</code> to have changed the value in
<code>%count</code>. In general, for variables that contain numeric or string values, one would only
expect the value to change when a new value is assigned to it. In fact, one would not
expect to see methods like <code>AddToValue</code> that modify the method object for a numeric
datatype.

More typical for a numeric datatype would be a method that manipulates the value and
produces a new value (object):
<p class="code">%myCount = %myCount:addToValue(23) </p>
Even pure object-oriented languages, however, have special syntax for the common
operation of addition:
<p class="code">%myCount = %myCount + 23 </p>
The absence of methods that modify a value is what makes a class immutable, and in “pure”
object-oriented languages, basic string and numeric datatypes are always immutable.
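The difference between mutable and immutable objects can be sketched in Python, which is used here only as an analogy and is not SOUL. The <code>BankAccount</code> class and its <code>add_to_balance</code> method are hypothetical stand-ins for the <code>bankAccount</code> class used in the examples above.

```python
class BankAccount:
    """Hypothetical mutable object, standing in for the bankAccount class above."""

    def __init__(self, balance):
        self.balance = balance

    def add_to_balance(self, amount):
        # Mutates the object in place, so every reference sees the change.
        self.balance += amount


account = BankAccount(100)
my_account = account            # a second reference to the same object
my_account.add_to_balance(23)   # the change is visible through account as well

count = 100
my_count = count                # numbers are immutable
my_count = my_count + 23        # rebinds my_count to a new value; count is untouched
```

After this runs, <code>account.balance</code> reflects the update made through <code>my_account</code>, while <code>count</code> keeps its original value.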
= | |||
<div id="Intrinsic methods in User Language"></div> | |||
==Intrinsic methods in SOUL== | |||
<var class="product">SOUL</var> is a legacy programming language for which object-oriented capabilities | |||
are a relatively recent addition. So, one might expect that strings and numbers are not | are a relatively recent addition. So, one might expect that strings and numbers are not | ||
maintained internally as objects, since their existence pre-dates objects. This is in fact | maintained internally as objects, since their existence pre-dates objects. This is in fact | ||
the case, though as was shown, this is largely irrelevant from a | the case, though as was shown, this is largely irrelevant from a <var class="product">SOUL</var> | ||
programmer's perspective, | programmer's perspective, and even if object-oriented capabilities were present in <var class="product">SOUL</var> from its inception, strings and numbers might have been special-cased for | ||
efficiency anyway. | efficiency anyway. | ||
<table class="thJustBold"> | |||
<tr><th>Float</th> | |||
<td>See [[List of Float methods]]</td></tr> | |||
< | <tr><th>String</th> | ||
< | <td>See [[List of String methods]]</td></tr> | ||
< | |||
< | |||
</ | |||
= | <tr><th>Unicode</th> | ||
Beyond the internal representation of strings and numbers, a distinction between | <td>See [[List of Unicode methods]]</td></tr> | ||
</table> | |||
<div id="The benefits of object-oriented syntax in User Language"></div> | |||
==The benefits of object-oriented syntax in SOUL== | |||
Beyond the internal representation of strings and numbers, a distinction between <var class="product">SOUL</var> and pure object-oriented languages was that the <var class="term">object</var>:<var class="term">method</var> syntax | |||
was not used to manipulate strings and numbers; instead, strings and numbers were | was not used to manipulate strings and numbers; instead, strings and numbers were | ||
manipulated via $functions. | manipulated via $functions. This was changed to allow the <var class="term">object</var>:<var class="term">method</var> | ||
syntax against string and numeric variables and constants. | syntax against string and numeric variables and constants. | ||
The following code fragment illustrates how the same operation can be accomplished | |||
traditional $functions and | The following code fragment illustrates how the same operation can be accomplished with | ||
traditional $functions and with object-oriented syntax: | |||
<p class="code">%name is string len 32 | |||
... | |||
print $len(%name) | |||
print %name:length </p> | |||
While this might not seem very significant, it provides considerable value: | While this might not seem very significant, it provides considerable value: | ||
<ul> | |||
<li>It allows <var class="product">SOUL</var> to be used as a “pure” object-oriented language. This might be especially appealing to programmers who were trained in a pure object-oriented language. | |||
<li>It provides the benefit that expressions can generally be read in the natural left-to-right manner rather than the inside-to-outside manner required to understand an expression coded with $functions. | |||
<li>It provides capabilities to methods that operate on strings or numbers that are not available with $functions. These include support for named parameters and the ability to take objects as input parameters and to produce objects as results. While it would have been possible to extend $functions to have this same functionality (much as Sirius Mods provided [[Calling Sirius Mods $functions|callable]] $function support), it makes more sense to provide it using true object-oriented syntax. Given that this has now been done, it is unlikely that these capabilities will ever be added to $functions. | |||
</ul> | |||
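The left-to-right readability point can be illustrated with a small Python analogy (not SOUL). The sample string and the particular operations are arbitrary choices made for this sketch.

```python
text = "  soul  "

# Function style: the innermost call runs first, so the reader works inside-out.
nested = str.replace(str.upper(str.strip(text)), "S", "$")

# Method style: each step reads left to right, in the order it is applied.
chained = text.strip().upper().replace("S", "$")
```

Both forms compute the same value; only the reading order differs.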
==Two generic intrinsic classes: string and numeric==
<var class="product">SOUL</var> traditionally allowed declarations of many different datatypes. These
included Strings of a specified length with a possible DP value, Fixed numerics (also
with a possible DP), and Float numerics. In addition, <var class="product">Sirius Mods</var> provided support for
<var>[[Longstrings|Longstring]]</var> datatypes. And images provide support for a variety of additional datatypes,
including packed and zoned formats.

Essentially, however, there are two categories of
datatypes in <var class="product">SOUL</var>: string and numeric. In <var class="product">SOUL</var> one can easily
assign values from one datatype to another, even between numeric and string types.
This is intuitive and useful. Typically, in the “real world” one doesn't distinguish between
the string representation of a number and its numeric value used for calculation.
Similarly, except for performance purposes and, perhaps, value limits, one is typically
not concerned about how a value is stored internally.

This is not the case in some “pure” object-oriented languages where the paradigm of
strong-datatyping is extended to numeric and string datatypes. This means that if one
wants to display the value of a numeric variable, one must explicitly convert it to a string
[...]
reinforced by the fact that the general industry trend is away from strong-datatyping for
strings and numbers and toward implicit conversion between these types.

It is worth pointing out, however, that loose-datatyping between strings and numbers
does not imply loose-datatyping between strings and numbers and other classes. That
is, there are generally no implicit conversions of strings or numbers to or from non-string, non-numeric objects.

Because loose-datatyping and the existing <var class="product">SOUL</var> datatypes work quite well in
facilitating rapid development of reasonably tight and efficient code, support for methods against string and numeric types does not add any new
datatypes to <var class="product">SOUL</var>. And, because of loose-datatyping, the numeric and string
methods can be applied to any of the standard <var class="product">SOUL</var> string and numeric
datatypes. Because they apply to these intrinsic <var class="product">SOUL</var> datatypes, these
methods are called '''intrinsic''' methods.
Intrinsic methods can be further classified into these subsets:
<table class="thJustBold">
<tr><th>Float</th>
<td>Methods that perform numeric manipulation on values in <var class="product">Model 204</var>'s <var>Float</var> format.</td></tr>

<tr><th>Fixed</th>
<td>Methods that perform numeric manipulation on values in <var class="product">Model 204</var>'s <var>Fixed</var> format (with an assumed DP of 0).</td></tr>

<tr><th>String</th>
<td>Methods that perform string manipulation, with <var>Longstring</var> capability assumed.</td></tr>

<tr><th>Unicode</th>
<td>Methods that perform Unicode string manipulation, return Unicode results, or are based on the [[Unicode#Support for the ASCII subset of Unicode|Unicode tables]].</td></tr>
</table>
The <var>Float</var> intrinsic class can really be thought of as a generic numeric class, as it is a
convenient way of representing both integer and decimal data. This is different from
most other programming languages where a Float datatype suffers from inconvenient
behavior, especially for decimal numbers.

The <var>Float</var> class is so convenient that all numeric parameters to system methods and
$functions are treated as <var>Float</var> parameters. Also, because of the convenience of the
<var>Float</var> class, there are currently no methods in the <var>Fixed</var> class.

As such, the intrinsic methods can be thought of broadly as belonging to two intrinsic
classes: the Numeric class and the String class. These classes behave largely as if their
inputs were <var class="product">SOUL</var> <var>Float</var> and <var>Longstring</var> datatypes, respectively. That is, for all
intents and purposes, the following are true:
<ul>
<li>Any non-<var>Float</var> values, including <var>Unicode</var>, are converted to <var>Float</var> values before being processed by a Numeric method. This includes the conversion of any non-numeric strings into 0 that occurs in other <var class="product">SOUL</var> contexts that take a numeric input.
<li>For the purposes of truncation and the <var>With</var> operation, all <var>String</var> method string inputs or outputs behave as <var>Longstrings</var>.
</ul>
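The first rule above, in which non-numeric strings become 0 before a Numeric method sees them, can be sketched as a Python analogy. The helper <code>to_float</code> is hypothetical and only approximates the idea; it is not a description of the exact conversion rules <var class="product">SOUL</var> applies.

```python
def to_float(value):
    """Hypothetical helper: coerce a value to float, mapping non-numeric input to 0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Loose datatyping: a value that cannot be read as a number becomes 0.
        return 0.0


numeric_string = to_float("12.5")   # a numeric string converts to its value
plain_number = to_float(7)          # a number passes through unchanged in value
non_numeric = to_float("abc")       # a non-numeric string converts to 0
```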
==Strings and numbers as method objects==
The term "method object" (or "intrinsic method object") is used for
the value or variable to which an intrinsic method is applied,
even though the value or variable isn't really an object.
This is justified because, as noted above, strings and numbers can be considered
immutable objects, regardless of their history or internal representation.
For example, in the following case,
<code>%str</code> is the intrinsic method object for the <code>Length</code> method:
<p class="code">%str is string len 32
...
print %str:Length
</p>
Intrinsic methods can be applied to constants, in addition to variables.
For example, the following assigns the length of the string literal
<code>Whatever</code> to <code>%x</code>:
<p class="code">%x = 'Whatever':length
</p>
An intrinsic method can take the output from another method,
intrinsic or not, as its input.
For example, the following uses the <var>Right</var> method (which gets the
rightmost characters of a string) against the output of a <var>Stringlist</var>
<var>Item</var> method:
<p class="code">%list is object stringlist
...
%value = %list:item(%i):right(8)
</p>
Since it is unnecessary to explicitly specify the <var>Item</var> method for
a <var>Stringlist</var>, the above can also be written as:
<p class="code">%list is object stringlist
...
%value = %list(%i):right(8)
</p>
Intrinsic methods can also be applied to <var class="product">SOUL</var> <var>Image</var> and <var>Screen</var> items:
<p class="code">image michigan
romulus is string len 32
...
end image
...
%value = %michigan:romulus:right(8)
</p>
And intrinsic methods can be applied to the output from a $function:
<p class="code">%seconds = $time:right(2)
</p>
Intrinsic methods can even be applied to the results of expressions:
<p class="code"> ...
%value = (%tweedledum with %tweedledee):right(8)
</p>
==Automatic "implicit" conversion of intrinsic values==
As with other methods, the colon that separates the intrinsic method object from the method
name can be optionally preceded or followed by spaces. This could be done to enhance
readability, or even to split a long line. The following four statements are all equivalent:
<p class="code">%value = %list:item(%i):right(8)
%value = %list: item(%i): right(8)
%value = %list :item(%i) :right(8)
%value = %list : item(%i) : -
right(8) </p>
As with most other uses of intrinsic variables or values, if the method object is of a
different type than the datatype on which the method operates, the input value is
automatically converted into the target datatype. For example, the expression <code>7/3</code>
clearly produces a numeric value, but the <var>Left</var> method operates on strings. Consider the following statement:
<p class="code">%i = (7/3):left(4) </p>
Here, the result of the division is converted into a string and then passed to <var>Left</var>, which
produces <code>2.33</code> as its result.
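The same convert-then-operate sequence can be mimicked in Python, purely as an analogy. The <code>left</code> function is a hypothetical stand-in for the <var>Left</var> method, and Python's number-to-string formatting is not claimed to match <var class="product">SOUL</var>'s beyond the leading characters shown.

```python
def left(value, length):
    """Hypothetical analog of Left: convert the input to a string, then keep the leading characters."""
    return str(value)[:length]


result = left(7 / 3, 4)   # the numeric quotient is turned into a string before slicing
```

Because the conversion happens inside the call, the caller never has to convert the number explicitly.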
Because of this automatic conversion, the specific class (<var>String</var>, <var>Float</var>, <var>Unicode</var>) of an
intrinsic method cannot be determined from the datatype of its method object. This
means that the class must be determined only from the name of the method, and so
method names must be unique among all intrinsic classes (that is, among
<var>String</var>, <var>Float</var>, and <var>Unicode</var>). For example, there may not be a <var>Length</var> method in both the
<var>String</var> and <var>Float</var> intrinsic classes. In the case of <var>Length</var>, the method is an intrinsic <var>String</var>
method, there is no comparable <var>Float</var> method, and <var>UnicodeLength</var> is the comparable
<var>Unicode</var> method.

<p class="note">'''Note:''' Because some EBCDIC characters are not translatable to Unicode, and vice
versa, automatic conversions involving Unicode values may cause request cancellation.
The explicit intrinsic conversion methods (like <var>EbcdicToUnicode</var> and <var>EbcdicToAscii</var> and
their counterparts <var>UnicodeToEbcdic</var> and <var>AsciiToEbcdic</var>), however, have a parameter (as
of <var class="product">Sirius Mods</var> version 7.6) that lets you decode or encode untranslatable characters. </p>
==Intrinsic method syntax: special cases==
<var class="product">SOUL</var> is a legacy programming language that supports some unusual syntax.
Some of this syntax causes problems for the use of intrinsic methods,
as is described for the following three cases:
===Intrinsic methods against database field names===
Field names have very loose naming rules, and names can contain colons
[...]
Because of this, a field name must be contained inside of parentheses
to be used as a method object.
For example, the field <code>big fat greek field</code>
is used as the input to the intrinsic <var>[[Substring (String function)|Substring]]</var> method:
<p class="code">for each record in %recordset
...
%value = (big fat greek field):substring(2, 3)
...
end for
</p>
===Intrinsic methods against percent variables and images that have the same name===
<var class="product">SOUL</var> allows one to declare images and percent variables with the same name
in the same scope.
That is, inside a method, you can declare an <var>Image</var> called <code>holland</code>
and a %variable called <code>%holland</code>.
References to items in the image and to the percent variable both start
with <code>%holland</code>:
<p class="code">image holland
...
factory is float
...
end image
...
%holland is string len 10
...
%a = %holland:factory
...
%a = %holland
</p>
Historically, this has not been a problem because a percent variable
would never be followed by a colon.
However, with intrinsic methods, this is no longer the case.
If the <code>Holland</code> image contained an item called <code>Length</code>, it would
be impossible to tell whether <code>%holland:length</code> referred
to the <code>Length</code> item in the <code>Holland</code> <var>Image</var> or the intrinsic <var>Length</var>
method applied to <code>%holland</code>.
Because of this ambiguity, intrinsic methods cannot be used against percent variables
without a blank between the variable name and the colon if there is an <var>Image</var>
with the same name as the percent variable.
This is true regardless of whether or not there is an actual conflict between
[...]
So, using the above example, to apply the length method to
<code>%holland</code>, you must do one of the following:
<p class="code">%a = %holland :length
%a = %holland : length
</p>
Specifying <code>%holland: length</code> would not work.
A space after the variable name indicates that the reference
is ''not'' to the <var>Image</var>, because <var>Image</var> item references do not allow any
blanks between the <var>Image</var> name, colon, and <var>Image</var> item name.
You can also simply wrap the variable name in parentheses:
<p class="code">%a = (%holland):length
</p>
Of course, the best solution is to avoid using the same name for images
and percent variables, as this is generally somewhat confusing, anyway.
<div id="printtext"></div>
===Intrinsic methods in a Print, Audit, or Trace statement===
The <var class="product">SOUL</var> <var>Print</var>, <var>Audit</var>, and <var>Trace</var> statements all use somewhat unusual syntax.
While initially these statements appear to operate
on expressions, just like an assignment, this is not really the case.
For example, the following statement gets a compilation error:
<p class="code">print 3*4
</p>
Because of this, one has to be careful using intrinsic methods in
a <var>Print</var>, <var>Audit</var>, or <var>Trace</var> statement. The general recommendation is to use <var>PrintText</var>,
<var>AuditText</var>, or <var>TraceText</var>, as described below.
There are several specific syntax problems with <var>Print</var>, <var>Audit</var>, and <var>Trace</var>:
<ul>
<li>Because blanks are treated specially in these statements, you
may not put a blank before the colon in an intrinsic method invocation.
That is, the following is '''''incorrect''''': | That is, the following is '''''incorrect''''': | ||
< | <p class="code">print %x :length | ||
</p> | |||
</ | |||
But the following two statements are allowed: | But the following two statements are allowed: | ||
< | <p class="code">print %x:length | ||
print %x: length | |||
</p> | |||
</ | |||
<li>String and literal constants are treated specially by these | <li>String and literal constants are treated specially by these | ||
statements, so you cannot issue methods against constants in a Print, Audit, | statements, so you cannot issue methods against constants in a <var>Print</var>, <var>Audit</var>, | ||
or Trace statement. | or <var>Trace</var> statement. | ||
That is, the following are '''''incorrect''''': | That is, the following are '''''incorrect''''': | ||
< | <p class="code">print 'Foobar':length | ||
print 22:squareRoot | |||
</p> | |||
</ | |||
<li>Although you might be tempted to use parentheses to get around some of | <li>Although you might be tempted to use parentheses to get around some of | ||
these issues, leading parentheses are '''not''' allowed | these issues, leading parentheses are '''not''' allowed | ||
in a Print, Audit, or Trace statement token. | in a <var>Print</var>, <var>Audit</var>, or <var>Trace</var> statement token. | ||
That is, the following is '''''incorrect''''': | That is, the following is '''''incorrect''''': | ||
< | <p class="code">print (fieldname):length | ||
</p> | |||
</ | |||
Coupled with the fact that applying intrinsic methods to fields | Coupled with the fact that applying intrinsic methods to fields | ||
generally requires use of parentheses, this means that you '''cannot''' display | generally requires use of parentheses, this means that you '''cannot''' display | ||
the result of an intrinsic method applied to a field with the Print, | the result of an intrinsic method applied to a field with the <var>Print</var>, | ||
Audit, and Trace methods. | <var>Audit</var>, and <var>Trace</var> methods. | ||
</ul> | </ul> | ||
Fortunately | Fortunately, the newer '''PrintText''', '''AuditText''', | ||
and '''TraceText''' statements (see [[Targeted Text statements]]) are direct analogs of <var>Print</var>, <var>Audit</var>, and <var>Trace</var>, respectively, but they use | |||
and '''TraceText''' statements ([[Targeted Text statements]]) | |||
a more consistent syntax. | a more consistent syntax. | ||
Specifically, the newer statements treat everything as literal text, except | Specifically, the newer statements treat everything as literal text, except | ||
for that which is enclosed | for that which is enclosed within the expression start and end characters (which default to curly braces: <code>{...}</code>), | ||
which is treated as a standard | which is treated as a standard <var class="product">SOUL</var> expression. | ||
This means that the syntax used for variable parts of PrintText, AuditText, | This means that the syntax used for variable parts of <var>PrintText</var>, <var>AuditText</var>, | ||
and Tracetext statements is identical to the syntax allowed on the right | and <var>Tracetext</var> statements is identical to the syntax allowed on the right | ||
side of a variable assignment. | side of a variable assignment. | ||
So, the following is valid: | So, the following is valid: | ||
< | <p class="code">printText {3*4} | ||
</p> | |||
</ | |||
And this is valid: | And this is valid: | ||
< | <p class="code">printText {%x :length} | ||
</p> | |||
</ | |||
These are valid statements: | These are valid statements: | ||
< | <p class="code">printText {'Foobar':length} | ||
printText {22:squareRoot} | |||
</p> | |||
</ | |||
And this is valid: | And this is valid: | ||
< | <p class="code">printText {(fieldname):length} | ||
</p> | |||
</ | |||
The [[Text and Html statements|Text | The <var>[[Text and Html statements|Text]]</var> statement | ||
also provides functionality comparable to the PrintText, | also provides functionality comparable to the <var>PrintText</var>, | ||
AuditText, and TraceText statements, and it is especially useful for displaying | <var>AuditText</var>, and <var>TraceText</var> statements, and it is especially useful for displaying | ||
multiple lines of data. | multiple lines of data. | ||
So it is recommended that you discontinue the use of the Print, Audit, and | So it is recommended that you discontinue the use of the <var>Print</var>, <var>Audit</var>, and | ||
Trace statements in favor of PrintText, AuditText, and TraceText | <var>Trace</var> statements in favor of <var>PrintText</var>, <var>AuditText</var>, and <var>TraceText</var>. | ||
==See also== | |||
<ul> | |||
<li>[[Object oriented programming in SOUL]] | |||
</ul> | |||
[[Category:System classes]] | [[Category:System classes]] | ||
[[Category:Overviews]] | [[Category:Overviews]] | ||
[[Category:SOUL object-oriented programming topics]] |
Latest revision as of 21:27, 13 April 2016
You may want to go directly to:
Definition of Intrinsic classes
In “pure” object-oriented languages, all datatypes are objects, that is, all datatypes are extension classes of some base object class. This means that even things that one might not immediately think of as objects, such as character strings or numbers, are, in fact, considered to be objects.
There are two aspects to the assertion that strings or numbers are objects:
- They are internally managed in exactly the same way as any other objects. That is, a string or numeric variable is actually a reference to an object that contains the string or numeric value, rather than itself directly containing the string or numeric value.
- They are syntactically identical to other objects. That is, the syntax for manipulating strings or numbers is the same as the syntax for manipulating other objects. More specifically, methods are applied to strings or numbers using the exact same syntax as when methods are applied to other objects.
Since the first aspect is concerned with the internal management of strings or numbers, it is largely irrelevant from the perspective of a programmer using a “pure” object-oriented language. In fact, many languages that profess to be “pure” actually “cheat” internally and special-case string and numeric data. They do so because string and numeric data is so important, and so heavily used in all programming languages, that insisting on not treating it specially (internally) is pedantry that has a cost in performance. To see why, consider a statement that adds two numbers together (the language for the example is, of course, SOUL):
%x is float
%y is float
...
%x = %y + 13
Insisting that numbers are no different from any other object means that this statement must get the value referenced by %y, add it to the number referenced by the literal 13, create a new Float instance, and set %x to reference that new instance. Clearly, this would be less efficient than taking the number in %y, adding 13 to it, and setting %x to the result. Fortunately, the way this is actually done is purely an under-the-covers implementation issue, so one can always pretend that “pure” object-oriented processing is being used.
This is so because numbers and strings are an example of a special class of objects
called immutable objects. Immutable objects, as the name suggests, cannot be
changed once they are created.
To illustrate the difference between immutable objects and mutable objects, consider two variables:
%count is float
%account is object bankAccount
If %account were set to a new object instance, it would not be surprising if, after it's set, the object changed. For example, it wouldn't be surprising if the account balance for %account changed, even if the change was done via a different variable:
print %account:balance
%myAccount = %account
%myAccount:addToBalance(23)
print %account:balance
In this example, it is expected that the balance displayed in the second Print statement will be 23 greater than that displayed by the first.
On the other hand, given the following:
print %count
%myCount = %count
%myCount:addToValue(23)
print %count
It would be surprising for some operation on %myCount to have changed the value in %count. In general, for variables that contain numeric or string values, one would only expect the value to change when a new value is assigned to it. In fact, one would not expect to see methods like AddToValue that modify the method object for a numeric datatype.
More typical for a numeric datatype would be a method that manipulates the value and produces a new value (object):
%myCount = %myCount:addToValue(23)
Although even pure object-oriented languages have special syntax for the common operation of addition:
%myCount = %myCount + 23
The absence of methods that modify a value is what makes a class immutable, and in “pure” object-oriented languages, basic string and numeric datatypes are always immutable.
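The contrast between mutable and immutable objects can be sketched in Python, used here only as a stand-in for a “pure” object-oriented language and not as SOUL syntax. The BankAccount class and its add_to_balance method are hypothetical names chosen to mirror the bankAccount example above.

```python
class BankAccount:
    """Hypothetical mutable class mirroring the bankAccount example."""

    def __init__(self, balance):
        self.balance = balance

    def add_to_balance(self, amount):
        # Modifies the object in place, so every reference sees the change.
        self.balance += amount


account = BankAccount(100)
my_account = account            # second reference to the same object
my_account.add_to_balance(23)
balance_seen_via_account = account.balance  # changed through the alias

count = 100.0
my_count = count                # second reference to the same immutable float
my_count = my_count + 23        # builds a new value and rebinds my_count only
```

Because the float type offers no method that changes a value in place, the only way to alter my_count is to assign it a new value, and count is left untouched.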
Intrinsic methods in SOUL
SOUL is a legacy programming language for which object-oriented capabilities are a relatively recent addition. So, one might expect that strings and numbers are not maintained internally as objects, since their existence pre-dates objects. This is in fact the case, though as was shown, this is largely irrelevant from a SOUL programmer's perspective, and even if object-oriented capabilities were present in SOUL from its inception, strings and numbers might have been special-cased for efficiency anyway.
Float | See List of Float methods |
---|---|
String | See List of String methods |
Unicode | See List of Unicode methods |
The benefits of object-oriented syntax in SOUL
Beyond the internal representation of strings and numbers, a distinction between SOUL and pure object-oriented languages was that the object:method syntax was not used to manipulate strings and numbers; instead, strings and numbers were manipulated via $functions. This was changed to allow the object:method syntax against string and numeric variables and constants.
The following code fragment illustrates how the same operation can be accomplished with traditional $functions and with object-oriented syntax:
%name is string len 32
...
print $len(%name)
print %name:length
While this might not seem very significant, it provides considerable value:
- It allows SOUL to be used as a “pure” object-oriented language. This might be especially appealing to programmers who were trained in a pure object-oriented language.
- It provides the benefit that expressions can generally be read in the natural left-to-right manner rather than the inside-to-outside manner required to understand an expression coded with $functions.
- It provides capabilities to methods that operate on strings or numbers that are not available with $functions. These include support for named parameters and the ability to take objects as input parameters and to produce objects as results. While it would have been possible to extend $functions to have this same functionality (much as Sirius Mods provided callable $function support), it makes more sense to provide it using true object-oriented syntax. Given that this has now been done, it is unlikely that these capabilities will ever be added to $functions.
Two generic intrinsic classes: string and numeric
SOUL traditionally allowed declarations of many different datatypes. These included Strings of a specified length with a possible DP value, Fixed numerics (also with a possible DP), and Float numerics. In addition, Sirius Mods provided support for Longstring datatypes. And images provide support for a variety of additional datatypes, including packed and zoned formats.
Essentially, however, there are two categories of datatypes in SOUL: string and numeric. In SOUL one can easily assign values from one datatype to another, even between numeric and string types. This is intuitive and useful. Typically, in the “real world” one doesn't distinguish between the string representation of a number and its numeric value used for calculation. Similarly, except for performance purposes and, perhaps, value limits, one is typically not concerned about how a value is stored internally.
This is not the case in some “pure” object-oriented languages where the paradigm of strong-datatyping is extended to numeric and string datatypes. This means that if one wants to display the value of a numeric variable, one must explicitly convert it to a string (since what one displays is a string). Similarly, if one wishes to do a calculation using a value in a string (perhaps read from an external data source), one must explicitly convert the string to a number. It is asserted here that strong-datatyping for strings and numbers is a mistake, allowing some vision of purity to prevent programmers from doing something completely natural (treating numbers as strings, and strings as numbers) and something that is done hundreds of times in any program of any size. This view is reinforced by the fact that the general industry trend is away from strong-datatyping for strings and numbers and toward implicit conversion between these types.
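As an illustration of the explicit conversions described above, here is a minimal sketch in Python, chosen only as one example of a language that applies strong-datatyping to strings and numbers. The variable names are hypothetical.

```python
count = 42
label = "Count: " + str(count)  # the number must be converted to a string

raw = "19"                      # for example, a value read from an external source
total = int(raw) + 1            # the string must be converted to a number

try:
    "Count: " + count           # mixing the two types without a conversion
    mixed_ok = True
except TypeError:
    mixed_ok = False            # the language rejects the implicit conversion
```

Under the loose-datatyping between strings and numbers described in this section, neither explicit conversion would be needed.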
It is worth pointing out, however, that loose-datatyping between strings and numbers does not imply loose-datatyping between strings and numbers and other classes. That is, there are generally no implicit conversions of strings or numbers to or from non-string, non-numeric objects.
Because loose-datatyping and the existing SOUL datatypes work quite well in facilitating rapid development of reasonably tight and efficient code, support for methods against string and numeric types does not add any new datatypes to SOUL. And, because of loose-datatyping, the numeric and string methods can be applied to any of the standard SOUL string and numeric datatypes. Because they apply to these intrinsic SOUL datatypes, these methods are called intrinsic methods.
Intrinsic methods can be further classified into these subsets:
Float | Methods that perform numeric manipulation on values in Model 204's Float format. |
---|---|
Fixed | Methods that perform numeric manipulation on values in Model 204's Fixed format (with an assumed DP of 0). |
String | Methods that perform string manipulation, with Longstring capability assumed. |
Unicode | Methods that perform Unicode string manipulation, return Unicode results, or are based on the Unicode tables. |
The Float intrinsic class can really be thought of as a generic numeric class, as it is a convenient way of representing both integer and decimal data. This is different from most other programming languages where a Float datatype suffers from inconvenient behavior, especially for decimal numbers.
The Float class is so convenient that all numeric parameters to system methods and $functions are treated as Float parameters. Also, because of the convenience of the Float class, there are currently no methods in the Fixed class. As such, the intrinsic methods can be thought of broadly as belonging to two intrinsic classes: the Numeric class and the String class. These classes behave largely as if their inputs were SOUL Float and Longstring datatypes, respectively. That is, for all intents and purposes, the following are true:
- Any non-Float values, including Unicode, are converted to Float values before being processed by a Numeric method. This includes the conversion of any non-numeric strings into 0 that occurs in other SOUL contexts that take a numeric input.
- For the purposes of truncation and the With operation, all String method string inputs or outputs behave as Longstrings.
Strings and numbers as method objects
The term "method object" (or "intrinsic method object") is used for the value or variable to which an intrinsic method is applied, even though the value or variable isn't really an object. This is justified because, as noted above, strings and numbers can be considered immutable objects, regardless of their history or internal representation.
For example, in the following case, %str is the intrinsic method object for the Length method:
%str is string len 32
...
print %str:Length
Intrinsic methods can be applied to constants, in addition to variables. For example, the following assigns the length of the string literal Whatever to %x:
%x = 'Whatever':length
An intrinsic method can take the output from another method, intrinsic or not, as its input. For example, the following uses the Right method (which gets the rightmost characters of a string) against the output of a Stringlist Item method:
%list is object stringlist
...
%value = %list:item(%i):right(8)
Since it is unnecessary to explicitly specify the Item method for a Stringlist, the above can also be written as:
%list is object stringlist
...
%value = %list(%i):right(8)
Intrinsic methods can also be applied to SOUL Image and Screen items:
image michigan
romulus is string len 32
...
end image
...
%value = %michigan:romulus:right(8)
And intrinsic methods can be applied to the output from a $function:
%seconds = $time:right(2)
Intrinsic methods can even be applied to the results of expressions:
...
%value = (%tweedledum with %tweedledee):right(8)
Automatic "implicit" conversion of intrinsic values
As with other methods, the colon that separates the intrinsic method object from the method name can be optionally preceded or followed by spaces. This could be done to enhance readability, or even to split a long line. The following four statements are all equivalent:
%value = %list:item(%i):right(8)
%value = %list: item(%i): right(8)
%value = %list :item(%i) :right(8)
%value = %list : item(%i) : -
   right(8)
As with most other uses of intrinsic variables or values, if the method object is of a different type than the datatype on which the method operates, the input value is automatically converted into the target datatype. For example, the expression 7/3 clearly produces a numeric value, but the Left method operates on strings. So, in the following statement:
%i = (7/3):left(4)
the result of the division is converted into a string and then passed to Left, which produces 2.33 as its result.
Because of this automatic conversion, the specific class (String, Float, Unicode) of an intrinsic method cannot be determined from the datatype of its method object. This means that the class must be determined only from the name of the method, which means that method names must be unique among all intrinsic classes (that is, among String, Float, and Unicode). For example, there may not be a Length method in both the String and Float intrinsic classes. In the case of Length, the method is an intrinsic String method, there is no comparable Float method, and UnicodeLength is the comparable Unicode method.
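The name-based lookup and automatic conversion described above can be modeled with a small Python sketch. This is an illustrative model under stated assumptions, not how SOUL is implemented: the INTRINSIC_METHODS registry and the apply_intrinsic helper are hypothetical, and Python's str and float conversions only approximate the SOUL conversion rules.

```python
import math

# Hypothetical registry keyed by method name alone. A dictionary allows only
# one entry per key, which mirrors the rule that a method name must be unique
# across the String, Float, and Unicode intrinsic classes.
INTRINSIC_METHODS = {
    "length": ("String", str, len),
    "left": ("String", str, lambda s, n: s[:n]),
    "squareroot": ("Float", float, math.sqrt),
}


def apply_intrinsic(value, name, *args):
    """Pick the class from the method name, convert the input, then apply."""
    _cls, convert, implementation = INTRINSIC_METHODS[name.lower()]
    return implementation(convert(value), *args)


left_of_quotient = apply_intrinsic(7 / 3, "left", 4)  # number converted to string
root_of_string = apply_intrinsic("16", "squareRoot")  # string converted to number
digits_in_number = apply_intrinsic(12345, "Length")   # number converted to string
```

The datatype of the method object plays no part in the lookup; only the method name selects the class, and the input is then converted to whatever that class expects.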
Note: Because some EBCDIC characters are not translatable to Unicode, and vice versa, automatic conversions involving Unicode values may cause request cancellation. The explicit intrinsic conversion methods (like EbcdicToUnicode and EbcdicToAscii and their counterparts UnicodeToEbcdic and AsciiToEbcdic), however, have a parameter (as of Sirius Mods version 7.6) that lets you decode or encode untranslatable characters.
Intrinsic method syntax: special cases
SOUL is a legacy programming language that supports some unusual syntax. Some of this syntax causes problems for the use of intrinsic methods, as is described for the following three cases:
Intrinsic methods against database field names
Field names have very loose naming rules, and names can contain colons and spaces. Because of this, a field name must be contained inside of parentheses to be used as a method object. For example, the field big fat greek field is used as the input to the intrinsic Substring method:
for each record in %recordset
...
%value = (big fat greek field):substring(2, 3)
...
end for
Intrinsic methods against percent variables and images that have the same name
SOUL allows one to declare images and percent variables with the same name in the same scope. That is, inside a method, you can declare an Image called holland and a %variable called %holland. References to items in the image and to the percent variable both start with %holland:
image holland
...
factory is float
...
end image
...
%holland is string len 10
...
%a = %holland:factory
...
%a = %holland
Historically, this has not been a problem because a percent variable
would never be followed by a colon.
However, with intrinsic methods, this is no longer the case.
If the Holland image contained an item called Length, it would be impossible to tell whether %holland:length referred to the Length item in the Holland Image or the intrinsic Length method applied to %holland.
Because of this ambiguity, intrinsic methods cannot be used against percent variables without a blank between the variable name and the colon if there is an Image with the same name as the percent variable. This is true regardless of whether or not there is an actual conflict between the method name and an image item name.
So, using the above example, to apply the Length method to %holland, you must do one of the following:
%a = %holland :length
%a = %holland : length
Specifying %holland: length would not work. A space after the variable name indicates that the reference is not to the Image, because Image item references do not allow any blanks between the Image name, colon, and Image item name.
You can also simply wrap the variable name in parentheses:
%a = (%holland):length
Of course, the best solution is to avoid using the same name for images and percent variables, as this is generally somewhat confusing, anyway.
Intrinsic methods in a Print, Audit, or Trace statement
The SOUL Print, Audit, and Trace statements all use somewhat unusual syntax. While initially these statements appear to operate on expressions, just like an assignment, this is not really the case. For example, the following statement gets a compilation error:
print 3*4
Because of this, one has to be careful using intrinsic methods in a Print, Audit, or Trace statement. The general recommendation is to use PrintText, AuditText, or TraceText, as described below.
There are several specific syntax problems with Print, Audit, and Trace:
- Because blanks are treated specially in these statements, you may not put a blank before the colon in an intrinsic method invocation. That is, the following is incorrect:
  print %x :length
  But the following two statements are allowed:
  print %x:length
  print %x: length
- String and literal constants are treated specially by these statements, so you cannot issue methods against constants in a Print, Audit, or Trace statement. That is, the following are incorrect:
  print 'Foobar':length
  print 22:squareRoot
- Although you might be tempted to use parentheses to get around some of these issues, leading parentheses are not allowed in a Print, Audit, or Trace statement token. That is, the following is incorrect:
  print (fieldname):length
  Coupled with the fact that applying intrinsic methods to fields generally requires use of parentheses, this means that you cannot display the result of an intrinsic method applied to a field with the Print, Audit, and Trace statements.
Fortunately, the newer PrintText, AuditText, and TraceText statements (see Targeted Text statements) are direct analogs of Print, Audit, and Trace, respectively, but they use a more consistent syntax. Specifically, the newer statements treat everything as literal text, except for that which is enclosed within the expression start and end characters (which default to curly braces: {...}), which is treated as a standard SOUL expression. This means that the syntax used for variable parts of PrintText, AuditText, and TraceText statements is identical to the syntax allowed on the right side of a variable assignment.
So, the following is valid:
printText {3*4}
And this is valid:
printText {%x :length}
These are valid statements:
printText {'Foobar':length}
printText {22:squareRoot}
And this is valid:
printText {(fieldname):length}
The Text statement also provides functionality comparable to the PrintText, AuditText, and TraceText statements, and it is especially useful for displaying multiple lines of data.
So it is recommended that you discontinue the use of the Print, Audit, and Trace statements in favor of PrintText, AuditText, and TraceText.