Getting started with OOP for User Language programmers
[[Category:Tutorial]]
==Background==
So, you're a <var class="product">[[User Language]]</var> programmer and you're thinking about learning object-oriented programming (OOP):
<ul>
<li>Rumor has it you can be a more effective programmer if you use object-oriented techniques.
<li>You're tired of feeling inferior to object-oriented programmers because they speak a language that you don't understand but which sure as heck sounds impressive.
<li>New <var class="product">User Language/SOUL</var> functionality will use object-oriented syntax.
<li>"Object-oriented" looks better on your resume than "User Language."
<li>You just want to learn something new.
</ul>
Unfortunately, most <var class="product">User Language</var> programmers' first experience with object-oriented programming is painful and bewildering. Often it comes in the form of a VB.Net or Java class where the terminology flows freely from day one. Even worse:
<ul>
<li>Classes often emphasize how one architects an object-oriented application. While this might be a logical way to build an application, it's a daunting way to learn a language, like trying to learn ballroom dancing before you know how to walk.
<li>Teachers (and books about object-oriented programming) are often enamored with the more sophisticated aspects of object-oriented programming languages, leaving in the dust novices still struggling to digest the simpler concepts.
<li>Many object-oriented concepts are interrelated, so it often requires plowing ahead without fully understanding the concepts one has already learned.
<li>Traditionally, there was no way to put the concepts learned in a Java or VB.Net class to use in <var class="product">User Language</var>, so one was forced to work on object-oriented programming in one's free time, or the concepts were quickly forgotten.
</ul>
Fortunately, there <strong>is</strong> a way you can learn object-oriented programming and apply the principles on-the-job from day one!
One requirement for this is to work at a site where <var class="product">[[SOUL]]</var> (thus <var class="product">Model 204</var> V7.5 or higher) is available (or where at least some of the other Janus products, such as <var class="product">Janus Web Server</var> or <var class="product">Janus Sockets</var>, are available).
Before going further, a word about terminology: In English-speaking (as opposed to American-speaking) countries, object-oriented is usually called "object-orientated." More commonly, object-oriented programming is just called "OO."
==Mixed case code==
So, let's get started. First, if you're going to do OO programming, you've got to write your code in mixed case. No, there is no technical reason OO code in <var class="product">User Language</var> must be in mixed case, but:
<ul>
<li>It is easy enough to do, and even if you don't use OO, it makes your code look more modern.
<li>The industry consensus is that descriptive, often compound, words are better for variable, function, and subroutine names than terse non-descriptive names. For example, <code>%itemNumber</code> is better than <code>%ITMN</code>. While the latter is easier to type, the former is much easier to read. Using <code>%ITEMNUMBER</code>, on the other hand, clearly blunts some of that readability benefit. <code>%itemNumber</code> is written in what's called [http://en.wikipedia.org/wiki/CamelCase CamelCase].
<li>OO SOUL depends on CamelCase to make function and subroutine names readable.
</ul>
Fortunately, it's easy to start using mixed case code. Simply start typing your code in mixed case. What can possibly go wrong? If your system manager has set everything up nicely for you, nothing. But, if not, you might have a few glitches that are easy enough to fix:
<ul>
<li>If you use the Model 204 editor and type in mixed case code, and it gets converted to upper case when you enter it, you have two options:
<ul>
<li>Set *LOWER at command level. But this has the drawback that Model 204 commands now require holding down the shift key, because mixed case Model 204 commands do not work.
<li>Set the SIREDIT user parameter to X'33' (only the X'01' bit is required, but you may as well set some others) before entering the editor. This causes Model 204 to essentially switch to *LOWER mode before entering the editor. Request that your system manager do this for everyone by setting SIREDIT X'33' in CCAIN.
</ul>
<li>If you get compilation errors when you type in mixed case <var class="product">SOUL</var>, it means that the compiler is not running in case-independent mode. To fix this, you have these options:
<ul>
<li>Have your system manager set the X'01' bit in the <var>[[COMPOPT parameter|COMPOPT]]</var> system parameter. This must be done in CCAIN, and it is probably the best option.
<li>Change the BEGIN or B statement at the start of the request you're working on to use mixed case: that is, <code>Begin</code>, <code>begin</code>, or <code>b</code>.
<li>If you can't change the start of the program, add the line <code>Sirius Case ToUpper</code> (the case of the words doesn't matter) to the start of the procedure you are working on.
</ul>
</ul>
The following is an example of a mixed case <var class="product">User Language</var> program that starts with a mixed case <var>Begin</var>:
<p class="code">begin
   print 'Hello World!'
end </p>
The following is an example of an <var>Include</var>d procedure that contains a <code>Sirius Case ToUpper</code> directive:
<p class="code">sirius case toUpper
subroutine hello
   print 'Hello World!'
end subroutine </p>
Note that mixed case <var class="product">User Language</var> support is case independent. You can write <var class="product">User Language</var> statements in any case, and you can specify Model 204 variables with any case. The following illustrates a little island of mixed case code in the middle of some uppercase code:
<p class="code">SUBROUTINE FOOBAR(%INPUT IS FLOAT)
%MSG IS STRING LEN 32
%TOTAL IS FLOAT
 ...
if %input gt %total then
   %msg = 'Input value too big'
end if
 ...
PRINT %MSG
END SUBROUTINE
</p>
This code illustrates the fact that you don't have to convert an entire procedure (or request) to mixed case to take advantage of mixed case <var class="product">SOUL</var>. Obviously though, in the long term, it is a good goal to aim for relatively consistent casing in all your code. In the short term, however, some inconsistency will have to be tolerated to get to the point where most or all <var class="product">SOUL</var> code is in mixed case. Certainly, any new procedures should be written completely in mixed case.
This example also demonstrates that you don't have to learn anything new to enter <var class="product">User Language</var> in mixed case (<strong>all</strong> statements still work the same way), so there is no excuse not to start.
==Object-oriented syntax==
The most pernicious difference between OO languages and procedural languages such as <var class="product">User Language</var> (called "UL" from here on) is the syntax. And the biggest syntactic difference between OO and UL is how functions and subroutines are invoked. Let's start with functions. All User Language programmers know how to invoke a function.
First, functions are called $functions (dollar-functions), or £functions (pound-functions) in the UK. $functions (except a very few) <strong>always</strong> return a value, so they must be on the right side of an assignment, input to a subroutine or other $function call, or inside some User Language expression. The following example has <var>[[$Substr]]</var> in all three contexts:
<p class="code">%x = $substr(%y, 3, 10)
call clever($substr(%y, %start, %len))
%z = $substr(%y, %len, 1) + 10
</p>
As the above example shows, and all User Language programmers know, a $function can be followed by the $function arguments (inputs) in parentheses with multiple arguments separated by commas.
OO functions, on the other hand, use a syntax where a function invocation consists of the thing (object, if you will) that the function is operating on specified <strong>before</strong> the function name, followed by its arguments inside parentheses. Many $functions have OO equivalents and <var>$Substr</var> is no exception: its OO equivalent is called <var>[[Substring (String function)|Substring]]</var>. The following illustrates the use of the OO <var>Substring</var> function by replacing "$substr" in the previous example with "substring":
<p class="code">%x = %y:substring(3, 10)
call clever(%y:substring(%start, %len))
%z = %y:substring(%len, 1) + 10
</p>
While this might look strange to a <var class="product">User Language</var> programmer, it uses the most common OO syntax, so it can honestly be called OO programming. So, if you find any <code>$substr</code> call in your system and change it to use <var>Substring</var> (moving the first argument before a <code>:substring</code>), you have now done some OO coding. It's that easy!
To further add to your bona fides as an object-oriented programmer, don't call <var>Substring</var> a function, but call it a '''method'''. In addition, don't call <code>%y</code> just a string, call it a '''string object'''. Now you're ready for something more advanced. Say "I applied the <var>Substring</var> method to the <var>String</var> object". Repeat until it feels natural to say it. Congratulations, you're well on your way to becoming an OO programmer and maybe even a guru. If you're curious about what you just said:
<ul>
<li>Method is just a fancy word that means function or subroutine or any other called code that does something. </li>
<li>Object is just a fancy word for a "thingy." </li>
</ul>
Now, of course, there are many functions available other than just <var>Substring</var>. The functions that operate on strings are called '''Intrinsic String methods'''. You can find a list of the methods at [[List of String methods]]. There are also functions that operate on numbers. The list of these can be found at [[List of Float methods]].
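As a further illustration, here is a minimal sketch that applies a few other intrinsic String methods in the same way. It assumes the <var>Length</var>, <var>ToLower</var>, and <var>Left</var> methods behave as their names suggest; the variable name and value are made up for the example:
<p class="code">begin
* %name is a hypothetical variable used only for this illustration
%name is string len 32
%name = 'Richard III'
* Number of characters in the string
print %name:length
* Lowercase copy of the string
print %name:toLower
* Leftmost seven characters
print %name:left(7)
end </p>
If those assumptions hold, the three <code>Print</code> statements should show 11, "richard iii", and "Richard". In each case the string being operated on comes first, then a colon, then the method name.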
Now, while congratulating yourself on your new-found skills, you might have a gnawing feeling that you really haven't accomplished a lot, as much as you have learned. So what have you accomplished? Is OO syntax really better than what you're used to? At first blush, it would seem worse, as the traditional $function call is more like English, where the verb (function name) precedes the object (first parameter), as opposed to the OO syntax, where the order is reversed. For example, one would say "get the substring of %y" not "%y get the substring," so
<code>%x = $substr(%y, 2, 3)</code> <strong>seems</strong> more natural than
<code>%x = %y:substring(2, 3)</code>.
While this might be true, it's easy enough to get used to the second form; in many languages the object comes before the verb.
In any case, the chief advantage of OO syntax is that because the input of a function is to the left of the function, one can invoke a function and then pass the result of that function to another function by placing the second function call after the first. Similarly, a third function can be placed to the right of the second to process its output. To illustrate, consider the following:
<p class="code">%x = %y:substring(%start, %len):toUpper:unspace </p>
This reads rather nicely, left to right: take "%y", get a substring, convert to uppercase, and remove extra blanks. The traditional version doesn't read nearly as nicely:
<p class="code">%x = $unspace($upcase($substr(%y, %start, %len)))</p>
Since the processing happens in inside-out order, presumably one should read this code inside out but, of course, this is very difficult. In addition, because only a colon is required as a separator character, and because no dollar-sign is required to distinguish a function from other entities, the OO version is shorter, in spite of the fact that the function names are somewhat longer, and so, more meaningful. Perhaps more important, the OO expression contains fewer "noise" characters. All of these lead to much better readability for the OO expression. And, just as spaces can be placed around parentheses in a $function invocation to improve readability, spaces can be placed around the colon used to separate the object and function name:
<p class="code">%x = %y : substring( %start, %len ): toUpper :unspace </p>
This example used inconsistent spacing just to illustrate what's allowed, not to suggest that inconsistent spacing is recommended (it's not).
So, to continue the process of becoming an object-oriented <var class="product">User Language</var> programmer, you should familiarize yourself with the list of intrinsic String and Float methods, and try to use them wherever possible and in lieu of the $function equivalents. There is no performance penalty for doing so; in some cases OO functions and $functions share the same underlying code.
===Named parameters===
The SOUL object-oriented implementation has support for named parameters, parameters that can be specified by name, rather than position. While, strictly speaking, this has nothing to do with OO programming (in fact, many OO languages, Java among them, do not support it), named parameters are used in many SOUL methods, so it is important to understand them. To specify a value for a named parameter, simply specify the name, followed by an equals sign (<tt>=</tt>), followed by the value. For example, the <var>Right</var> and <var>Left</var> String functions have a <var>Pad</var> named parameter that indicates the pad character to be used if the input string is shorter than the requested length:
<p class="code">%x = %y:right(20, pad='0') </p>
In this case, the name simply makes the code clearer, since without the name, as in the following, it's far less obvious what the second parameter means:
<p class="code">%x = %y:right(20, '0') </p>
In functions with large numbers of parameters, the named parameters can also be very useful for eliminating the need for placeholder commas, and for making the function invocations more readable. String and Float methods tend not to have a lot of parameters so named parameters are not heavily used for them, but they <strong>are</strong> used here and there, so it is important to understand them.
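The following small sketch shows the <var>Pad</var> named parameter on both <var>Right</var> and <var>Left</var>. The variable name and values are invented for the illustration, and the results described afterward assume that <var>Right</var> pads on the left and <var>Left</var> pads on the right:
<p class="code">begin
* %code is a hypothetical variable used only for this illustration
%code is string len 10
%code = '42'
* Right-justify in 6 characters, filling with zeros
print %code:right(6, pad='0')
* Left-justify in 6 characters, filling with periods
print %code:left(6, pad='.')
end </p>
Under those assumptions, the first <code>Print</code> should show "000042" and the second "42....". The name <code>pad</code> makes it plain that the second argument is a fill character and not, say, another length.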
==Stringlists==
The object-oriented extensions to <var class="product">User Language</var> use [[$lists]]. Most sites that <i>can</i> use $lists, <i>do</i> use $lists, because many [[List of $functions|$functions]] require them as input or outputs, and because they are just generally useful.
$lists are essentially objects because an object is a container for information that is accessed via a reference variable, and multiple reference variables can refer to the same object ($list). For example:
<p class="code">%list is float
%list2 is float
 ...
%list = $listNew
%list2 = %list
$listAdd(%list, 'Now is the winter')
print $listInf(%list2, 1)
</p>
In this example, the <code>Print</code> statement would end up printing "Now is the winter" even though the <code>$listAdd</code> added the line to <code>%list</code>.
Both <code>%list</code> and <code>%list2</code> point to the same $list (object) and so, it is not surprising that what is added via <code>%list</code> can be seen via <code>%list2</code>. It is clear, too, that <code>%list</code> and <code>%list2</code> must be '''pointers''' to $list objects and cannot be the objects themselves, since a <var>Float</var> value couldn't possibly hold the contents of a $list.
An experienced UL programmer might raise an eyebrow at the line that contains the <code>$listAdd</code> in the above example. <var>$listAdd</var> is a '''function''', so it returns a value. But the <code>$listAdd</code> is not on the right side of an assignment. What gives? Well, the result of many $functions is usually "uninteresting" to the User Language code, so it seems silly to have to find a variable to which to assign it. So SOUL allows <strong>certain</strong> $functions to be invoked without assigning their results to anything. This causes the $function to behave more like a subroutine, since subroutines are called without obtaining a return value. It is this blurring of the distinction between functions and subroutines that leads to the OO term '''method''', which is a convenient way of referring to either.
So when you were using $lists, you were using object-oriented programming capabilities, even though you might not have realized it. However, you were <strong>not</strong> using object-oriented syntax to access the $list objects, so the benefits of a pure object-oriented facility were not available to you. Now, you might just say "Hold on, didn't you say strings are also objects?" Indeed they are. However, they're objects that can't be changed once they're set (so-called '''immutable''' objects): you can't really change a string that's been assigned to a variable, you can only assign a new string to that variable. Most objects are not immutable, though of course, strings and numbers are <i>heavily</i> used, so they have importance disproportionate to their number. Nevertheless, the thing that distinguishes OO languages from non-OO languages is the presence of changeable (non-immutable) objects, so $lists can be viewed as a step toward object-oriented programming.
The pure object-oriented equivalents of $lists are called '''Stringlists'''. And using a <var>Stringlist</var> is very similar to using a $list. To illustrate, let's take the above example and convert it to using <var>Stringlists</var>:
<p class="code">%list is object stringlist
%list2 is object stringlist
 ...
%list = new
%list2 = %list
%list:add('Now is the winter')
print %list2:item(1) </p>
Your first reaction might be an understandable "So what?" Well, let's take a look at what we've accomplished. First, we've made it much clearer what's in <code>%list</code> and <code>%list2</code>: references to <var>Stringlists</var>. This prevents accidents such as someone accidentally incrementing the value of a <code>%list</code>; the following would <strong>not</strong> be allowed:
<p class="code">%list is object stringlist
 ...
%list = %list + 1 <span style="color:red;"><==Invalid</span> </p>
Nor would the following:
<p class="code">%list is object stringlist
%list2 is float
 ...
%list = %list2 <span style="color:red;"><==Invalid</span> </p>
While this might not seem that useful, it can be much more useful in complex applications with lots of different kinds of objects: except for strings and numbers, only object variables of the same "class" can be assigned to each other. "Wait a minute, what's a class?" you say. A '''class''' is simply a name for objects with identical characteristics. In this case, all <var>Stringlists</var> have a similar structure and have a fixed set of things you can do with them (like $lists). The distinction between a class and individual objects is the same as in normal usage: my blue Subaru is an object in the class of cars.
In any case, beyond the protection against accidental assignment mismatches, what else has using an OO <var>Stringlist</var> instead of a traditional $list bought us? Well, compare:
<p class="code">$listAdd(%list, 'Now is the winter') </p>
with
<p class="code">%list:add('Now is the winter') </p>
In the former, when the UL compiler hits the <code>$listAdd</code>, it has no idea that it's working with a $list. So, to distinguish <code>$listAdd</code> from other functions that might be doing an add of something to something, the word "list" has to be used as a qualifier in the $function name; it would not do to have the $function called simply <code>$add</code>. But, with the OO syntax, the compiler first hits the <code>%list</code> so now "knows" that whatever follows must be something specific to a <var>Stringlist</var>. So it is sufficient to use the function name <code>Add</code> because it can only mean the <var>Add</var> that applies to <var>Stringlists</var>, not to other classes. This means the following:
<ul>
<li>The OO statements tend to be shorter, because the extra qualifiers on the function names are unnecessary. This is often true even when the OO statements use longer, more meaningful function names than the non-OO equivalents.
<li>The absence of the extra qualifiers eliminates a lot of the "noise" in OO statements. Since $list names usually have the word "list" in them to keep them straight from other variables, and since all the $list functions start with "$list", code that manipulates $lists often turns into the actual functional code swimming in a sea of the word "list".
</ul>
Consider, for example, code that subsets, sorts, and then prints the contents of a $list:
<p class="code">$list_print($listSort($listSub(%list, 'foo'), '1,10,A')) </p>
Now, consider the OO equivalent:
<p class="code">%list:subset('foo'):sortNew('1,10,A'):print </p>
Not only is the OO version considerably shorter, it is much easier to follow; again, it can be read left to right.
In the $function version, our eyes glaze over at the sea of <code>$list</code>s and nested parentheses. While this is an extreme example, it can safely be said that code using OO syntax is almost always tidier and easier to read than the traditional non-OO syntax.
Unfortunately, unlike using the OO syntax for intrinsic (<var>String</var> and <var>Float</var>) data, there is no really good way to simply change lines of code here and there to gradually migrate from using the pseudo-object-orientation of $lists to the full-on "true" object-orientation of <var>Stringlists</var>. Fortunately, many uses of <var>Stringlists</var> are fairly localized, so in many cases, changing an application from using $lists to using <var>Stringlists</var> can be a matter of changing a few dozen lines of code or fewer. While this doesn't provide any game-changing improvements in functionality (there are few things one can do with <var>Stringlists</var> that one can't do with $lists), it will:
<ul>
<li>Make code tidier and easier to understand, especially should your organization hire younger programmers who are familiar with object-oriented programming.
<li>Familiarize you with object-oriented syntax and object behavior.
</ul>
On this latter point, it is worth mentioning that because <var>Stringlists</var> are true objects, they correct the flakiness that can occur with $lists when the same <var>$listNew</var> (or other $list-creating statement) is executed multiple times. Because the same <var>$listNew</var> statement always returns the same $list identifier (number), situations where a <var>$listNew</var> is executed multiple times can be buggy or confusing. Executing the same <var>New</var> operation for <var>Stringlists</var> multiple times, on the other hand, returns distinct <var>Stringlist</var> references and so eliminates the bugginess or confusion inherent in the $list equivalent.
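To make the difference concrete, here is a small sketch in which a single <var>New</var> statement runs twice inside a loop. The variable names are invented for the illustration, and the result described afterward assumes <var>New</var>, <var>Add</var>, and <var>Item</var> behave as described on this page:
<p class="code">begin
* %list, %saved, and %i are hypothetical names for this illustration
%list is object stringlist
%saved is object stringlist
%i is float
for %i from 1 to 2
   * The same New statement is executed on each pass
   %list = new
   %list:add('pass ' with %i)
   if %i eq 1 then
      * Keep a reference to the Stringlist created on the first pass
      %saved = %list
   end if
end for
print %saved:item(1)
print %list:item(1)
end </p>
Because each execution of <code>%list = new</code> should create a separate <var>Stringlist</var>, <code>%saved</code> is expected to still show "pass 1" while <code>%list</code> shows "pass 2".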
It is worth emphasizing that a <var>Stringlist</var> object variable does <strong>not</strong> contain the <var>Stringlist</var> object any more than a variable used to hold a $list identifier contains the $list itself. In both cases, the variable contains a '''reference''' to the underlying object. So assigning one <var>Stringlist</var> object variable to another does not create a copy of the object (it simply copies the reference to the underlying object), just as assigning a $list identifier from one variable to another does not copy the $list itself. There is, however, a <var>Copy</var> function for <var>Stringlists</var> that does create a copy:
<p class="code">%list is object stringlist
%list2 is object stringlist
 ...
%list2 = %list:copy </p>
The assignment of the references is much more efficient, of course, and in many cases, just what one wants, anyway.
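The contrast between copying a reference and copying the object can be sketched as follows. The variable names are made up for the example, and the counts mentioned afterward assume <var>Copy</var> returns an independent <var>Stringlist</var> and <var>Count</var> returns the number of items:
<p class="code">begin
* %list, %same, and %copy are hypothetical names for this illustration
%list is object stringlist
%same is object stringlist
%copy is object stringlist
%list = new
%list:add('one')
* %same refers to the very same Stringlist as %list
%same = %list
* %copy refers to a new Stringlist with the same contents
%copy = %list:copy
%list:add('two')
print %same:count
print %copy:count
end </p>
Under those assumptions, <code>%same:count</code> should be 2, because the second <code>Add</code> is visible through both references, while <code>%copy:count</code> should remain 1.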
<p class="note"><b>Note:</b> <var>Stringlists</var> are no more or less efficient than $lists; in fact, they share most of the same underlying code. </p>
Also, all <var>Stringlist</var> methods are <b>longstring capable</b>. That is, they accept and return <var>[[Longstrings|Longstring]]</var> values (strings longer than 255 bytes). For historic reasons, the $list API contains a mix of longstring-capable and non-longstring-capable functions. Many of the non-longstring-capable $functions have extra parameters to deal with the fact that, while $list items could be longer than 255 bytes, they could only be processed 255 bytes at a time. The <var>Stringlist</var> API, written well after longstrings became available, didn't need to deal with any of these issues. This means that the $list API is considerably more complicated than the <var>Stringlist</var> API, without providing <i>any</i> additional functionality. This is just another reason to use <var>Stringlists</var> in lieu of $lists.
So, in keeping with the theme of this page the recommendations are:
<ul>
<li>Find some existing $list code that is relatively localized, and convert it to use <var>Stringlist</var> objects.
<li>Write any new code using <var>Stringlists</var> rather than $lists.
</ul>
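As a rough idea of what such a conversion involves, here is a sketch of a small, localized piece of $list code, followed by a <var>Stringlist</var> rewrite. The variable names and data are invented for the illustration; the sketch assumes <var>$listCnt</var> returns the number of items in a $list and that <var>Count</var> does the same for a <var>Stringlist</var>:
<p class="code">begin
* Traditional $list version (hypothetical example)
%list is float
%i is float
%list = $listNew
$listAdd(%list, 'Now is the winter')
$listAdd(%list, 'of our discontent')
for %i from 1 to $listCnt(%list)
   print $listInf(%list, %i)
end for
end </p>
<p class="code">begin
* Stringlist version of the same logic
%list is object stringlist
%i is float
%list = new
%list:add('Now is the winter')
%list:add('of our discontent')
for %i from 1 to %list:count
   print %list:item(%i)
end for
end </p>
The shape of the code is unchanged: only the declaration, the creation of the list, and the individual calls differ, which is why localized $list code tends to be straightforward to convert.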
==Collections==
So far, we have seen how strings and numbers are simply objects, and can be manipulated using OO syntax without much effort. We've also seen that $lists were really objects, and that $list identifiers behaved very much like object variables, albeit [http://en.wikipedia.org/wiki/Kludge kludgily] because there was no formal object support in <var class="product">User Language</var>. As such, with a modicum of effort, an application that uses $lists could be modified to use <var>Stringlists</var>, tidying up the code without any loss of functionality. As our last exercise in learning object-oriented programming we now look at '''Collections'''.
Collections are essentially object-oriented replacements for traditional Model 204 arrays, but, as you'll see, with considerably more functionality.
Model 204 arrays, unlike $lists, are most certainly <strong>not</strong> objects in the OO sense, because <var>Array</var> variables are not references to array objects; they are the arrays themselves. There is no way to have two variables referencing the same array (except in the special case of arrays being passed as parameters), and arrays are created at compile-time, unlike objects (and $lists) which are created at run-time. In addition to making arrays non-OO, these restrictions can be problematic for applications:
<ul> | |||
<li>Application coders must determine ahead of time what the largest number of array items will be and allocate space for them, at compile time. This is both: | |||
<ul> | |||
<li>Wasteful of server space, as the maximum likely array size is often much bigger than the commonly required one. | |||
<li>Insufficient, as invariably any arbitrary limit can and will ultimately be exceeded. | |||
</ul> | |||
<li>The inability to have multiple references to the same array can make certain code clumsy and unintuitive. | |||
</ul> | |||
Collections provide a nice object-oriented alternative to arrays. Like arrays, collection variables are declared as being composed of variables of a specific type. For example, an array of numbers might be declared as: | |||
<p class="code">%numbers is float array(20) </p> | |||
A comparable collection declaration would be something like: | ||
<p class="code">%numbers is collection arraylist of float </p> | |||
The word <code>collection</code> here means exactly the same thing as "object," but it is used to distinguish collection objects, which have certain special characteristics (such as having an <var>Of</var> clause in the declaration) from other objects. Because collections are so common in OO applications, the keyword <code>collection</code> is actually optional, and the above declaration could be more simply written as: | |||
<p class="code">%numbers is arraylist of float </p> | ||
Probably, the first thing that jumps out at you is the fact that we don't have to declare (and, in fact, can't) the maximum number of items to be allowed in the <var>Arraylist</var>. This means that <var>Arraylists</var> will only use as much space as is needed for the data at hand, and they will not add a lot to server size requirements for the odd cases where an <var>Arraylist</var> happens to have a lot of items. In fact, collection items are stored in CCATEMP on an as-needed basis, with only the currently referenced item being stored in the server. So, collections have very small server footprints. | |||
One appealing aspect of using an <var>Arraylist</var> is that, by and large, references to <var>Arraylist</var> items are <strong>identical</strong> to references to <var>Array</var> items. That is, to set <code>%x</code> to the third item in either an <var>Array</var> or <var>Arraylist</var>, one can simply do: | |||
<p class="code">%x = %numbers(3) </p> | |||
And to set the <code>%i</code>'th item in either an <var>Array</var> or an <var>Arraylist</var>, one can do: | |||
<p class="code">%numbers(%i) = %x </p> | |||
So, all one has to do to switch from using an <var>Array</var> to an <var>Arraylist</var> is to simply change the variable declaration? Well, not quite. Suppose we had something like: | |||
<p class="code">%cost is float array(20) | |||
%ncost is float | |||
... | |||
o: for each occurrence of cost | |||
%ncost = %ncost + 1 | |||
%cost(%ncost) = value in o | |||
end for </p> | |||
Now, we change the above to: | ||
<p class="code">%cost is arraylist of float | ||
%ncost is float | ||
... | |||
o: for each occurrence of cost | |||
%ncost = %ncost + 1 | |||
%cost(%ncost) = value in o | |||
end for </p> | |||
The problem we hit immediately is that the <var>Array</var> version of the code never explicitly created the array, because the array is created at compile-time. <var>Arraylists</var>, on the other hand, being objects, need to be created at run-time (the OO term for this being [http://en.wikipedia.org/wiki/Instantiation_%28computer_science%29 instantiation]). This is identical to the requirement that a $list be created with a <var>$listNew</var>. Not having created an instance of the <var>Arraylist</var>, we will get a [http://en.wikipedia.org/wiki/Null_pointer#Null_pointer null pointer] error in our first reference to the object. We can fix this problem in one of two ways. One way is to simply create a new <var>Arraylist</var> instance before using it: | |||
<p class="code">%cost is arraylist of float | |||
%ncost is float | |||
... | |||
%cost = new | |||
o: for each occurrence of cost | |||
%ncost = %ncost + 1 | |||
%cost(%ncost) = value in o | |||
end for </p> | |||
But, it is also possible to tell the OO infrastructure to automatically create a new <var>Arraylist</var> the first time it is referenced: | |||
<p class="code">%cost is arraylist of float auto new | |||
%ncost is float | |||
... | |||
o: for each occurrence of cost | |||
%ncost = %ncost + 1 | |||
%cost(%ncost) = value in o | |||
end for </p> | |||
While in many instances this is a bad idea (since it can become unclear in the code where an object will be first created or where it's first intended to be created), experience has shown that, at least for collections, the use of <code>Auto New</code> usually works out fairly well. This is especially the case for applications where an <var>Array</var> was changed to an <var>Arraylist</var>. | |||
But we hit a second problem with the above code: an attempt to set item number <code>%ncost</code> in the <var>Arraylist</var> in the above example will almost certainly cause request cancellation, because that item won't be there yet. This is really no different from the <var>Array</var> code where, if you tried to add a 21st item to the array, you would get a request cancellation because the array was declared to have 20 items. However, because the array is created at compile-time with 20 entries, one <strong>can</strong> simply set items 1 through 20. When switching the code to use an <var>Arraylist</var>, the code will have to be changed somewhat: | |||
<p class="code">%cost is arraylist of float auto new | |||
%ncost is float | |||
... | |||
o: for each occurrence of cost | |||
%ncost = %ncost + 1 | |||
%cost:add(value in o) | |||
end for </p> | |||
While this makes it a bit more trouble to change from using an <var>Array</var> to an <var>Arraylist</var>, often the populating of the array is isolated to one place, so the change is fairly trivial. Even better, it can eliminate the need for a count variable like <code>%ncost</code> since the code could be simply written as: | |||
<p class="code">%cost is arraylist of float auto new | |||
... | |||
o: for each occurrence of cost | |||
%cost:add(value in o) | |||
end for </p> | |||
"Hold on" you might say. "I need <code>%ncost</code> later on": | |||
<p class="code">for %i from 1 to %ncost | |||
%x = %cost(%i) | |||
... | ||
end for </p> | |||
But, this is unnecessary. One nice thing about collections is you <strong>always</strong> have a <var>Count</var> function that returns the number of items in the collection so the above code could simply be written as: | |||
<p class="code">for %i from 1 to %cost:count | |||
%x = %cost(%i) | |||
... | ||
end for </p> | |||
In addition to eliminating the need for the <code>%ncost</code> variable, this eliminates the chance of a bug where <code>%ncost</code> doesn't accurately reflect the count of items on the <var>Arraylist</var>. Of course, if you have a <strong>lot</strong> of references to <code>%ncost</code>, you might prefer not to fix them all up, in spite of the benefits. In this case, you could simply leave the incrementing of <code>%ncost</code> in the loop that populates the <var>Arraylist</var> or, if you're compulsive about performance, set it after the loop: | |||
<p class="code">o: for each occurrence of cost | |||
%cost:add(value in o) | |||
end for | |||
%ncost = %cost:count </p> | |||
So, hopefully, it's clear that with minimal effort, an application can be changed from using an <var>Array</var> to an <var>Arraylist</var>. This would have the immediate benefit of reducing the server size required by the application while, at the same time, reducing any limit on the number of items one might put into the <var>Arraylist</var>. Even better, a wide variety of <var>Arraylist</var> functions become available once you've made this change. | |||
There are functions to make copies of <var>Arraylists</var>, to sort them, to search them, to insert or remove items, to find minima and maxima, and much more. In addition, you have the power of multiple references to the same <var>Arraylist</var>. You can even have <var>Arraylists</var> of <var>Arraylists</var>, if that's useful. | |||
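The difference between a second reference and a copy can be sketched as follows. This is only an illustration: the variable names <code>%alias</code> and <code>%snapshot</code> are made up, and the sketch assumes that the <var>Arraylist</var> <var>Copy</var> function returns an independent copy of the list.
<p class="code">begin
%cost is arraylist of float
%alias is arraylist of float
%snapshot is arraylist of float
%cost = new
%cost:add(12.5)
%cost:add(3)
* %alias becomes a second reference to the same Arraylist
%alias = %cost
* %snapshot is assumed to refer to a separate copy
%snapshot = %cost:copy
* adding via %alias also changes what %cost sees
%alias:add(7)
print %cost:count
print %snapshot:count
end </p>
Under those assumptions, the first <code>Print</code> should show 3 (the item added via <code>%alias</code> is visible through <code>%cost</code>), while the second should show 2, because the copy was taken before the third item was added.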
All that said, there might be cases where it's non-trivial to convert from using an <var>Array</var> to an <var>Arraylist</var>. Specifically, your application might depend on the fact that the <var>Array</var> is pre-populated with a specific number of items. For example, suppose in the above example, there was an array of 20 items that corresponded to 20 possible product numbers and total costs are calculated as: | ||
<p class="code">o: for each occurrence of product.no | |||
%cost(product.no) = %cost(product.no) + cost(occurrence in o) | ||
end for </p> | ||
Such an application does not lend itself to using the <var>Arraylist</var> <var>Add</var> method, since the product numbers won't necessarily arrive in numeric order. One approach would be to pre-populate an <var>Arraylist</var> with zeros: | |||
<p class="code">for %i from 1 to 20 | |||
%cost:add(0) | |||
end for </p> | ||
But this somehow seems a bit <i>ad hoc</i> and dissatisfying. Another approach for such an application is to use a different kind of collection, a '''FloatNamedArraylist'''. A <var>FloatNamedArraylist</var> uses a number to index the items, but: | |||
<ul> | ||
<li>The item numbers don't have to be added sequentially. | |||
<li>Negative and fractional indexes are allowed. | |||
</ul> | |||
Generally, the most useful aspect of <var>FloatNamedArraylists</var> is the first of these items (as it is in our example), but the second can be useful on occasion, too. Using a <var>FloatNamedArraylist</var>, we can now repair the above example: | |||
<p class="code">%cost is floatNamedArraylist of float auto new | |||
... | ||
o: for each occurrence of product.no | |||
%cost(product.no) = %cost(product.no) + cost(occurrence in o) | |||
end for </p> | |||
But, this still isn't quite right. While one can set a <var>FloatNamedArraylist</var> item whether or not it currently has an explicitly set value, by default you cannot retrieve the value of an item that was never set. Since the assignment statement above first retrieves the current value of the item being updated, a request-canceling error is guaranteed the first time it's executed. Fortunately, with the addition of a simple method call, the problem can be fixed: | |||
<p class="code">%cost is floatNamedArraylist of float auto new | |||
... | ||
%cost:useDefault = true | |||
o: for each occurrence of product.no | |||
%cost(product.no) = %cost(product.no) + cost(occurrence in o) | |||
end for </p> | |||
This setting of <code>UseDefault</code> to <code>True</code> means that a reference to an unset item number will return a default value which, for a <var>FloatNamedArraylist</var> is, of course, zero (though it could be set to something else, if needed). In fact, a <var>FloatNamedArraylist</var> provides a lower impact way of converting an <var>Array</var> to a collection than does an <var>Arraylist</var>. A <var>FloatNamedArraylist</var> with <code>UseDefault</code> set to <code>True</code> acts exactly like an <var>Array</var> with essentially no limit on the number of items and no cost for unused item numbers. So, if you had a <var>FloatNamedArraylist</var> with used item numbers 1, 500, and 9999, the <var>FloatNamedArraylist</var> would use no more space than if the used item numbers were 1, 2, and 3. This is what's known as a [http://en.wikipedia.org/wiki/Sparse_array sparse array]. Still, even though a <var>FloatNamedArraylist</var> provides a simpler migration from an <var>Array</var>, it is recommended that in the common case where the array/collection items are sequentially added, an <var>Arraylist</var> be used, as this provides a more natural way of representing the data, even though it might be a <strong>little</strong> more work. | |||
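The sparse behavior can be sketched with a few literal item numbers. This is a hedged illustration rather than code from a real application: the item numbers and amounts are arbitrary, and the sketch assumes that reading an unset item with <code>UseDefault</code> on returns the default without creating the item.
<p class="code">begin
%cost is floatNamedArraylist of float
%cost = new
%cost:useDefault = true
* each update reads the current value (0 if never set) and adds to it
%cost(1) = %cost(1) + 10
%cost(500) = %cost(500) + 2.5
%cost(9999) = %cost(9999) + 4
%cost(500) = %cost(500) + 1
* only three item numbers were ever set
print %cost:count
print %cost(500)
* item 42 was never set, so the default should be returned
print %cost(42)
end </p>
Under those assumptions, the request should print 3, then 3.5, then 0: only the item numbers that were actually assigned occupy space, no matter how far apart they are.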
Hopefully, you'll have noticed something unusual in the above code: while <code>UseDefault</code> <strong>looks</strong> like a function or subroutine invocation, it's actually being set. This is something called a '''property''', though it can also be thought of as a variable (much as images have variables that can be set). In any case, objects can have properties or variables with values that can be set or retrieved. The syntactic magic that allows the above <code>True</code> might also catch your attention. A UL programmer would guess that it's a field reference, but it's not; it's simply a boolean value. Hopefully, it seems natural enough that, for now, you're willing to accept and understand its meaning without fully understanding the syntax. | |||
As a last note about <var>FloatNamedArraylists</var>, it is possible to loop through only the items that have actually been set in the request. For example, the following prints the index and value for all set <var>FloatNamedArraylist</var> items in <code>%cost</code>: | |||
<p class="code">%cost is floatNamedArraylist of float | |||
... | |||
for %i from 1 to %cost:count | |||
print %cost:nameByNumber(%i) with ': ' with %cost:itemByNumber(%i) | |||
end for </p> | |||
The "name" of a <var>FloatNamedArraylist</var> item is the index, so if you had updated items 1, 73, and 9999, <code>NameByNumber</code> would return those numbers for input parameters 1, 2, and 3, respectively and, of course, <code>ItemByNumber</code> would return the value to which these items had been set. | |||
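A small self-contained sketch of this, using the item numbers mentioned above (the values 5, 8, and 3 are arbitrary, and the items are deliberately set out of order), might look like:
<p class="code">begin
%cost is floatNamedArraylist of float
%i is float
%cost = new
* items are set in no particular order
%cost(9999) = 3
%cost(1) = 5
%cost(73) = 8
for %i from 1 to %cost:count
   print %cost:nameByNumber(%i) with ': ' with %cost:itemByNumber(%i)
end for
end </p>
Assuming, as the text above describes, that items are numbered in order of their index, this should print the lines for 1, 73, and 9999 in that sequence, regardless of the order in which they were set.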
Hopefully, by this point, you see: | ||
<ul> | |||
<li>How easy it is to change from using arrays to collections. | |||
<li>The power that such a change provides. | |||
</ul> | |||
Collections are a big topic and there are many very useful subtopics that haven't been covered here: | |||
<ul> | |||
<li>The ability to have collections of any datatype such as <var>Strings</var>, <var>Longstrings</var>, or even objects such as <var>Stringlists</var> or other collections. | |||
<li><var>NamedArraylists</var> provide collections where the index is a string. They provide <var class="product">User Language</var> with [http://en.wikipedia.org/wiki/Associative_array associative array] capability. | |||
</ul> | |||
===The magic Item method=== | ||
In converting an <var>Array</var> to a collection, it's critical that a collection item can be referenced in a syntactically identical way to an <var>Array</var>, namely with <code>%array(<i>subscript</i>)</code>. Not only is this incredibly useful for such a conversion, but it seems very logical and natural; so much so, that one might not give it a second thought. | ||
However, in the object-oriented paradigm, the contents of an object are <strong>always</strong> accessed by a method or variable name (in <var class="product">SOUL</var>, via <code><object>:<member></code> syntax). So, what gives? Well, for collection objects, because the most common operation for a collection and the whole reason for the collection's existence is the ability to extract individual items from the collection, there is a default method if one is not specified explicitly: the <var>Item</var> method. This means that if <code>%foo</code> is an <var>Arraylist</var>, <code>%foo(4)</code> is functionally identical to <code>%foo:item(4)</code>. Which one is used is largely a matter of taste though, again, the fact that <code>%foo(4)</code> seems so natural suggests that there is generally no particularly good reason to explicitly specify <code>Item</code>. | ||
In addition to collections, certain "collection-like" objects such as <var>Stringlists</var> can also use the implied <var>Item</var> method. <var>Stringlists</var> are not, strictly speaking, collections, since they cannot be made up of arbitrary datatypes (they have no <var>Of</var> clause on their declarations). However, for most purposes, they behave very much like an <var>Arraylist</var> of strings, so they are collection-like. As such, if <code>%sl</code> is a <var>Stringlist</var> object, <code>%sl(%i)</code> is functionally identical to <code>%sl:item(%i)</code>. | ||
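The equivalence of the implied and explicit forms can be checked with a short sketch like the following, where the variable names and values are chosen only for illustration:
<p class="code">begin
%foo is arraylist of float
%sl is object stringlist
%foo = new
%sl = new
%foo:add(10)
%foo:add(20)
%sl:add('first line')
* implied Item method, then the explicit form, for the Arraylist
print %foo(2)
print %foo:item(2)
* the same pair for the Stringlist
print %sl(1)
print %sl:item(1)
end </p>
Each pair of <code>Print</code> statements should produce the same output, since the parenthesized form is simply shorthand for a call to <var>Item</var>.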
==Conclusion== | ||
After reading this page and following its suggestions, you should be much more comfortable with object-oriented programming syntax and have a sense of how objects and object variables are used. This is half the battle in becoming an object-oriented programmer. The other half is writing your own classes (types of objects). But, it is possible to use object-oriented technology without ever writing your own classes. In fact, it is likely that many and maybe even most programmers in other OO languages write few if any of their own classes. | |||
SOUL provides a wide variety of built-in (System) classes to increase the power of <var class="product">User Language</var> programmers and to give them plenty of practice writing OO code. Even if you never write your own classes, it is well worth moving to using object-oriented syntax and facilities as much as possible. Ultimately, you will get comfortable enough with the concepts that you will recognize the situations where you want to write your own classes, and you will have a sense of how those classes should be structured. If and when that happens, you will be ready to take the next step. Until then, you can already consider yourself an object-oriented programmer (though not an expert or "guru"). | |||
==See also== | ||
<li>[[Methods]] | ||
<li>[[Object variables]] | ||
<li>[[Object oriented programming in SOUL]] | ||
</ul> | </ul> | ||
[[Category:SOUL object-oriented programming topics]] | ||
[[Category:SOUL]] |
Latest revision as of 19:33, 16 May 2016
Background
So, you're a User Language programmer and you're thinking about learning object-oriented programming (OOP):
- Rumor has it you can be a more effective programmer if you use object-oriented techniques.
- You're tired of feeling inferior to object-oriented programmers because they speak a language that you don't understand but which sure as heck sounds impressive.
- New User Language/SOUL functionality will use object-oriented syntax.
- "Object-oriented" looks better on your resume than "User Language."
- You just want to learn something new.
Unfortunately, most User Language programmers' first experience with object-oriented programming is painful and bewildering. Often it comes in the form of a VB.Net or Java class where the terminology flows freely from day one. Even worse:
- Classes often emphasize how one architects an object-oriented application. While this might be a logical way to build an application, it's a daunting way to learn a language — like trying to learn ballroom dancing before you know how to walk.
- Teachers (and books about object-oriented programming) are often enamored with the more sophisticated aspects of object-oriented programming languages, leaving in the dust novices still struggling to digest the simpler concepts.
- Many object-oriented concepts are interrelated, so it often requires plowing ahead without fully understanding the concepts one has already learned.
- There was no way to put the concepts learned in a Java or VB.Net class to use in User Language. So one is forced to work on object-oriented programming in one's free time, or the concepts are quickly forgotten.
Fortunately, there is a way you can learn object-oriented programming and apply the principles on-the-job from day one!
One requirement for this is to work at a site where SOUL (thus Model 204 V7.5 or higher) is available (or where at least some of the other Janus products, such as Janus Web Server or Janus Sockets, are available).
Before going further, a word about terminology: In English-speaking (as opposed to American-speaking) countries, object-oriented is usually called "object-orientated." More commonly, object-oriented programming is just called "OO."
Mixed case code
So, let's get started. First, if you're going to do OO programming, you've got to write your code in mixed case. No, there is no technical reason OO code in User Language must be in mixed case, but:
- It is easy enough to do, and even if you don't use OO, it makes your code look more modern.
- The industry consensus is that descriptive, often compound, words are better for variable, function, and subroutine names than terse non-descriptive names. For example, %itemNumber is better than %ITMN. While the latter is easier to type, the former is much easier to read. Using %ITEMNUMBER, on the other hand, clearly blunts some of that readability benefit. %itemNumber is written in what's called CamelCase.
- OO SOUL depends on CamelCase to make function and subroutine names readable.
Fortunately, it's easy to start using mixed case code. Simply start typing your code in mixed case. What can possibly go wrong? If your system manager has set everything up nicely for you, nothing. But, if not, you might have a few glitches that are easy enough to fix:
- If you use the Model 204 editor and type in mixed case code, and it gets converted to upper case when you enter it, you have two options:
- Set *LOWER at command level. But this has the drawback that Model 204 commands now require holding down the shift key, because mixed case Model 204 commands do not work.
- Set the SIREDIT user parameter to X'33' (only the X'01' bit is required, but you may as well set some others) before entering the editor. This causes Model 204 to essentially switch to *LOWER mode before entering the editor. Request that your system manager do this for everyone by setting SIREDIT X'33' in CCAIN.
- If you get compilation errors when you type in mixed case SOUL, it means that the compiler is not running in case-independent mode. To fix this, you have these options:
- Have your system manager set the X'01' bit in the COMPOPT system parameter. This must be done in CCAIN, and it is probably the best option.
- Change the BEGIN or B statement at the start of the request you're working on to use mixed case: that is, Begin, begin, or b.
- If you can't change the start of the program, add the line Sirius Case ToUpper (the case of the words doesn't matter) to the start of the procedure you are working on.
The following is an example of a mixed case User Language program that starts with a mixed case Begin:
  begin
     print 'Hello World!'
  end
The following is an example of an Included procedure that contains a Sirius Case ToUpper directive:
  sirius case toUpper
  subroutine hello
     print 'Hello World!'
  end subroutine
Note that mixed case User Language support is case independent. You can write User Language statements in any case, and you can specify Model 204 variables with any case. The following illustrates a little island of mixed case code in the middle of some uppercase code:
  SUBROUTINE FOOBAR(%INPUT IS FLOAT)
     %MSG IS STRING LEN 32
     %TOTAL IS FLOAT
     ...
     if %input gt %total then
        %msg = 'Input value too big'
     end if
     ...
     PRINT %MSG
  END SUBROUTINE
This code illustrates the fact that you don't have to convert an entire procedure (or request) to mixed case to take advantage of mixed case SOUL. Obviously though, in the long-term, it is a good goal to aim for relatively consistent casing in all your code. In the short term however, some inconsistency will have to be tolerated to get to the point where most or all SOUL code is in mixed case. Certainly, any new procedures should be written completely in mixed case.
This example also demonstrates that you don't have to learn anything new to enter User Language in mixed case (all statements still work the same way), so there is no excuse not to start.
Object-oriented syntax
The most pernicious difference between OO languages and procedural languages such as User Language (called "UL" from here on) is the syntax. And the biggest syntactic difference between OO and UL is how functions and subroutines are invoked. Let's start with functions. All User Language programmers know how to invoke a function.
First, functions are called $functions (dollar-functions) or £functions (pound-functions) in the UK. $functions (except a very few) always return a value, so they must be on the right side of an assignment, input to a subroutine or other $function call, or inside some User Language expression. The following example has $Substr in all three contexts:
  %x = $substr(%y, 3, 10)
  call clever($substr(%y, %start, %len))
  %z = $substr(%y, %len, 1) + 10
As the above example shows, and all User Language programmers know, a $function can be followed by the $function arguments (inputs) in parentheses with multiple arguments separated by commas.
OO functions, on the other hand, use a syntax where a function invocation consists of the thing (object, if you will) that the function is operating on specified before the function name, followed by its arguments inside parentheses. Many $functions have OO equivalents and $Substr is no exception: its OO equivalent is called Substring. The following illustrates the use of the OO Substring function by replacing "$substr" in the previous example with "substring":
  %x = %y:substring(3, 10)
  call clever(%y:substring(%start, %len))
  %z = %y:substring(%len, 1) + 10
While this might look strange to a User Language programmer, it uses the most common OO syntax so can honestly be called OO programming. So, if you find any $substr call in your system and change it to use Substring (moving the first argument before a :substring), you have now done some OO coding. It's that easy!
To further add to your bona fides as an object oriented programmer, don't call Substring a function, but call it a method. In addition, don't call %y just a string, call it a string object. Now you're ready for something more advanced. Say "I applied the Substring method to the String object". Repeat until it feels natural to say it. Congratulations, you're well on your way to becoming an OO programmer and, maybe even a guru. If you're curious about what you just said:
- Method is just a fancy word that means function or subroutine or any other called code that does something.
- Object is just a fancy word for a "thingy."
Now, of course, there are many functions available other than just Substring. The functions that operate on strings are called Intrinsic String methods. You can find a list of the methods at List of String methods. There are also functions that operate on numbers. The list of these can be found at List of Float methods.
Now, while congratulating yourself on your new-found skills, you might have a gnawing feeling that you really haven't accomplished a lot, as much as you have learned. So what have you accomplished? Is OO syntax really better than what you're used to? At first blush, it would seem worse, as the traditional $function call is more like English, where verb (function name) precedes the object (first parameter), as opposed to the OO syntax, where the order is reversed. For example, one would say "get the substring of %y" not "%y get the substring," so %x = $substr(%y, 2, 3) seems more natural than %x = %y:substring(2, 3).
While this might be true, it's easy enough to get used to the second form — in many languages the object comes before the verb.
In any case, the chief advantage of OO syntax is that because the input of a function is to the left of the function, one can invoke a function and then pass the result of that function to another function by placing the second function call after the first. Similarly, a third function can be placed to the right of second to process its output. To illustrate, consider the following:
%x = %y:substring(%start, %len):toUpper:unspace
This reads rather nicely, left to right: take "%y", get a substring, convert to uppercase, and remove extra blanks. The traditional version doesn't read nearly as nicely:
%x = $unspace($upcase($substr(%y, %start, %len)))
Since the processing happens in inside out order, presumably one should read this code inside out but, of course, this is very difficult. In addition, because only a colon is required as a separator character, and because no dollar-sign is required to distinguish a function from other entities, the OO version is shorter, in spite of the fact that the function names are somewhat longer, and so, more meaningful. Perhaps more important, the OO expression contains fewer "noise" characters. All of these lead to much better readability for the OO expression. And, just as spaces can be placed around parentheses in a $function invocation to improve readability, spaces can be placed around the colon used to separate the object and function name:
%x = %y : substring( %start, %len ): toUpper :unspace
This example used inconsistent spacing just to illustrate what's allowed, not to suggest that inconsistent spacing is recommended — it's not.
So, to continue the process of becoming an object-oriented User Language programmer, you should familiarize yourself with the list of intrinsic String and Float methods, and try to use them wherever possible and in lieu of the $function equivalents. There is no performance penalty for doing so — in some cases OO functions and $functions share the same underlying code.
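As a small, hedged illustration of the chaining described above, the following complete request applies the same three methods to a literal string (the string value and the length of 40 are made up for the example):
  begin
     %y is string len 40
     %y = 'hello   world and more'
     * take the first 13 characters, uppercase them, then remove extra blanks
     print %y:substring(1, 13):toUpper:unspace
  end
Reading left to right, the substring is 'hello   world', which ToUpper turns into 'HELLO   WORLD' and Unspace reduces to 'HELLO WORLD', so that is what the request should print.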
Named parameters
The SOUL object-oriented implementation has support for named parameters, parameters that can be specified by name, rather than position. While, strictly speaking, this has nothing to do with OO programming (in fact, few OO languages support it — Java and VB.Net do not) named parameters are used in many SOUL methods, so it is important to understand them. To specify a value for a named parameter, simply specify the name, followed by an equals sign (=), followed by the value. For example, the Right and Left String functions have a Pad named parameter that indicates the pad character to be used if the input string is shorter than the requested length:
%x = %y:right(20, pad='0')
In this case, the name simply makes the code clearer. Without the name, as in the following, it's far less obvious what the second parameter means:
%x = %y:right(20, '0')
In functions with large numbers of parameters, the named parameters can also be very useful for eliminating the need for placeholder commas, and for making the function invocations more readable. String and Float methods tend not to have a lot of parameters so named parameters are not heavily used for them, but they are used here and there, so it is important to understand them.
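To see the two forms side by side, here is a small sketch. It assumes only what was described above (the Pad named parameter on Left and Right); the variable names and values are made up for illustration:

```soul
begin
%y is string len 20
%x is string len 20
%y = '42'
* Positional form, as in the second example above: legal, but cryptic
%x = %y:right(5, '0')
print %x
* Named form: the same request, with the pad character labeled
%x = %y:right(5, pad='0')
print %x
* Left has a Pad named parameter, too
%x = %y:left(5, pad='.')
print %x
end
```

The first two Print statements should produce the same value; only the readability of the code differs.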
Stringlists
The object-oriented extensions to User Language use $lists. Most sites that can use $lists, do use $lists, because many $functions require them as inputs or outputs, and because they are just generally useful. $lists are essentially objects because an object is a container for information that is accessed via a reference variable, and multiple reference variables can refer to the same object ($list). For example:
%list is float
%list2 is float
...
%list = $listNew
%list2 = %list
$listAdd(%list, 'Now is the winter')
print $listInf(%list2, 1)
In this example, the Print statement would end up printing "Now is the winter" even though the $listAdd added the line to %list. Both %list and %list2 point to the same $list (object) and so, it is not surprising that what is added via %list can be seen via %list2. It is clear, too, that %list and %list2 must be pointers to $list objects and cannot be the objects, themselves, since a Float value couldn't possibly hold the contents of a $list.
An experienced UL programmer might raise an eyebrow at the line that contains the $listAdd in the above example. $listAdd is a function, so it returns a value. But the $listAdd is not on the right side of an assignment. What gives? Well, the result of many $functions is usually "uninteresting" to the User Language code, so it seems silly to have to find a variable to which to assign it. So SOUL allows certain $functions to be invoked without assigning their results to anything. This causes the $function to behave more like a subroutine, as subroutines are called without obtaining a return value. It is this blurring of the distinction between functions and subroutines that leads to the OO term method, which is a convenient way of referring to either.
So when you were using $lists, you were using object-oriented programming capabilities, even though you might not have realized it. However, you were not using object-oriented syntax to access the $list objects, so the benefits of a pure object-oriented facility were not available to you. Now, you might just say "Hold on, didn't you say strings are also objects?" Indeed they are. However, they're objects that can't be changed once they're set (so-called immutable objects) — you can't really change a string that's been assigned to a variable, you can only assign a new string to that variable. Most objects are not immutable, though of course, strings and numbers are heavily used, so they have importance disproportionate to their number. Nevertheless, the thing that distinguishes OO languages from non-OO languages is the presence of changeable (non-immutable) objects, so $lists can be viewed as a step toward object-oriented programming.
The pure object-oriented equivalents of $lists are called Stringlists, and using a Stringlist is very similar to using a $list. To illustrate, let's take the above example and convert it to using Stringlists:
%list is object stringlist
%list2 is object stringlist
...
%list = new
%list2 = %list
%list:add('Now is the winter')
print %list2:item(1)
Your first reaction might be an understandable "So what?" Well, let's take a look at what we've accomplished. First, we've made it much clearer what's in %list and %list2: references to Stringlists. This prevents accidents such as someone accidentally incrementing the value of a %list; the following would not be allowed:
%list is object stringlist
...
%list = %list + 1    <==Invalid
Nor would the following:
%list is object stringlist
%list2 is float
...
%list = %list2    <==Invalid
While this might not seem that useful, it can be much more useful in complex applications with lots of different kinds of objects — except for strings and numbers, only object variables of the same "class" can be assigned to each other. "Wait a minute, what's a class?" you say. A class is simply a name for objects with identical characteristics. In this case, all Stringlists have a similar structure and have a fixed set of things you can do with them (like $lists). The distinction between a class and individual objects is the same as in normal usage — my blue Subaru is an object in the class of cars.
In any case, beyond the protection of accidental assignment mismatches, what else has using an OO Stringlist instead of a traditional $list bought us? Well, compare:
$listAdd(%list, 'Now is the winter')
with
%list:add('Now is the winter')
In the former, when the UL compiler hits the $listAdd, it has no idea that it's working with a $list. So, to distinguish $listAdd from other functions that might be doing an add of something to something, the word "list" has to be used as a qualifier in the $function name; it would not do to have the $function called simply $add. But, with the OO syntax, the compiler first hits the %list so now "knows" that whatever follows must be something specific to a Stringlist. So it is sufficient to use the function name Add because it can only mean the Add that applies to Stringlists, not to other classes. This means the following:
- The OO statements tend to be shorter, because the extra qualifiers on the function names are unnecessary. This is often true even when the OO statements use longer, more meaningful function names than the non-OO equivalents.
- The absence of the extra qualifiers eliminates a lot of the "noise" in OO statements. Since $list names usually have the word "list" in them to keep them straight from other variables, and since all the $list functions start with "$list", code that manipulates $lists often turns into the actual functional code swimming in a sea of the word "list".
Consider, for example, code that subsets, sorts, and then prints the contents of a $list:
$list_print($listSort($listSub(%list, 'foo'), '1,10,A'))
Now, consider the OO equivalent:
%list:subset('foo'):sortNew('1,10,A'):print
Not only is the OO version considerably shorter, it is much easier to follow — again, it can be read left to right.
In the $function version, our eyes glaze over at the sea of $lists and nested parentheses. While this is an extreme example, it can safely be said that code using OO syntax is almost always tidier and easier to read than the traditional non-OO syntax.
Unfortunately, unlike using the OO syntax for intrinsic (String and Float) data, there is no really good way to simply change lines of code here and there to gradually migrate from using the pseudo-object-orientation of $lists to the full-on "true" object-orientation of Stringlists. Fortunately, many uses of Stringlists are fairly localized, so in many cases, changing an application from using $lists to using Stringlists can be a matter of changing a few dozen lines of code or fewer. While this doesn't provide any game-changing improvements in functionality (there are few things one can do with Stringlists that one can't do with $lists), it will:
- Make code tidier and easier to understand, especially should your organization hire younger programmers who are familiar with object-oriented programming.
- Familiarize you with object-oriented syntax and object behavior.
On this latter point, it is worth mentioning that because Stringlists are true objects, they correct the flakiness that can occur with $lists when the same $listNew (or other $list-creating statement) is executed multiple times. Because the same $listNew statement always returns the same $list identifier (number), situations where a $listNew is executed multiple times can be buggy or confusing. Executing the same New operation for Stringlists multiple times, on the other hand, returns distinct Stringlist references and so eliminates the bugginess or confusion inherent in the $list equivalent.
It is worth emphasizing that a Stringlist object variable does not contain the Stringlist object any more than a variable used to hold a $list identifier contains the $list itself. In both cases, the variable contains a reference to the underlying object. So assigning one Stringlist object variable to another does not create a copy of the object (it simply copies the reference to the underlying object), just as assigning a $list identifier from one variable to another does not copy the $list itself. There is a Copy function for Stringlists, should a true copy be needed:
%list is object stringlist
%list2 is object stringlist
...
%list2 = %list:copy
The assignment of the references is much more efficient, of course, and in many cases, just what one wants, anyway.
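The difference between assigning a reference and making a copy can be sketched as follows. The variable names are invented for the example, and the use of Count on a Stringlist is an assumption here (Count is described later on this page for collections):

```soul
begin
%list is object stringlist
%shared is object stringlist
%copied is object stringlist
%list = new
%list:add('Now is the winter')
* Copies only the reference: %shared and %list are the same Stringlist
%shared = %list
* Creates a second, independent Stringlist with the same contents
%copied = %list:copy
%list:add('of our discontent')
* The new line is visible via %shared, but not via %copied
print %shared:count
print %copied:count
end
```

If the sketch behaves as described above, the two counts differ, because the second Add happened after the copy was taken.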
Note: Stringlists are no more or less efficient than $lists; in fact, they share most of the same underlying code.
Also, all Stringlist methods are longstring capable. That is, they accept and return Longstring values (strings longer than 255 bytes). For historic reasons, the $list API contains a mix of longstring-capable and non-longstring-capable functions. Many of the non-longstring-capable $functions have extra parameters to deal with the fact that, while $list items could be longer than 255 bytes, they could only be processed 255 bytes at a time. The Stringlist API, written well after longstrings became available, didn't need to deal with any of these issues. This means that the $list API is considerably more complicated than the Stringlist API, without providing any additional functionality. This is just another reason to use Stringlists in lieu of $lists.
So, in keeping with the theme of this page, the recommendations are:
- Find some existing $list code that is relatively localized, and convert it to use Stringlist objects.
- Write any new code using Stringlists rather than $lists.
Collections
So far, we have seen how Strings and numbers are simply objects, and can be manipulated using OO syntax without much effort. We've also seen that $lists were really objects, and that $list identifiers behaved very much like object variables, albeit kludgily because there was no formal object support in User Language. As such, with a modicum of effort, an application that uses $lists could be modified to use Stringlists, tidying up the code without any loss of functionality. As our last exercise in learning object-oriented programming we now look at Collections.
Collections are essentially object-oriented replacements for traditional Model 204 arrays, but, as you'll see, with considerably more functionality.
Model 204 arrays, unlike $lists, are most certainly not objects in the OO sense, because Array variables are not references to array objects, they are the arrays themselves. There is no way to have two variables referencing the same array (except in the special case of arrays being passed as parameters), and arrays are created at compile-time, unlike objects (and $lists) which are created at run-time. In addition to making arrays non-OO, these restrictions can be problematic for applications:
- Application coders must determine ahead of time what the largest number of array items will be and allocate space for them at compile time. This is both:
  - Wasteful of server space, as the maximum plausible array size is often much bigger than the size commonly required.
  - Insufficient, as invariably any arbitrary limit can and will ultimately be exceeded.
- The inability to have multiple references to the same array can make certain code clumsy and unintuitive.
Collections provide a nice object-oriented alternative to arrays. Like arrays, collection variables are declared as being composed of variables of a specific type. For example, an array of numbers might be declared as:
%numbers is float array(20)
A comparable collection declaration would be something like:
%numbers is collection arraylist of float
The word collection here means exactly the same thing as "object," but it is used to distinguish collection objects, which have certain special characteristics (such as having an Of clause in the declaration) from other objects. Because collections are so common in OO applications, the keyword collection is actually optional, and the above declaration could be more simply written as:
%numbers is arraylist of float
Probably, the first thing that jumps out at you is the fact that we don't have to declare (and, in fact, can't) the maximum number of items to be allowed in the Arraylist. This means that Arraylists will only use as much space as is needed for the data at hand, and they will not add a lot to server size requirements for the odd cases where an Arraylist happens to have a lot of items. In fact, collection items are stored in CCATEMP on an as-needed basis, with only the currently referenced item being stored in the server. So, collections have very small server footprints.
One appealing aspect of using an Arraylist is that, by and large, references to Arraylist items are identical to references to Array items. That is, to set %x to the third item in either an Array or Arraylist, one can simply do:
%x = %numbers(3)
And to set the %i'th item in either an Array or an Arraylist, one can do:
%numbers(%i) = %x
So, all one has to do to switch from using an Array to an Arraylist is to simply change the variable declaration? Well, not quite. Suppose we had something like:
%cost is float array(20)
%ncost is float
...
o: for each occurrence of cost
   %ncost = %ncost + 1
   %cost(%ncost) = value in o
end for
Now, we change the above to:
%cost is arraylist of float
%ncost is float
...
o: for each occurrence of cost
   %ncost = %ncost + 1
   %cost(%ncost) = value in o
end for
The problem we hit immediately is that the Array version of code never explicitly created the array — the array is created at compile-time. Arraylists, on the other hand, being objects need to be created at run-time — the OO term for this being instantiation. This is identical to the requirement that a $list be created with a $listNew. Not having created an instance of the Arraylist, we will get a null pointer error in our first reference to the object. We can fix this problem one of two ways. One way is to simply create a new Arraylist instance before using it:
%cost is arraylist of float
%ncost is float
...
%cost = new
o: for each occurrence of cost
   %ncost = %ncost + 1
   %cost(%ncost) = value in o
end for
But, it is also possible to tell the OO infrastructure to automatically create a new Arraylist the first time it is referenced:
%cost is arraylist of float auto new
%ncost is float
...
o: for each occurrence of cost
   %ncost = %ncost + 1
   %cost(%ncost) = value in o
end for
While in many instances this is a bad idea (since it can become unclear in the code where an object will be first created or where it's first intended to be created), experience has shown that, at least for collections, the use of Auto New usually works out fairly well. This is especially the case for applications where an Array was changed to an Arraylist.
But we hit a second problem with the above code: an attempt to set item number %ncost in the Arraylist in the above example will almost certainly cause request cancellation, because that item won't be there yet. This is really no different from the Array code where, if you tried to add a 21st item to the array, you would get a request cancellation because the array was declared to have 20 items. However, because the array is created at compile-time with 20 entries, one can simply set items 1 through 20. When switching the code to use an Arraylist, the code will have to be changed somewhat:
%cost is arraylist of float auto new
%ncost is float
...
o: for each occurrence of cost
   %ncost = %ncost + 1
   %cost:add(value in o)
end for
While this makes it a bit more trouble to change from using an Array to an Arraylist, often the populating of the array is isolated to one place, so the change is fairly trivial. Even better, it can eliminate the need for a count variable like %ncost since the code could be simply written as:
%cost is arraylist of float auto new
...
o: for each occurrence of cost
   %cost:add(value in o)
end for
"Hold on" you might say. "I need %ncost later on":
for %i from 1 to %ncost
   %x = %cost(%i)
   ...
end for
But, this is unnecessary. One nice thing about collections is you always have a Count function that returns the number of items in the collection so the above code could simply be written as:
for %i from 1 to %cost:count
   %x = %cost(%i)
   ...
end for
In addition to eliminating the need for the %ncost variable, this eliminates the chance of a bug where %ncost doesn't accurately reflect the count of items on the Arraylist. Of course, if you have a lot of references to %ncost, you might prefer not to fix them all up, in spite of the benefits. In this case, you could simply leave the incrementing of %ncost in the loop that populates the Arraylist or, if you're compulsive about performance, set it after the loop:
o: for each occurrence of cost
   %cost:add(value in o)
end for
%ncost = %cost:count
So, hopefully, it's clear that with minimal effort, an application can be changed from using an Array to an Arraylist. This would have the immediate benefit of reducing the server size required by the application while, at the same time, removing any fixed limit on the number of items one might put into the Arraylist. Even better, a wide variety of Arraylist functions become available once you've made this change. There are functions to make copies of Arraylists, to sort them, to search them, to insert or remove items, to find minima and maxima, and much more. In addition, you have the power of multiple references to the same Arraylist. You can even have Arraylists of Arraylists, if that's useful.
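As a minimal sketch of "multiple references to the same Arraylist" (something an Array cannot do), consider the following. It uses only the New, Add, and Count methods and the item reference syntax already shown; the variable names and values are illustrative:

```soul
begin
%cost is arraylist of float
%same is arraylist of float
%cost = new
%cost:add(10)
%cost:add(20)
* Copies the reference, not the items: both variables now name one Arraylist
%same = %cost
%same:add(30)
* The item added via %same is visible via %cost
print %cost:count
print %cost(3)
end
```

This is the same reference behavior seen earlier with Stringlists; nothing about it is specific to Arraylists of Float.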
All that said, there might be cases where it's non-trivial to convert from using an Array to an Arraylist. Specifically, your application might depend on the fact that the Array is pre-populated with a specific number of items. For example, suppose in the above example, there was an array of 20 items that corresponded to 20 possible product numbers and total costs are calculated as:
o: for each occurrence of product.no
   %cost(product.no) = %cost(product.no) + cost(occurrence in o)
end for
Such an application does not lend itself to using the Arraylist Add method, since the product numbers won't necessarily arrive in numeric order. One approach would be to pre-populate an Arraylist with zeros:
for %i from 1 to 20
   %cost:add(0)
end for
But this somehow seems a bit ad hoc and dissatisfying. Another approach for such an application is to use a different kind of collection, a FloatNamedArraylist. A FloatNamedArraylist uses a number to index the items, but:
- The item numbers don't have to be added sequentially.
- Negative and fractional indexes are allowed.
Generally, the most useful aspect of FloatNamedArraylists is the first of these items (as it is in our example), but the second can be useful on occasion, too. Using a FloatNamedArraylist, we can now repair the above example:
%cost is floatNamedArraylist of float auto new
...
o: for each occurrence of product.no
   %cost(product.no) = %cost(product.no) + cost(occurrence in o)
end for
But, this still isn't quite right. While one can set a FloatNamedArraylist item whether or not it currently has an explicitly set value, by default you cannot retrieve the value of an item that was never set. Since the assignment statement above first retrieves the current value of the item being updated, a request-canceling error is guaranteed the first time it's executed. Fortunately, with the addition of a single statement, the problem can be fixed:
%cost is floatNamedArraylist of float auto new
...
%cost:useDefault = true
o: for each occurrence of product.no
   %cost(product.no) = %cost(product.no) + cost(occurrence in o)
end for
This setting of UseDefault to True means that a reference to an unset item number will return a default value which, for a FloatNamedArraylist is, of course, zero (though it could be set to something else, if needed). In fact, a FloatNamedArraylist provides a lower impact way of converting an Array to a collection than does an Arraylist. A FloatNamedArraylist with UseDefault set to True acts exactly like an Array with essentially no limit on the number of items and no cost for unused item numbers. So, if you had a FloatNamedArraylist with used item numbers 1, 500, and 9999, the FloatNamedArraylist would use no more space than if the used item numbers were 1, 2, and 3. This is what's known as a sparse array. Still, even though a FloatNamedArraylist provides a simpler migration from an Array, it is recommended that in the common case where the array/collection items are sequentially added, an Arraylist be used, as this provides a more natural way of representing the data, even though it might be a little more work.
Hopefully, you'll have noticed something unusual in the above code: while UseDefault looks like a function or subroutine invocation, it's actually being set. This is something called a property, though it can also be thought of as a variable (much as images have variables that can be set). In any case, objects can have properties or variables with values that can be set or retrieved. The syntactic magic that allows the above True might also catch your attention. A UL programmer would guess that it's a field reference, but it's not; it's simply a boolean value. Hopefully, it seems natural enough that, for now, you're willing to accept and understand its meaning without fully understanding the syntax.
As a last note about FloatNamedArraylists, it is possible to loop through only the items that have actually been set in the request. For example, the following prints the index and value for all set FloatNamedArraylist items in %cost:
%cost is floatNamedArraylist of float
...
for %i from 1 to %cost:count
   print %cost:nameByNumber(%i) with ': ' with %cost:itemByNumber(%i)
end for
The "name" of a FloatNamedArraylist item is the index, so if you had updated items 1, 73, and 9999, NameByNumber would return those numbers for input parameters 1, 2, and 3, respectively and, of course, ItemByNumber would return the value to which these items had been set.
Hopefully, by this point, you see:
- How easy it is to change from using arrays to collections.
- The power that such a change provides.
Collections are a big topic and there are many very useful subtopics that haven't been covered here:
- The ability to have collections of any datatype such as Strings, Longstrings, or even objects such as Stringlists or other collections.
- NamedArraylists provide collections where the index is a string. They provide User Language with associative array capability.
The magic Item method
In converting an Array to a collection, it's critical that a collection item can be referenced in a syntactically identical way to an Array, namely with %array(subscript). Not only is this incredibly useful for such a conversion, but it seems very logical and natural; so much so, that one might not give it a second thought.
However, in the object-oriented paradigm, the contents of an object are always accessed by a method or variable name (in SOUL, via <object>:<member> syntax). So, what gives? Well, for collection objects, because the most common operation for a collection and the whole reason for the collection's existence is the ability to extract individual items from the collection, there is a default method if one is not specified explicitly: the Item method. This means that if %foo is an Arraylist, %foo(4) is functionally identical to %foo:item(4). Which one is used is largely a matter of taste though, again, the fact that %foo(4) seems so natural suggests that there is generally no particularly good reason to explicitly specify Item.
In addition to collections, certain "collection-like" objects such as Stringlists can also use the implied Item method. Stringlists are not, strictly speaking, collections, since they cannot be made up of arbitrary datatypes (they have no Of clause on their declarations). However, for most purposes, they behave very much like an Arraylist of strings, so they are collection-like. As such, if %sl is a Stringlist object, %sl(%i) is functionally identical to %sl:item(%i).
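A short sketch of the implied Item method on a Stringlist follows. The variable names are invented for the example, and looping to Count on a Stringlist is an assumption carried over from the collection examples above:

```soul
begin
%sl is object stringlist
%i is float
%sl = new
%sl:add('Now is the winter')
%sl:add('of our discontent')
* Each line is printed twice: first via the implied Item method,
* then via the explicit one
for %i from 1 to %sl:count
   print %sl(%i)
   print %sl:item(%i)
end for
end
```

Since the two forms are functionally identical, each pair of Print statements should produce the same line.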
Conclusion
After reading this page and following its suggestions, you should be much more comfortable with object-oriented programming syntax and have a sense of how objects and object variables are used. This is half the battle in becoming an object-oriented programmer. The other half is writing your own classes (types of objects). But, it is possible to use object-oriented technology without ever writing your own classes. In fact, it is likely that many and maybe even most programmers in other OO languages write few if any of their own classes.
SOUL provides a wide variety of built-in (System) classes to increase the power of User Language programmers and to give them plenty of practice writing OO code. Even if you never write your own classes, it is well worth moving to using object-oriented syntax and facilities as much as possible. Ultimately, you will get comfortable enough with the concepts that you will recognize the situations where you want to write your own classes, and you will have a sense of how those classes should be structured. If and when that happens, you will be ready to take the next step. Until then, you can already consider yourself an object-oriented programmer (though not an expert or "guru").