Datetime string formats: Difference between revisions
m (→CENTSPAN) |
m (→SPANSIZE) |
||
Line 335: | Line 335: | ||
starting at CENTSPAN+90. | starting at CENTSPAN+90. | ||
From our previous example: | From our previous example: | ||
<center> | |||
[[File:Spam2.jpg|573px]] | |||
</center> | |||
An attempt to represent the values "37" through "46" will | An attempt to represent the values "37" through "46" will | ||
be rejected. | be rejected. | ||
Line 346: | Line 349: | ||
However, any setting of SPANSIZE less than 100 will probably detect the | However, any setting of SPANSIZE less than 100 will probably detect the | ||
case where a range greater than one hundred years is being used. | case where a range greater than one hundred years is being used. | ||
==Strict and non-strict format matching== | ==Strict and non-strict format matching== | ||
Revision as of 16:20, 19 March 2011
Datetime Formats
The string representation of a date is described by a datetime format. This value is a character string, composed of the concatenation of tokens (e.g., "YYYY" for a 4 digit year, and "MI" for minutes) and separator characters (e.g., "/" in "MM/DD/YY" for two-digit month, day, and year separated by slashes).
These datetime format strings are used in many methods, commands, and $functions. Since all methods that operate on strings are functions, methods and $functions are collectively referred to as functions on this page.
The rules for these datetime format strings are consistent throughout all Sirius products, though certain uses of these strings might impose extra restrictions. For example, a leading blank may match an HH, DD, or MM token in Janus SOAP User Language Interface methods but it may not in some cases in other Sirius products.
There are certain rules applied to determine if a format is valid. The basic rules are:
- If a format string contains a numeric datetime token (i.e. "ND", "NM", or "NS"), then the format string must consist of only one token.
- You must specify at least one time, weekday, or date .
- Except for "weekday", you can't specify redundant information.
More specifically this means
- Except for "I", no token can be specified twice.
- At most one year format (contains Y) can be specified.
- At most one month format (contains MON, Mon, or MM) can be specified.
- At most one day format (DD or Day) can be specified.
- At most one weekday format (WKD, Wkd, WKDAY, or Wkday) can be specified.
- If AM is specified, then PM can not be specified.
- At most one fractions-of-a-second format (contains X) can be specified.
- If DDD is specified, then neither a day nor month format can be.
- If ZYY is specified in a format string, no other token that denotes a variable-length value may be used.
- If a format string contains other tokens that denote variable length values, then an * token may only appear as the last character of the format string.
- The DAY token may not be immediately followed by another token whose value may be numeric, regardless of whether the following token repsents a variable length value. Thus, DAY may not be followed by *, I, YY, YYYY, CYY, MM, HH, MI, SS, X, XX, or XXX; DAY may not be followed by a decimal digit separator, and DAY may not be followed by a quote followed by a decimal digit.
- The maximum length of a format string is 100 characters.
Note:
A common mistake is to use "MM" for minutes; it should be "MI".
Datetime format tokens
The valid tokens in a date format are shown in the following list. In general, the output format rule for a token is shown. For some of the functions the input format rule for a token is the same as the output format rule; this is the definition of "strict date format matching". However, non-strict functions sometimes allow a string to match a token on input that would not be produced by that token on output.
All of the tokens which match alpabetic strings (e.g., "MON") match any case for non-strict matching.
* | Ignore entire variable-length substring matching pattern, if any, only when retrieving a date value. Substitute with null string when only creating a date value. When copying date values, copy entire variable-length substring matching pattern, if any, from input value to location identified by * token in output string. |
---|---|
I | Ignore corresponding input character when only retrieving a date value. Store a blank in corresponding output character when only creating a date value. When copying date values, copy each character matching an I token from from the input value to location in the output string identified by the corresponding I token in the output format. |
" | Following character is "quoted", that is, it acts as a separator character. |
YYYY | 4 digit year |
YY | 2 digit year |
CYY | Year minus 1900 (3 digits, including any leading zero). |
ZYY | Year minus 1900, two-digit or three-digit year number, excluding any leading zero (variable length data). Non-strict methods and $functions allow a three-digit number with leading zero on input, but any number less than 100 always produces a two-digit number on output. |
MONTH | Full month name (upper case variable length). When used as an argument to a method or $function for converting from a string, this is the same as Month. Non-strict functions allow any mixture of upper and lower case on input, but all upper case is always produced on output. |
Month | Full month name (mixed case variable length). When used as an argument to a function for converting from a string, this is the same as MONTH. Non-strict functions allow any mixture of upper and lower case on input, but initial upper case letter followed by all lower case is always produced on output. |
MON | Three character month abbreviation (upper case). When used as an argument to a function for converting from a string, this is the same as Mon. Non-strict functions allow any mixture of upper and lower case on input, but all upper case is always produced on output. |
Mon | Three character month abbreviation (mixed case). When used as an argument to a function for converting from a string, this is the same as MON. Non-strict functions allow any mixture of upper and lower case on input, but initial upper case letter followed by all lower case is always produced on output. |
MM | Two-digit month number. When used as an argument to a function for converting from a string, this is the same as BM (leading blank is allowed). Non-strict functions allow a two-character number with leading blank on input, but two decimal digits are always produced on output. |
BM | Two-character month number. When used as an argument to a function for converting from a string, this is the same as MM. If less than 10, first character is blank. Non-strict functions allow a two-digit number with leading zero on input, but any number less than 10 always produces a blank followed by a decimal digit on output. |
DDD | Three digit Julian day number |
DD | Two-digit day number. When used as an argument to a function for converting from a string, this is the same as BD (leading blank is allowed). Non-strict functions allow a two-character number with leading blank on input, but two decimal digits are always produced on output. |
BD | Two-character day number. When used as an argument to a function for converting from a string, this is the same as DD. If less than 10, first character is blank. Non-strict functions allow a two-digit number with leading zero on input, but any number less than 10 always produces a blank followed by a decimal digit on output. |
DAY | One-digit or two-digit day number (variable length data). Non-strict functions allow a two-digit number with leading zero on input, but any number less than 10 always produces a one-digit number on output. |
WKDAY | Full day of week name (upper case variable length). When used as an argument to a function for converting from a string, this is the same as Wkday. Non-strict functions allow any mixture of upper and lower case on input, but all upper case is always produced on output. |
Wkday | Full day of week name (mixed case variable length). When used as an argument to a function for converting from a string, this is the same as WKDAY. Non-strict functions allow any mixture of upper and lower case on input, but initial upper case letter followed by all lower case is always produced on output. |
WKD | Three character day of week abbreviation (upper case). When used as an argument to a function for converting from a string, this is the same as Wkd. Non-strict functions allow any mixture of upper and lower case on input, but all upper case is always produced on output. |
Wkd | Three character day of week abbreviation (mixed case).
When used as an argument to a function for converting from a string, this is the same as WKD. Non-strict functions allow any mixture of upper and lower case on input, but initial upper case letter followed by all lower case is always produced on output. |
HH | Two-digit hour number. Non-strict functions allow a two-character number with leading blank on input, but two decimal digits are always produced on output. When used as an argument to a function for converting from a string, this is the same as BH (leading blank is allowed). |
BH | Two-digit hour number. When used as an argument to a function for converting from a string, this is the same as HH. If less than 10, first character is blank. Non-strict functions allow a two-digit number with leading zero on input, but any number less than 10 always produces a blank followed by a decimal digit on output. |
MI | Two-digit minute number |
SS | Two-digit second number |
X | Tenths of a second |
XX | Hundredths of a second |
XXX | Thousandths of a second (milliseconds) |
AM | AM/PM indicator |
PM | AM/PM indicator |
The valid separators in a date format are:
- blank (" ")
- apostrophe ("'")
- slash ("/")
- colon (":")
- hyphen ("-")
- back slash ("\")
- period (".")
- comma (",")
- underscore ("_")
- left parenthesis ("(")
- right parenthesis (")")
- plus ("+")
- vertical bar ("|")
- equals ("=")
- ampersand ("&")
- at sign ("@")
- sharp ("#")
- the decimal digits ("0" - "9").
In addition, any character may be a separator character if preceeded by the quoting character (").
Valid Datetimes
For a datetime string to be valid it must meet the following criteria:
- Its length must be less than 128 characters.
- It must be compatible with its corresponding format string.
- It must represent a valid date and/or time. For example, at most 23:59:59.999 for a time, 01-12 for a month, 01-31 or less (depending on the month) for a day, February 29 is only valid in leap years (only centuries divisible by 4 are leap years: 2000 is but neither 1800, 1900, nor 2100 are). Note: weekdays are not checked for consistency against the date; e.g., both Saturday, 02/15/97 and Friday, 02/15/97 are valid.
- It must be within the date range allowed for the corresponding format. A datetime string used with a CYY or ZYY format can only represent dates from 1900 to 2899, inclusive. A datetime string used with a YY format can only represent dates in a range of 100 or less years, as determined by CENTSPAN and SPANSIZE. The valid range of dates for all other formats is from 1 January 1753 thru 31 December 9999.
Processing Dates With Two-Digit Year Values
A date field with only two digits for the year value is capable of representing a range of up to one hundred years. When we compare a pair of two-digit year values we are accustomed to thinking of the century as fixed, so that all dates are either "19xx" or "20xx". However, a date field with two-digit year values can actually represent dates from two different centuries, provided that the range of dates does not exceed 100 years.
CENTSPAN
CENTSPAN provides a mechanism for unambiguously converting dates with two-digit year values into dates with four-digit year values. The CENTSPAN mechanism allows two-digit year values to span two centuries without confusion. CENTSPAN identifies the four-digit year value that is the start of a range of years represented by the two-digit year values.
CENTSPAN may be specified as an absolute unsigned four digit value between 1753 and 9999, or it may be specified as a relative signed value between -99 and +99, inclusive. A relative CENTSPAN value is dynamically converted to an effective absolute value before it is used to perform a YY to YYYY conversion. The effective CENTSPAN value is formed by adding the relative CENTSPAN to the current four-digit year value at the time the relative value is converted.
A simple algorithm is used to convert a two-digit year value (YY) to a four-digit year value, using a four-digit absolute or effective CENTSPAN value (HHLL). If the two-digit year value is less than the low-order two digits of the CENTSPAN value, then the resulting century is one greater than the high-order two digits of the CENTSPAN value. Otherwise the resulting century is the same as the high-order two digits of the CENTSPAN value.
Using all one hundred available years for mapping two-digit year values can cause significant confusion and result in data integrity errors. This is because dates just above and just below the 100-year window are mapped to the other end of the window. From our previous example, the date "47" will be intepreted as 1947, when it could have conceivably been 2047. Simlarly, the date "46" will be intepreted as 2046, when it might have been 1946.
If CENTSPAN is set to a value that is too high, dates that are just prior to CENTSPAN will appear to occur 100 years hence. If CENTSPAN is set to a value that is too low, dates that fall just after CENTSPAN+99 will appear to have occured 100 years earlier. A full one-hundred year window also can not detect attempts to represent more than one hundred years of values with a two digit year.
SPANSIZE
Sirius has devised a method to protect from the ambiguities that can occur at each end of the 100-year window defined by CENTSPAN. SPANSIZE is used to restrict the size of the window used for mapping two-digit year values. The effect is to create two guard bands', one just below the date window and one just above. An attempt to represent a date value that lands in a guard band produces an error.
Each guard band contains CENTSPAN-SPANSIZE years, hence a SPANSIZE of 100 removes the protection. The default SPANSIZE is a value which you can customize in your load module; If you do not customize it, the value of SPANSIZE is 90, which provides protection for two ten year windows: one below the CENTSPAN setting and one starting at CENTSPAN+90. From our previous example:
An attempt to represent the values "37" through "46" will be rejected. This protects the range 1937 through 1946 as well as the range 2037 through 2046. Note that an intended value of 2047, expressed as "47" will be accepted and interpreted as 1947. In general a smaller SPANSIZE provides the highest assurance of correct mappings. However, any setting of SPANSIZE less than 100 will probably detect the case where a range greater than one hundred years is being used.
Strict and non-strict format matching
As mentioned above, for some of the functions, the input format rule for a token is the same as the output format rule; this is the definition of "strict date format matching". However, non-strict functions sometimes allow a string to match a token on input that would not be produced by that token on output. The types of strict matching are as follows:
Alpha tokens | For alphabetic tokens (e.g., Month), a strict match requires the input value to be the correct case.
For example, the "MON" token is strictly matched by "JAN" but not by "Jan", and the reverse is true for the "Mon" token. For non-strict matching, the alpabetic tokens are matched by any combination of upper and lower case input. |
---|---|
HH, MM, DD | For these tokens, a strict match requires a leading zero for values less than 10. For non-strict matching, a value less than 10 can also be represented by a leading blank followed by a single numeric digit. |
BH, BM, BD | For these tokens, a strict match requires a leading blank for values less than 10. For non-strict matching, a value less than 10 can also be represented by a leading zero followed by a numeric digit. |
DAY | For this token, a strict match requires a single digit for values less than 10. For non-strict matching, a value less than 10 can also be represented by a leading zero followed by a numeric digit. |
ZYY | For this token, a strict match requires two digits for values less than 100. For non-strict matching, a value less than 100 can also be represented by a leading zero followed by a two numeric digits. |
Since the strict functions are only available in the Sir2000 User Language Tools, if you wish to check a datetime string using strict rules, you can use the following technique with the non-strict date functions:
if <date> eq or <date> ne <date>:stringToMilliseconds(<format>):millisecondsToString(<format>) then <error handling> end if