Deflate (String function): Difference between revisions

From m204wiki
ELowell (talk | contribs)
No edit summary
Tom (talk | contribs)
No edit summary
Line 19: Line 19:
<li>With dynamic compression (<code>fixedCode=false</code>), the compression code tables are generated based on the input data. Dynamic tables typically provide somewhat better compression on most types of data. There is a very slight CPU overhead in computing the frequencies of byte values in the input data. Also, since the code tables are dynamic, they are included as part of the compressed data. This will increase the size of the compressed longstring, but these tables are small, since they are also stored in a compressed form.
<li>With dynamic compression (<code>fixedCode=false</code>), the compression code tables are generated based on the input data. Dynamic tables typically provide somewhat better compression on most types of data. There is a very slight CPU overhead in computing the frequencies of byte values in the input data. Also, since the code tables are dynamic, they are included as part of the compressed data. This will increase the size of the compressed longstring, but these tables are small, since they are also stored in a compressed form.
</ul>
</ul>
The default value for this argument is <var>False</var> (use dynamic compression).</td></tr>
The default value for this argument is <var>False</var> (use dynamic compression).
</td></tr>


<tr><th><var>LazyMatch</var></th>
<tr><th><var>LazyMatch</var></th>
<td><var>LazyMatch</var> is an optional, name required, parameter that is a <var>Boolean</var> value that specifies whether to use "lazy match" compression, as specified in RFC 1951.
<td><var>LazyMatch</var> is an optional, name required, parameter that is a <var>Boolean</var> value that specifies whether to use "lazy match" compression, as specified in RFC 1951.
The default value for this argument is <var>False</var> (do not use "lazy match" compression).</td></tr>
The default value for this argument is <var>False</var> (do not use "lazy match" compression).
<p><b>Note:</b> This parameter has no effect when hardware compression is used (see [[#Hardware vs. software compression|Hardware vs. software compression]]).</p>
</td></tr>


<tr><th><var>MaxChain</var></th>
<tr><th><var>MaxChain</var></th>
<td><var>MaxChain</var> is an optional, name required, parameter that is a numeric value that specifies the maximum hash chain length, as explained in RFC 1951.
<td><var>MaxChain</var> is an optional, name required, parameter that is a numeric value that specifies the maximum hash chain length, as explained in RFC 1951.
The default value for this argument is 0. If specified, it must be between 0 and 99, inclusive.</td></tr>
The default value for this argument is 0. If specified, it must be between 0 and 99, inclusive.
<p><b>Note:</b> This parameter has no effect when hardware compression is used (see [[#Hardware vs. software compression|Hardware vs. software compression]]).</p>
</td></tr>
</table>
</table>


Line 33: Line 38:
<ul>
<ul>


<li>The <var>[[NCMPBUF parameter|NCMPBUF]]</var> parameter must be set to a
<li>The <var>[[NCMPBUF parameter|NCMPBUF]]</var> parameter must be set to a non-0 value during <var class="product">Model 204</var> initialization to allow use of the <var>Deflate</var> function; otherwise, invoking <var>Deflate</var> causes request cancellation.
non-0 value during <var class="product">Model 204</var>initialization to allow
use of the <var>Deflate</var> function; otherise, invoking <var>Deflate</var>
causes request cancellation.


<li>As with any compression scheme, it is possible that a particular string will become longer after compression. This would happen, for example, if a deflated string were passed to <var>Deflate</var>.
<li>As with any compression scheme, it is possible that a particular string will become longer after compression. This would happen, for example, if a deflated string were passed to <var>Deflate</var>.
Line 42: Line 44:
<li>Short strings (less than 128 bytes) typically compress better with <code>fixedCode=true</code>.
<li>Short strings (less than 128 bytes) typically compress better with <code>fixedCode=true</code>.
</ul>
</ul>
==Hardware vs. software compression==
<p>[Introduced in Model 204 version 8.0]</p>
<p>On IBM z15 and above processors, <var>Deflate</var> automatically uses the DFLTCC (Deflate Conversion Call) hardware instruction to perform compression. On older processors without DFLTCC support, the existing software implementation is used. The choice is made automatically and is transparent to the application.</p>
<p>There are some behavioral differences between hardware and software compression that should be noted:</p>
<ul>
<li>The <var>LazyMatch</var> and <var>MaxChain</var> parameters are <b>ignored</b> when hardware compression is used. The DFLTCC hardware uses its own internal compression strategy for match searching, and these parameters have no effect. The <var>FixedCode</var> parameter <b>is</b> honored: when <code>fixedCode=true</code>, the hardware uses fixed Huffman codes; when <code>fixedCode=false</code> (the default), the hardware generates a Dynamic Huffman Table (DHT) from the input data for each deflate block.</li>
<li>Hardware compression is <b>non-deterministic</b>: compressing the same input data multiple times may produce different compressed output, even with identical parameters. This is because the DFLTCC instruction uses an implementation-dependent hash function for duplicate string matching, as documented in the z/Architecture Principles of Operation. The compressed output is always valid RFC 1951 data and can be decompressed by <var>[[Inflate (String function)|Inflate]]</var> regardless, but applications should not depend on a specific compressed length or byte-for-byte reproducibility of compressed data.</li>
<li>Hardware compression generally provides significantly better throughput than software compression, particularly for larger input data.</li>
</ul>
<p>These differences also apply to the <var>[[Gzip (String function)|Gzip]]</var> and <var>[[Zip (String function)|Zip]]</var> methods, which use <var>Deflate</var> compression internally.</p>


==Examples==
==Examples==
Line 51: Line 70:
<ul>
<ul>
<li><var>[[Inflate (String function)|Inflate]]</var> is used to decompress the string to its original value.
<li><var>[[Inflate (String function)|Inflate]]</var> is used to decompress the string to its original value.
<li>Other related methods:  
<li>Other related methods:
<ul>
<ul>
<li><var>[[Gunzip (String function)|Gunzip]]</var>
<li><var>[[Gunzip (String function)|Gunzip]]</var>

Revision as of 13:03, 19 May 2026

Compress a longstring with deflate (String class)

[Introduced in Sirius Mods 7.4]

This function compresses a longstring using the "deflate" algorithm. The deflate algorithm is described completely in RFC 1951. It is very effective with HTML and XML data.

Syntax

%outString = string:Deflate[( [FixedCode= boolean], [LazyMatch= boolean], -
                              [MaxChain= number])]

Syntax terms

%outString The resulting compressed string.
string The string to be compressed.
FixedCode FixedCode is an optional, name required, parameter that is a Boolean value that specifies whether the compression uses fixed codes or is dynamic, based on the contents of the input string.
  • With fixed code compression (fixedCode=true), the tables used for compression are the ones defined as part of RFC 1951. They are somewhat optimized for ASCII character data, and they slightly decrease the amount of CPU required to perform compression. Also, since the codes are already defined as part of the specification, they are not included in the compressed data.
  • With dynamic compression (fixedCode=false), the compression code tables are generated based on the input data. Dynamic tables typically provide somewhat better compression on most types of data. There is a very slight CPU overhead in computing the frequencies of byte values in the input data. Also, since the code tables are dynamic, they are included as part of the compressed data. This will increase the size of the compressed longstring, but these tables are small, since they are also stored in a compressed form.

The default value for this argument is False (use dynamic compression).
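The difference between fixed and dynamic code tables can be explored outside Model 204 with any RFC 1951 implementation. The following is a minimal sketch that uses Python's zlib module as a stand-in; it is not the Model 204 implementation. The helper names raw_deflate and raw_inflate and the sample data are assumptions made for illustration. Note also that zlib's default strategy is not strictly "dynamic": it chooses the cheapest block type for each block, whereas the Z_FIXED strategy forces fixed codes.

```python
import zlib

# Z_FIXED forces fixed Huffman codes. Fall back to its numeric value (4)
# in case this Python build does not export the constant.
Z_FIXED = getattr(zlib, "Z_FIXED", 4)


def raw_deflate(data: bytes, fixed_code: bool = False) -> bytes:
    """Return a raw RFC 1951 stream (no zlib or gzip wrapper).

    Hypothetical helper: fixed_code only loosely mirrors FixedCode.
    """
    strategy = Z_FIXED if fixed_code else zlib.Z_DEFAULT_STRATEGY
    # wbits=-15 selects raw deflate with a 32 KB window; memLevel 8 is zlib's default.
    comp = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15, 8, strategy
    )
    return comp.compress(data) + comp.flush()


def raw_inflate(data: bytes) -> bytes:
    """Decompress a raw RFC 1951 stream."""
    return zlib.decompress(data, -15)


# Markup-like sample data, invented for this sketch.
sample = b"<tr><td>row</td><td>value</td></tr>\n" * 200

fixed = raw_deflate(sample, fixed_code=True)
default = raw_deflate(sample, fixed_code=False)

# Print the input size and both compressed sizes for comparison.
print(len(sample), len(fixed), len(default))
```

Both streams inflate to the same input; only the encoding of the code tables, and therefore the compressed size, differs.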

LazyMatch LazyMatch is an optional, name required, parameter that is a Boolean value that specifies whether to use "lazy match" compression, as specified in RFC 1951.

The default value for this argument is False (do not use "lazy match" compression).

Note: This parameter has no effect when hardware compression is used (see Hardware vs. software compression).

MaxChain MaxChain is an optional, name required, parameter that is a numeric value that specifies the maximum hash chain length, as explained in RFC 1951.

The default value for this argument is 0. If specified, it must be between 0 and 99, inclusive.

Note: This parameter has no effect when hardware compression is used (see Hardware vs. software compression).
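The effect of match-search effort can be illustrated with another RFC 1951 implementation. Python's zlib does not expose LazyMatch or MaxChain as separate settings. In the reference zlib implementation, the compression level bundles comparable knobs (whether matching is lazy and how far a hash chain is searched), so in this sketch the level serves only as a rough proxy. The helper name raw_deflate_at_level and the sample text are assumptions, and nothing here describes how Model 204 maps its parameters internally.

```python
import zlib


def raw_deflate_at_level(data: bytes, level: int) -> bytes:
    """Return a raw RFC 1951 stream produced at the given zlib level.

    Hypothetical helper: the level stands in for match-search effort.
    """
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


# Sample text invented for this sketch.
text = (
    b"How much wood could a woodchuck chuck "
    b"if a woodchuck could chuck wood? "
) * 50

for level in (1, 6, 9):
    packed = raw_deflate_at_level(text, level)
    # Whatever the search effort, the stream inflates to the same input.
    assert zlib.decompress(packed, -15) == text
    print(level, len(packed))
```

More search effort can change the compressed length, but it never changes the data that is recovered on decompression.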

Usage notes

  • The NCMPBUF parameter must be set to a non-0 value during Model 204 initialization to allow use of the Deflate function; otherwise, invoking Deflate causes request cancellation.
  • As with any compression scheme, it is possible that a particular string will become longer after compression. This would happen, for example, if a deflated string were passed to Deflate.
  • Short strings (less than 128 bytes) typically compress better with fixedCode=true.
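The second note, that deflating an already deflated string tends to make it longer, can be checked with any RFC 1951 implementation. The sketch below uses Python's zlib as a stand-in for the Model 204 function. The helper name raw_deflate is an assumption, and the sample text is adapted from this page.

```python
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Return a raw RFC 1951 stream (hypothetical helper)."""
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


text = (
    b"This function compresses a longstring using the deflate algorithm. "
    b"The deflate algorithm is described completely in RFC 1951. "
    b"It is very effective with HTML and XML data. "
    b"With dynamic compression, the compression code tables are generated "
    b"based on the input data. Dynamic tables typically provide somewhat "
    b"better compression on most types of data. There is a very slight CPU "
    b"overhead in computing the frequencies of byte values in the input "
    b"data. As with any compression scheme, it is possible that a "
    b"particular string will become longer after compression."
)

once = raw_deflate(text)    # ordinary text: shrinks
twice = raw_deflate(once)   # already compressed: little redundancy remains

# Print the original size, then the size after one and two passes.
print(len(text), len(once), len(twice))
```

The first pass removes most of the redundancy, so the second pass has little left to exploit and only adds block overhead.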

Hardware vs. software compression

[Introduced in Model 204 version 8.0]

On IBM z15 and above processors, Deflate automatically uses the DFLTCC (Deflate Conversion Call) hardware instruction to perform compression. On older processors without DFLTCC support, the existing software implementation is used. The choice is made automatically and is transparent to the application.

There are some behavioral differences between hardware and software compression that should be noted:

  • The LazyMatch and MaxChain parameters are ignored when hardware compression is used. The DFLTCC hardware uses its own internal compression strategy for match searching, and these parameters have no effect. The FixedCode parameter is honored: when fixedCode=true, the hardware uses fixed Huffman codes; when fixedCode=false (the default), the hardware generates a Dynamic Huffman Table (DHT) from the input data for each deflate block.
  • Hardware compression is non-deterministic: compressing the same input data multiple times may produce different compressed output, even with identical parameters. This is because the DFLTCC instruction uses an implementation-dependent hash function for duplicate string matching, as documented in the z/Architecture Principles of Operation. The compressed output is always valid RFC 1951 data and can be decompressed by Inflate regardless, but applications should not depend on a specific compressed length or byte-for-byte reproducibility of compressed data.
  • Hardware compression generally provides significantly better throughput than software compression, particularly for larger input data.

These differences also apply to the Gzip and Zip methods, which use Deflate compression internally.
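Because compressed bytes are not guaranteed to be reproducible, a check that compares compressed output against a stored copy can fail even when nothing is wrong. One alternative is to verify the data after decompression. The following is a minimal sketch in Python, with zlib standing in for the compressor. The names pack and unpack, the use of SHA-256, and the use of two compression levels to imitate two different encoders are all assumptions made for illustration.

```python
import hashlib
import zlib
from typing import Tuple


def pack(data: bytes, level: int = 6) -> Tuple[str, bytes]:
    """Return (digest of the ORIGINAL data, raw deflate stream)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    packed = comp.compress(data) + comp.flush()
    return hashlib.sha256(data).hexdigest(), packed


def unpack(digest: str, packed: bytes) -> bytes:
    """Inflate, then verify against the digest of the original data."""
    data = zlib.decompress(packed, -15)
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("inflated data does not match the stored digest")
    return data


# Sample payload invented for this sketch.
payload = b"<doc><item>wood</item><item>chuck</item></doc>" * 100

# Two encoders with different match-search behavior stand in for
# software and hardware compression; their output bytes may differ.
digest_a, packed_a = pack(payload, level=1)
digest_b, packed_b = pack(payload, level=9)

# The digest depends only on the original data, so it is stable.
assert digest_a == digest_b
assert unpack(digest_a, packed_a) == payload
assert unpack(digest_b, packed_b) == payload

# Whether the compressed bytes match is incidental and is not relied upon.
print(packed_a == packed_b)
```

Keying comparisons and caches on the uncompressed content, rather than on the compressed bytes or their length, keeps an application independent of which compression path was used.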

Examples

In the following example, %out is set to the compressed version of the given string:

%out = 'How much wood could a woodchuck chuck':deflate(fixedCode=true)

See also