Announcement

Collapse

Forum Guidelines

This forum is for finished source code that is working properly. If you have questions about this or any other source code, please post it in one of the Discussion Forums, not here.
See more
See less

LZSS

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • LZSS

    somehow don already posted similar code -
    i decided to increase a little compression speed.

    i have no idea, why don named this algo "lz77".
    if somebody interesting, you can find brief description at http://www.rasip.fer.hr/research/com...hms/index.html

    unf. lzss compression is not fast by definition (and this fact not directly depends of implementation) and commercial archivators use this algo in combination with arithmetic coding only.

    Code:
       ' 6.11
    
       #compile exe
       #dim all
       #register none
       #include "win32api.inc"
    
       $sourcefile = "c:\system.1st" ' <--- change
       $targetfile = "c:\system.lz
       '=============================================================================
    
       %lzss_window_size           = 4096
       %lzss_match_length_maximum  =   18
       %lzss_match_length_minimum  =    3
    
       function compress_lzss (inbuf as string, outbuf as string) as long
    
          dim inbufsize as local dword
          dim inbufcuraddr as local byte ptr
          dim inbufendaddr as local dword
    
          dim outbufsize as local dword
          dim outbufcuraddr as local byte ptr
          dim outbufendaddr as local dword
          dim outbufnewaddr as local byte ptr
    
          dim wndbuf as local string
          dim wndbufbaseaddr as local byte ptr
          dim wndbufpos as local word
          dim wndent(&hffff??) as local word
          dim wndref(%lzss_window_size) as local word
          dim wndsch as word
          dim wndbufpost as local word
          dim wndbufposl as local word
          dim wndbufposm as local word
          dim wndbufposc as local word
          dim wndbufposa as local dword
    
          dim wndsublen as local word
          dim wndsuboff as local word
          dim wndsubbestlen as local word
          dim wndsubbestoff as local word
    
          dim maxsearchchars as local dword
          dim lenandoffset as local word
    
          dim bitmask    as local byte
          dim bitmaskadd as local byte
    
          inbufsize = len(inbuf): if inbufsize < %lzss_match_length_minimum then _
             outbuf = ": function = -1: exit function ' nothing to compress
          inbufcuraddr = strptr(inbuf)
          inbufendaddr = inbufcuraddr + inbufsize
    
          outbufsize = inbufsize ' maximum
          outbuf = space$(outbufsize)
          outbufcuraddr = strptr(outbuf)
          outbufendaddr = inbufcuraddr + outbufsize
    
          wndbuf = space$(%lzss_window_size + %lzss_match_length_maximum - 1)
          wndbufbaseaddr = strptr(wndbuf) - 1
          wndbufpos = 1
    
          do
             if inbufcuraddr >= inbufendaddr then exit do
             outbufnewaddr = outbufcuraddr + 1: if cdwd(outbufendaddr - outbufcuraddr) < 17 then _
                outbuf = ": function = -1: exit function ' 17 = 2 * 8 + 1
             bitmask = 0: bitmaskadd = 1
             do
                wndsubbestlen = 2
                maxsearchchars = inbufendaddr - inbufcuraddr
                if maxsearchchars > %lzss_match_length_maximum then maxsearchchars = %lzss_match_length_maximum
    
                wndsch = @inbufcuraddr + @inbufcuraddr[1] * 256
                wndbufpost = wndent(wndsch)
    
                while wndbufpost
                   if @inbufcuraddr[wndsubbestlen] = @wndbufbaseaddr[wndbufpost + wndsubbestlen] then
                      do
                         wndsublen = wndsubbestlen
                         for wndbufposa = wndsubbestlen to maxsearchchars - 1
                             if @inbufcuraddr[wndbufposa] <> @wndbufbaseaddr[wndbufpost + wndbufposa] then exit for
                             incr wndsublen
                         next
                         if wndsublen > wndsubbestlen then
                            for wndbufposa = 2 to wndsubbestlen - 1
                                if @inbufcuraddr[wndbufposa] <> @wndbufbaseaddr[wndbufpost + wndbufposa] then exit do
                            next
                            if (wndbufpost + wndsublen) <= (%lzss_window_size + 1) then _
                               wndsubbestlen = wndsublen: wndsubbestoff = wndbufpost
                         end if
                         exit do
                      loop
                   end if
                   wndbufpost = wndref(wndbufpost)
                wend
    
                if wndsubbestlen < %lzss_match_length_minimum then
                   bitmask = bitmask + bitmaskadd
                   @outbufnewaddr = @inbufcuraddr: incr outbufnewaddr
                   wndsubbestlen = 1
    
                else
                   lenandoffset = (wndsubbestlen - %lzss_match_length_minimum) * &h1000 + (wndsubbestoff - 1)
                   @outbufnewaddr = lobyt(lenandoffset): incr outbufnewaddr
                   @outbufnewaddr = hibyt(lenandoffset): incr outbufnewaddr
                end if
    
                for wndbufposa = 1 to wndsubbestlen
                   if wndbufpos > %lzss_window_size then wndbufpos = 1
    
                   for wndbufposc = 0 to 1
                       wndbufposm = wndbufpos - wndbufposc
                       if wndbufposm then
                          wndsch = @wndbufbaseaddr[wndbufposm] + @wndbufbaseaddr[wndbufposm + 1] * 256
    
                          wndbufpost = wndent(wndsch)
                          wndbufposl = 0
                          while wndbufpost
                             if wndbufpost = wndbufposm then
                                if wndbufposl = 0 then wndent(wndsch) = wndref(wndbufposm) else _
                                   wndref(wndbufposl) = wndref(wndbufposm)
                                wndref(wndbufposm) = 0: exit do
                             end if
                             wndbufposl = wndbufpost
                             wndbufpost = wndref(wndbufpost)
                         wend
                      end if
                   next
    
                   @wndbufbaseaddr[wndbufpos] = @inbufcuraddr
    
                   for wndbufposc = 0 to 1
                       wndbufposm = wndbufpos - wndbufposc
                       if wndbufposm then
                          wndsch = @wndbufbaseaddr[wndbufposm] + @wndbufbaseaddr[wndbufposm + 1] * 256
                          wndref(wndbufposm) = wndent(wndsch)
                          wndent(wndsch) = wndbufposm
                       end if
                   next
                   incr wndbufpos
                   incr inbufcuraddr
                next
    
                if inbufcuraddr = inbufendaddr then exit loop
                if bitmaskadd = 128 then exit do
                bitmaskadd = bitmaskadd + bitmaskadd
             loop
             @outbufcuraddr = bitmask
             outbufcuraddr = outbufnewaddr
          loop
    
          outbuf = mkdwd$(inbufsize) + left$(outbuf, outbufcuraddr - strptr(outbuf))
    
       end function
    
       '=================================================================
    
       function decompress_lzss (inbuf as string, outbuf as string) as long
          dim inbufsize as local dword
          dim inbufcuraddr as local byte ptr
          dim inbufendaddr as local dword
    
          dim outbufsize as local dword
          dim outbufcuraddr as local byte ptr
          dim outbufendaddr as local dword
    
          dim wndbuf as local string
          dim wndbufstartaddr as local byte ptr
          dim wndbufcuraddr as local byte ptr
          dim wndbufendaddr as local dword
          dim wndbufsubstraddr as local byte ptr
    
          dim wndsuboff as local word
          dim wndsublen as local word
          dim wndbufposa as local word
    
          dim lenandoffset as local word
    
          dim bitmask    as local byte
          dim bitmaskadd as local byte
    
          inbufsize = len(inbuf)
          if inbufsize < 4 then function = -1: exit function
          inbufcuraddr = strptr(inbuf)
          inbufendaddr = inbufcuraddr + inbufsize
    
          outbufsize = cvdwd(inbuf, 1): inbufcuraddr = inbufcuraddr + 4
          outbuf = space$(outbufsize)
          outbufcuraddr = strptr(outbuf)
    
          wndbuf = space$(%lzss_window_size + %lzss_match_length_maximum - 1)
          wndbufstartaddr = strptr(wndbuf)
          wndbufendaddr = wndbufstartaddr + %lzss_window_size
          wndbufcuraddr = wndbufstartaddr
    
          do
             if inbufcuraddr >= inbufendaddr then exit do
             bitmask = @inbufcuraddr: incr inbufcuraddr
             bitmaskadd = 1
             do
                if inbufcuraddr >= inbufendaddr then exit do
                if (bitmask and bitmaskadd) then
                   if wndbufcuraddr >= wndbufendaddr then wndbufcuraddr = wndbufstartaddr
                   @outbufcuraddr = @inbufcuraddr
                   @wndbufcuraddr = @inbufcuraddr
                   incr wndbufcuraddr: incr inbufcuraddr: incr outbufcuraddr
                else
                   lenandoffset = @inbufcuraddr + 256 * @inbufcuraddr[1]
                   inbufcuraddr = inbufcuraddr + 2
    
                   wndsuboff = lenandoffset and &hfff
                   shift right lenandoffset, 12
                   wndsublen = lenandoffset + %lzss_match_length_minimum
    
                   wndbufsubstraddr = wndbufstartaddr + wndsuboff
                   for wndbufposa = 1 to wndsublen
                      @outbufcuraddr = @wndbufsubstraddr
                      incr wndbufsubstraddr: incr outbufcuraddr
                   next
    
                   wndbufsubstraddr = outbufcuraddr - wndsublen
                   for wndbufposa = 1 to wndsublen
                      if wndbufcuraddr >= wndbufendaddr then wndbufcuraddr = wndbufstartaddr
                      @wndbufcuraddr = @wndbufsubstraddr
                      incr wndbufcuraddr
                      incr wndbufsubstraddr
                   next
                end if
    
                if bitmaskadd = 128 then exit do else bitmaskadd = bitmaskadd + bitmaskadd
             loop
          loop
    
    
       end function
    
       function pbmain
          local inbuf as string, inbuf2 as string, outbuf as string
          local f as long, t1 as single, t2 as single, t3 as single
          f = freefile: errclear: open $sourcefile for binary as #f
          if err = 0 then get$ #f, lof(f), inbuf
          close #f
          if err then msgbox "can't read the source file": exit function
    
          t1 = timer
          if compress_lzss (inbuf, outbuf) < 0 then msgbox "can't compress": exit function
          t2 = timer
          if decompress_lzss (outbuf, inbuf2)  then msgbox "can't decompress": exit function
          t3 = timer
    
          f = freefile: errclear: open $targetfile for output as #f
          if err = 0 then print #f, outbuf;
          close #f
          if err then msgbox "can't write the target file": exit function
    
          msgbox "compress: " + format$(1000# * (t2 - t1), "#") + " ms" + $crlf + _
                 "decompress: " + format$(1000# * (t3 - t2), "#") + " ms" + $crlf + _
                 "ratio: " + format$(len(inbuf)) + " -> " + format$(len(outbuf))
    
          if inbuf <> inbuf2 then msgbox "problems"
    
        end function

    [this message has been edited by semen matusovski (edited june 02, 2002).]

  • #2
    Thanks Semen,

    I have no idea, why Don named this algo "LZ77".
    Just for people who are wondering whether this is LZ77 or not....

    LZSS is a version of LZ77 developed by "storer" & "Szymanski" in '82. It improves on the original in three ways:

    1. It holds the look ahead buffer in a cicular queue
    2. It holds the dictionary in a binary tree
    3. It creates tokens in two fields instead of three.

    The deficiency of all LZ77 varients is that they make the assumption that patterns in the imput data occur close together, data streams that don't meet this req tend to compress poorly.

    To be honest this algo works very well in my opinion.

    (Hopefully that's enough useful info to warrent a mention in the source section

    ------------------
    Paul Dwyer
    Network Engineer
    Aussie in Tokyo

    Comment


    • #3
      Reminder: LZSS is a variant of LZ77. In the U.S., several patent-related issues affect use of LZ77 and its derivatives.


      ------------------
      -- gturgeon at compuserve dot com --

      Comment

      Working...
      X