The vulnerability of scripted compilers ...

  • The vulnerability of scripted compilers ...

    While PowerBasic compiles PB source code into true 32-bit machine code, not all compilers do. Many less advanced compilers can still create executables, but instead of translating the source to true machine code as PB does, they simply embed the source within the executable, usually compressed or encrypted to protect the source code from casual viewing. The source is then decrypted at runtime, and the script engine built into the EXE runs it. However, virtually all compilers of this kind are vulnerable to having their source code revealed.

    Here is a simple program to extract the original source code from any OBasic compiled executable (obasic.com), and as you can see it really is very simple. The same principle can be used on these sorts of 'scripted compilers' from all sorts of languages.

    Code:
    #COMPILE EXE 'PBCC
    #INCLUDE "win32api.inc"
     
    SUB KillProcess(BYVAL ProcID AS LONG)
     LOCAL hProc AS LONG
     hProc = OpenProcess(BYVAL %PROCESS_TERMINATE, BYVAL 1, BYVAL ProcID)
     '// Only terminate and close if the handle was actually opened
     IF hProc <> %NULL THEN
         TerminateProcess BYVAL hProc, BYVAL %NULL
         CloseHandle hProc
     END IF
    END SUB
     
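    SUB ObasicDebug(szEXE AS ASCIIZ)
     LOCAL DE AS DEBUG_EVENT, SI AS STARTUPINFO, PI AS PROCESS_INFORMATION
     LOCAL sTemp AS STRING, hFile AS DWORD, sBuf AS STRING, szCmdLine AS ASCIIZ * 1, lRes AS DWORD, szPtr AS ASCIIZ PTR
     '// Temp folder, without a trailing backslash
     sTemp = ENVIRON$("TEMP"): IF RIGHT$(sTemp,1) = "\" THEN sTemp = LEFT$(sTemp, LEN(sTemp) - 1)
     '// Read the whole executable into memory
     hFile = FREEFILE
     OPEN szExe FOR BINARY ACCESS READ LOCK SHARED AS #hFile
      GET$ #hFile, LOF(hFile), sBuf
     CLOSE #hFile
     IF INSTR(1, sBuf, "OBASIC") = 0 THEN
         STDOUT "Invalid OBasic executable"
         EXIT SUB
     END IF
     '// The source filename is read from a fixed offset after the last "OBASIC" marker
     szPtr = STRPTR(sBuf) + INSTR(-1, sBuf, "OBASIC") + 21
     STDOUT "Source Filename=" & @szPtr
     '// Launch the target as a debuggee so we regain control each time it loads a DLL.
     '// Note: %PROCESS_VM_READ is a process access right, not a creation flag; kept as in the original.
     SI.cb = SIZEOF(SI)
     IF CreateProcess(szEXE, szCmdLine, BYVAL %NULL, BYVAL %NULL, 0, %DEBUG_PROCESS OR %NORMAL_PRIORITY_CLASS OR %PROCESS_VM_READ, BYVAL %NULL, BYVAL %NULL, SI, PI) = 0 THEN
         STDOUT "Unable to start " & szEXE
         EXIT SUB
     END IF
     DO
      lRes = WaitForDebugEvent(DE, %INFINITE)
      IF lRes <> 0 THEN
         IF DE.dwDebugEventCode = %LOAD_DLL_DEBUG_EVENT THEN
            '// Has the target written its decrypted source to the temp folder yet?
            IF DIR$(sTemp & "\" & @szPtr,39) <> "" THEN
                FILECOPY sTemp & "\" & @szPtr, szExe & "-source.txt"
                STDOUT "Saved to " & szExe & "-source.txt"
                KillProcess BYVAL PI.dwProcessID
                EXIT SUB
            END IF
         END IF
         ContinueDebugEvent DE.dwProcessID, DE.dwThreadID, %DBG_CONTINUE
      ELSE
         '// No more debug events (e.g. the target has exited), so stop instead of spinning
         STDOUT "Source file was not found"
         EXIT DO
      END IF
     LOOP
    END SUB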
     
    FUNCTION PBMAIN() AS LONG
    LOCAL szEXE AS ASCIIZ * %MAX_PATH
    szEXE = COMMAND$   '// szEXE = "E:\obasic\test.exe"
    IF szEXE = "" THEN
        STDOUT "USAGE: obas2src <target.exe>":   GOTO TheEnd
    END IF
    IF DIR$(szEXE, 39) = "" THEN
        STDOUT "File not found - " & szEXE:   GOTO TheEnd
    END IF
    ObasicDebug szEXE
    TheEnd:
     STDOUT "Press any key to continue . . .";
     WAITKEY$
    END FUNCTION

  • #2
    [...] usually compressed or encrypted to protect the source code from casual viewing.
    My impression is that it's not done to obfuscate the source code, but to optimize the resulting executable. The source is precompiled into tokens that are easier for the script engine to interpret at runtime than the plain source code.



    • #3
      Knuth, that depends on the individual compiler - some use tokenization, others just include the script 'as is'. In the case of OBasic the full source code can be retrieved exactly as it was in the source file - there are no tokens. The same was true of older versions of Pyxia's IBasic until I brought it to the author's attention, although IBasic has since been discontinued. In both cases the source code is essentially just encrypted to hide it from casual viewing, but in the case of OBasic, for example, the executable actually writes the (decrypted) source code to a file on disk (at which point my demo program scoops it up), runs it like a script, then deletes the file.

      The main point is that if the compiler is simply including its script/source code 'as is' in the executable, then it doesn't matter whether it's compressed or encrypted: the executable has to decompress/decrypt that script back to its original form before it can be run, at which time it is vulnerable to being dumped.
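
The principle can be shown with a toy model. This is only a sketch, written in Python rather than PowerBASIC so it can be run anywhere, and it does not reproduce OBasic's actual format: the single-byte XOR key, the names `bundle` and `run_bundle`, and the `on_plaintext` hook are all assumptions made up for illustration. The "compiler" hides the script in a payload, and the "runtime" has to turn that payload back into the original text before it can interpret it, so anything watching that step sees the source.

```python
KEY = 0x5A  # hypothetical single-byte key, for illustration only


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def bundle(source: str) -> bytes:
    """Model of the 'compiler': hide the script inside the EXE's payload."""
    return xor_bytes(source.encode("utf-8"))


def run_bundle(payload: bytes, on_plaintext=None) -> dict:
    """Model of the runtime: recover the script, then interpret it."""
    source = xor_bytes(payload).decode("utf-8")
    if on_plaintext is not None:
        # Whatever observes this step (a debugger, a temp-file watcher)
        # receives the script exactly as the author wrote it.
        on_plaintext(source)
    namespace = {}
    exec(source, namespace)  # the 'script engine'
    return namespace


original = "answer = 6 * 7\n"
payload = bundle(original)

captured = []
result = run_bundle(payload, on_plaintext=captured.append)
```

After this runs, `captured[0]` is identical to `original` even though the text never appears in `payload`. A stronger cipher would not change the outcome, because the runtime still needs the plain text in order to run it.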
      Last edited by Wayne Diamond; 17 Jun 2008, 06:51 AM.


      • #4
        Those are not "compilers" at all. They are parsers/interpreters that produce a bundled executable containing the source code of the program (obfuscated or not), the full parser/interpreter engine and all needed external libs.

        It is a completely different story.
        thinBasic programming language
        Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB



        • #5
          Hey Eros

          By "compiler" I simply (loosely) meant something which creates a .EXE file that can have its code executed, either by translation (i.e. scripts) or otherwise. You say that programs such as OBasic aren't true compilers, and I'd have to agree with you that they aren't (they don't compile the source to true 16/32/64-bit code). However, they can still create .EXEs, which is why I made this thread - to show that these "pseudo-compilers" are very vulnerable to source code extraction/decompilation.

          The example is simply to show that such programs are often vulnerable to having their source code extracted, so let's not lose sleep over the definition of a "compiler".
          Last edited by Wayne Diamond; 18 Jun 2008, 08:31 AM.


          • #6
            Yes, I know and you are right to mention those problems.

            I think the bad side is only when the authors of those engines say their language can be "compiled" instead of being honest and saying that they create a sort of box containing the engine and the script, which are handled as objects when the produced EXE is executed. I do not know OBasic, so I don't know whether their help says "compile" or something different.

            For example, thinBasic (like many other script engines) can also create executables, but in this case we do not say we "compile"; we say we create a "bundled EXE" and describe how the EXE is built and what will happen when that EXE is executed. We also obfuscate the script, but that is another story.

            Like you, I've seen many BASIC dialects (and other languages) claiming to produce compiled code while in reality they were producing a zipped executable. And this is not good for credibility. In some cases I've also seen forum discussions where people compared the execution speed of compiled code with that of a bundled EXE because they didn't recognise the difference. Both were EXEs in the end.
            thinBasic programming language
            Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB
