Garbled Files

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Garbled Files

    I've noticed that many files, if you open them with a text editor, are garbled with nonsense characters (nonsense to me, anyway). I am writing a simple program that logs user-entered data to a file sequentially. How do I make my file look garbled like that, so users can't read it in a text editor?

  • #2
    The files you're talking about are probably not intentionally garbled; they're just not text files. What file extensions are you referring to?

    For your log files, how hard do you want to make it for someone else to read them?

    You could rotate each character one bit left or right to make it look like rubbish in a text editor. That is not secure, because it is easy to undo once the trick is figured out, but it keeps casual viewers out.

    Or do you want it very secure, with encryption? Or something in between?
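
The bit-rotation idea can be sketched roughly as follows. This is only an illustration in Python (not PowerBASIC), and the function names `rotate_left`, `rotate_right`, `garble` and `ungarble` are made up for the example. It assumes each character is a single 8-bit byte.

```python
def rotate_left(byte: int, bits: int = 1) -> int:
    """Rotate an 8-bit value left; bits shifted out on the left wrap around."""
    bits %= 8
    return ((byte << bits) | (byte >> (8 - bits))) & 0xFF


def rotate_right(byte: int, bits: int = 1) -> int:
    """Rotate an 8-bit value right; this undoes rotate_left."""
    bits %= 8
    return ((byte >> bits) | (byte << (8 - bits))) & 0xFF


def garble(data: bytes) -> bytes:
    """Rotate every byte one bit to the left."""
    return bytes(rotate_left(b) for b in data)


def ungarble(data: bytes) -> bytes:
    """Rotate every byte one bit to the right, restoring the original."""
    return bytes(rotate_right(b) for b in data)


message = b"Joel's log entry"
scrambled = garble(message)      # looks like rubbish in a text editor
restored = ungarble(scrambled)   # identical to message again
```

Anyone who guesses the trick can reverse it immediately, so this only hides the text from casual viewers.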

    Cheers,
    Dale


    • #3
      Well, if you open the exe file of any program in a text editor, it's all garbled. How could I write a basic string to a file so that it looks like that?


      • #4
        Here is one way to do it:

        Code:
        #COMPILE EXE
        #DIM ALL
        
        FUNCTION PBMAIN () AS LONG
        
            LOCAL keyWord AS SINGLE
            LOCAL ii AS LONG
            LOCAL dataIwantWritten, garbledData AS STRING
            
        '***********************************************************************************************************
                                     'this section will garble your data...
            keyWord = 3.49122e-22    '<< you choose any SINGLE type number here. You could even prompt for it as a sort of password.
            RANDOMIZE keyWord
            OPEN "c:\garbledData01.dat" FOR BINARY AS #1
        
            dataIwantWritten = REPEAT$(100, "JoelBoyer, 19900 Shadylane, Mytown, BC 103-221 ph# 902-339-2354" & $CRLF)
        
            garbledData = SPACE$(LEN(dataIwantWritten))
        
            FOR ii = 1 TO LEN(dataIwantWritten)
               ASC(garbledData, ii) = ASC(dataIwantWritten, ii) XOR RND(0, 255) '<< this garbles each character of your text
            NEXT
            PUT #1, , garbledData    '<< write it to a file so you can verify it is garbled
            CLOSE
            ? "Data has been garbled and stored in c:\garbledData01.dat"
            
        '***********************************************************************************************************
                                     'now to decode your garbled data, reverse the above process...
            RANDOMIZE keyWord        'be sure to use same keyword as before
            OPEN "c:\garbledData01.dat" FOR BINARY AS #1
            OPEN "c:\unGarbledData01.txt" FOR BINARY AS #2
            
            GET$ #1, LOF(#1), garbledData
            
            dataIwantWritten = SPACE$(LEN(garbledData))
            
            FOR ii = 1 TO LEN(garbledData)
               ASC(dataIwantWritten, ii) = ASC(garbledData, ii) XOR RND(0, 255) '<< this un-garbles each character of your text
            NEXT
            PUT #2, , dataIwantWritten    '<< write it to a file so you can verify it is un-garbled
            CLOSE
            ? "Data has been un-garbled and stored in c:\unGarbledData01.txt"
            
        END FUNCTION
        Last edited by John Gleason; 29 May 2008, 04:34 PM. Reason: reduced keyWord value to 6 signif. digits & added expon.
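
For readers who don't have PowerBASIC handy, the same idea (XOR each byte with a pseudo-random byte, then repeat with the same seed to undo it) can be sketched in Python. This is an assumption-laden illustration, not a port: Python's `random.Random` produces a different byte sequence than PowerBASIC's `RND`, and the helper name `xor_with_seeded_stream` is invented for the example.

```python
import random


def xor_with_seeded_stream(data: bytes, seed: float) -> bytes:
    """XOR every byte with the next byte of a seeded pseudo-random stream.

    Re-seeding with the same value replays the same stream, so calling
    this twice with the same seed returns the original data.
    """
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)


record = b"JoelBoyer, 19900 Shadylane, Mytown, BC 103-221 ph# 902-339-2354\r\n"
plain = record * 100
key_word = 3.49122e-22  # same role as keyWord in the PowerBASIC code

garbled = xor_with_seeded_stream(plain, key_word)      # unreadable bytes
restored = xor_with_seeded_stream(garbled, key_word)   # readable again
```

As in the original, everything depends on using exactly the same seed for both passes.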


        • #5
          Awesome, that's exactly what I was looking for. Thanks a lot!


          • #6
            Most numbers written to a file from a UDT will look 'garbled'. This is because the number is written in its 'compressed' (binary) form rather than as text.

            That is, for example, a long integer will be written as 4 bytes.
            The number 1, for example, would be chr$(1)+chr$(0)+chr$(0)+chr$(0)

            Type Mytype
            mynum as long
            end type

            dim ptype as mytype
            num&=1
            ptype.mynum=num&
            print #1,ptype

            resulting in
            chr$(1)+chr$(0)+chr$(0)+chr$(0)+chr$(13)+chr$(10)

            In other words, it's the same result as one of the MKx$ functions, such as MKL$ for a long:
            print #1,MKL$(num&)
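
The 4-byte layout described above can be checked outside PowerBASIC as well. Here is a small Python sketch, assuming a 32-bit signed integer stored low byte first (little-endian), which is what the chr$(1)+chr$(0)+chr$(0)+chr$(0) example implies:

```python
import struct

# "<l" means: little-endian, 32-bit signed integer.
packed = struct.pack("<l", 1)

# The four raw bytes that would end up in the file.
raw_values = list(packed)

# Reading the bytes back gives the original number.
unpacked = struct.unpack("<l", packed)[0]
```

Opened in a text editor, those four bytes show up as control characters, which is why the file looks garbled even though nothing was deliberately scrambled.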
            Last edited by Fred Buffington; 1 Jun 2008, 06:57 PM.


            • #7
              Joel, keep in mind that XOR'ing bytes isn't really encryption; it's just a weak cipher. There are many encryption algorithms here at the forums (and compression algorithms, which achieve a similar result to some extent), so I'd recommend you use one of those rather than any 'home-made' cipher!
              If you're worried about size or complexity then perhaps have a look at TEA (Tiny Encryption Algorithm), as that is probably more than strong enough for your needs yet still small and easy to implement.

              Usually when XORing is employed as a cipher the programmer just picks any one of the 256 available bytes, and uses that as the cipher key, thinking that "if it LOOKS encrypted enough then that's probably good enough".

              But a trained eye can often recognise weakly scrambled text, such as text XORed with a single byte, and then it's just a matter of trying all 256 possible keys, which takes only seconds. It is also very vulnerable to frequency analysis attacks, especially if the original plaintext is natural-language text.
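
To give a feel for how little work trying all 256 single-byte keys is, here is a rough Python sketch. The names `xor_single_byte`, `looks_like_text` and `brute_force` are invented for this example, and the "looks like text" test is a deliberately crude assumption (printable ASCII plus at least one space); a real attacker would use something smarter, such as letter frequencies.

```python
def xor_single_byte(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same one-byte key."""
    return bytes(b ^ key for b in data)


def looks_like_text(candidate: bytes) -> bool:
    """Crude heuristic: only printable ASCII (or tab/CR/LF) and a space."""
    printable = all(32 <= b < 127 or b in (9, 10, 13) for b in candidate)
    return printable and b" " in candidate


def brute_force(ciphertext: bytes) -> list:
    """Try all 256 keys and keep the ones whose output looks like text."""
    hits = []
    for key in range(256):
        guess = xor_single_byte(ciphertext, key)
        if looks_like_text(guess):
            hits.append((key, guess))
    return hits


secret_key = 0x5A  # known only to the "victim"
plaintext = b"Meet me at the usual place at noon."
ciphertext = xor_single_byte(plaintext, secret_key)

candidates = brute_force(ciphertext)  # short list that includes the real key
```

The heuristic may let a few wrong keys through, but the list is short enough to check by eye.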

              It's particularly vulnerable when the attacker is able to feed your algorithm custom input to analyse the output. For example, John Gleason's example uses random bytes so it is stronger than a single byte key, and the key is essentially of an unlimited length (or at least until the RNG cycles), but because the seed of the random number generator is static the output is predictable.

              For example ...
              Code:
              LOCAL keyword AS SINGLE, i AS DWORD
              keyWord = 3.49122e-22
              RANDOMIZE keyWord
              FOR i = 1 TO 5: STDOUT HEX$(RND(0, 255)) & " "; :NEXT
              Even though it's pseudo-random it will always produce the same output because of the static seed:
              B3 5E B2 D6 F5
              That is essentially your encryption key. (Shortened to 5 bytes for this example, but the size isn't relevant with regard to cracking it.)

              Now pretend for a moment that I'm the attacker. I don't know the above key - only you do. Also, I have no access to your source code, so all I can do is analyse the output of your encryption function, not the function itself.

              If I then feed your program "AAAAA" (A being ASCII 0x41), your program would encipher it like this:
              B3 xor 41 = F2
              5E xor 41 = 1F
              B2 xor 41 = F3
              D6 xor 41 = 97
              F5 xor 41 = B4

              So your program has now turned my 41 41 41 41 41 into F2 1F F3 97 B4.

              The key to an XOR is that if you know 2 of the values you can easily determine the 3rd. For example 2 xor 5 = 7, 2 xor 7 = 5, and 5 xor 7 = 2. And that is basically why XOR on its own provides no security, even if you use very large keys.

              So now to figure out what your key is I simply XOR the original plaintext together with your resulting ciphertext ...
              F2 xor 41 = B3
              1F xor 41 = 5E
              F3 xor 41 = B2
              97 xor 41 = D6
              B4 xor 41 = F5

              Game over. Without ever having even seen your encryption function I now have your key (or at least the first 5 bytes of it in this example - obviously it'd be just as easy to work out the rest, and programmatically too), and would be able to decrypt the entire file.
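
The attack just described can be replayed in a few lines. This is a Python sketch rather than PowerBASIC, reusing the five key bytes B3 5E B2 D6 F5 quoted in this post; `victim_encrypt` is a hypothetical stand-in for the program under attack, and the "HELLO" message is made up for the demonstration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


# The keystream from the post; the attacker is not supposed to know it.
keystream = bytes.fromhex("B35EB2D6F5")


def victim_encrypt(plaintext: bytes) -> bytes:
    """Hypothetical stand-in for the program being attacked."""
    return xor_bytes(plaintext, keystream)


# Step 1: the attacker feeds in a plaintext of their own choosing.
chosen = b"AAAAA"
ciphertext = victim_encrypt(chosen)        # F2 1F F3 97 B4

# Step 2: plaintext XOR ciphertext gives the keystream back.
recovered = xor_bytes(chosen, ciphertext)  # B3 5E B2 D6 F5

# Step 3: the recovered keystream decrypts any other message.
secret_ciphertext = victim_encrypt(b"HELLO")
decrypted = xor_bytes(secret_ciphertext, recovered)
```

Note that the attacker's code never touches `keystream` directly; it only uses the chosen input and the observed output.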
              Last edited by Wayne Diamond; 11 Jun 2008, 06:00 AM.

              • #8
                Originally posted by Wayne Diamond:
                For example ...
                Code:
                LOCAL keyword AS SINGLE, i AS DWORD
                keyWord = 3.49122e-22
                RANDOMIZE keyWord
                FOR i = 1 TO 5: STDOUT HEX$(RND(0, 255)) & " "; :NEXT
                Even though it's pseudo-random it will always produce the same output because of the static seed:
                B3 5E B2 D6 F5
                That is essentially your encryption key. (Shortened to 5 bytes for this example, but the size isn't relevant in regards to cracking it)

                Now pretend for a moment that I'm the attacker....
                Wayne, I have a question re. above: If you were to choose your own single keyword above from the ~2^32 possible, and prompt for it rather than coding it into the program, would that prevent the "immediate" cracking of it using your technique?

                I'm in no way representing my code above as secure--it was meant only to turn text to binary simply (garble it, so to speak), so it can't be directly edited/read in its file after user input. But out of curiosity, I generated an unknown random keyword for it and it took me 40 minutes to find it, whereas it looks like your technique might take what, maybe 40 microseconds? Can it be done that fast even if prompting for the keyWord?


                • #9
                  "Wayne, I have a question re. above: If you were to choose your own single keyword above from the ~2^32 possible, and prompt for it rather than coding it into the program, would that prevent the "immediate" cracking of it using your technique?"
                  It actually doesn't matter what key you use, whether it's a single byte or one of a seemingly infinite length, such as that produced by RANDOMIZE [seed]. Nor does it matter if you store the key in the program or not (there is no disassembly/debugging at all required in this attack). So no, prompting for a password/key won't add any security.

                  The only thing that matters is that if I can deduce two of the three values in your XOR calculation then I can immediately deduce the 3rd - your key.
                  In other words, the 'security' of XOR requires that the attacker only has one of the three values (the resulting ciphertext). If any two of the three values are known then it's game over, and if the attacker can send your algorithm his own input (that's one value) and get the resulting ciphertext (that's now two values) he can deduce the third.

                  Consider for example if MY secret key is the number 5. YOU then send a sequence of your own bytes as input to my algorithm (it doesn't matter what the sequence is as long as you know what it is). So for example, you might send 7, 7, 7, 7, 7 to my algorithm, in which case the output would be 2, 2, 2, 2, 2.

                  You then XOR 2 with 7, and voila ... you've got 5 - my secret key.

                  Multi-byte keys
                  ... it's exactly the same. Let's say my secret key is 5, 7, 2.
                  If you then send me custom input of say, 3, 3, 3, 3, 3, 3, 3, 3, 3, you'll get the following output: 6, 4, 1, 6, 4, 1, 6, 4, 1.
                  Usually you'd send it a longer input sequence than that, but you can see in this example that the custom input was still long enough to reveal that the cipher is repeating after every 3 bytes. In other words the key is essentially just 3 bytes.

                  So ...
                  6 xor 3 = 5
                  4 xor 3 = 7
                  1 xor 3 = 2
                  You've now got my key.
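
The multi-byte case above can be sketched in Python with the same numbers (key 5, 7, 2 and nine input bytes of 3). The helper names `xor_repeating` and `smallest_period` are invented for the example, and the period check is a simple assumption that works here because the probe is longer than the key.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that repeats when it runs out."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def smallest_period(seq: bytes) -> int:
    """Return the shortest length p such that seq repeats every p bytes."""
    for p in range(1, len(seq) + 1):
        if all(seq[i] == seq[i % p] for i in range(len(seq))):
            return p
    return len(seq)


secret_key = bytes([5, 7, 2])   # known only to the "victim"
probe = bytes([3]) * 9          # attacker's custom input: 3, 3, 3, ...

output = xor_repeating(probe, secret_key)   # 6, 4, 1, 6, 4, 1, 6, 4, 1

# XOR the known input with the output to expose the repeating key stream.
stream = bytes(o ^ p for o, p in zip(output, probe))

period = smallest_period(stream)   # how often the stream repeats
recovered_key = stream[:period]    # one full copy of the key
```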

                  Keys as big as or bigger than the plaintext/ciphertext
                  In the case of the RANDOMIZE [seed] function, the key it produces is of almost unlimited length (I'm not sure what its period is, but it should be quite large, so the key could be several terabytes in size).

                  Breaking these keys is actually no different from the above example of breaking a 3-byte key, BUT your custom input needs to be as long as the plaintext/ciphertext, although that's usually not a problem.

                  "But out of curiosity, I generated an unknown random keyword for it and it took me 40 minutes to find it, whereas it looks like your technique might take what, maybe 40 microseconds? Can it be done that fast even if prompting for the keyWord?"
                  As long as I can send your algorithm my own custom input and get the resulting ciphertext then it's usually just a matter of 1-2 minutes.

                  btw, this isn't exclusively an XOR issue ... consider if you use addition/subtraction instead, for example.
                  If your key is, say, 3, and I input 5, 5, 5, 5, 5, 5, the resulting ciphertext would be 8, 8, 8, 8, 8, 8. I can then simply reverse the cipher by subtracting 5 from 8 to get 3 - your key.
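
The addition example works out the same way in code. Here is a minimal Python sketch, assuming the cipher adds the key to each byte and wraps around at 256 (the wrap-around is my assumption; it isn't stated above). `add_cipher` is a made-up name.

```python
def add_cipher(data: bytes, key: int) -> bytes:
    """Add the key to every byte, wrapping around at 256."""
    return bytes((b + key) % 256 for b in data)


secret_key = 3                  # known only to the "victim"
probe = bytes([5] * 6)          # attacker's input: 5, 5, 5, 5, 5, 5

ciphertext = add_cipher(probe, secret_key)   # 8, 8, 8, 8, 8, 8

# Known input and observed output give the key by plain subtraction.
recovered_key = (ciphertext[0] - probe[0]) % 256
```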

                  XOR on its own CAN actually be secure but only if correctly implemented in the form of a One Time Pad, but even then the security of that still requires that the attacker only has 1 of the 3 values (the ciphertext). As soon as the attacker has any two of the three values it's game over.
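
As a rough illustration of the one-time-pad point, here is a Python sketch. It assumes the pad is truly random, exactly as long as the message, kept secret, and never reused; `secrets.token_bytes` is used here as the random source and `xor_bytes` is a made-up helper. The second half shows what leaks if the same pad is used twice.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings position by position."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"Joel's log entry"
pad = secrets.token_bytes(len(message))   # fresh random pad, used once

ciphertext = xor_bytes(message, pad)
restored = xor_bytes(ciphertext, pad)     # same pad undoes the XOR

# What goes wrong if the pad is reused for a second message:
other = b"Another message!"
ciphertext2 = xor_bytes(other, pad)

# The pad cancels out, leaving the XOR of the two plaintexts.
leak = xor_bytes(ciphertext, ciphertext2)
```

With `leak` in hand, knowing or guessing either message reveals the other, which is the same "two of the three values" problem described above.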
                  Last edited by Wayne Diamond; 16 Jun 2008, 03:33 AM.