    Regular Expressions

    The translator needs to handle Regular Expressions. Unfortunately, this is not something I do very well and the references I've been able to find don't actually give a whole lot of How-To information, at least not in terms that will sink into my brain these days.

    The problem: I'm constructing syntax expressions for VB6 and PBWin10. I've made a lot of progress, but neither language seems to use a standard format. I mean other than the charset "(){}[]|<>" and "...". The "<>" is problematic, but not too difficult to solve. The "..." varies appearing as "...." or ",...,..." with non-standard spacing. The varnames often are tailored to the specific term being defined even when the root classes would be "string$" or "long&" or similar. That's great for a help page, but it means several thousand manual edits to get it standardized.
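As a rough illustration of the "..." clean-up described above, here is a minimal sketch in Python (used only because it is easy to run; PowerBASIC's REGEXPR and the VBScript RegExp object use different pattern syntax). The function name `normalize_ellipsis` and the choice of a single spaced `...` as the standard form are assumptions made for the example, not anything taken from the VB6 or PBWin10 help files.

```python
import re

# One or more runs of two-or-more dots, each optionally preceded by a comma,
# with stray spaces and an optional trailing comma: "....", ",...,...", " ... ,"
_ELLIPSIS = re.compile(r"(?:\s*,?\s*\.{2,})+\s*,?\s*")

def normalize_ellipsis(template: str) -> str:
    """Collapse every ellipsis variant into one ' ... ' token (assumed standard form)."""
    return _ELLIPSIS.sub(" ... ", template).strip()

print(normalize_ellipsis("x ,...,... y"))
print(normalize_ellipsis("a, b...."))
```

A single dot (as in `obj.Method`) is left alone because the pattern requires at least two dots in a row.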

    Can anyone recommend a simple (or simple-minded) method/system?
    Do not go quiet into that good night,
    ... Rage, rage against the dark.

    #2
    >I've made a lot of progress, but neither language seems to use a standard format
    Standard? There are AT LEAST four "standard" variations of regular expressions I have encountered: Unix, Perl, Ultra-Edit and, yes, PowerBASIC.

    I would not be at all surprised if Microsoft/Visual BASIC had its own variation, too. Come to think of it, I think there IS a Microsoft Regular Expression API implemented via COM, unless that's part of either the C or VB runtimes. Try your COM browser for registered libraries.

    >Can anyone recommend a simple (or simple-minded) method/system.

    Identify source code with comment "no direct translation possible."
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    http://www.talsystems.com

      #3
      VB6 did not have regular expressions in-built as PowerBASIC does. As Michael pointed out they are handled by an external COM object.
      Specifically "Microsoft VBScript Regular Expressions 5.5" in "VBSCRIPT.DLL"

      So there are three options. The easiest is to reference the VBScript COM object, thus leaving the translation to that of translating the object code.

      The second is probably most favorable to PBers. Create a PB version of the VBScript RegEx COM object and use that. (You might be able to get someone to create this for you, or at least help, as it could be used as an example of both COM object creation and RegExpr use for the online help!)

      The third option is to reference the syntax for both VB/JavaScript Regex (http://msdn.microsoft.com/en-us/library/1400241x.aspx) and the PB RegExpr and translate between the 2 sub dialects.
      Kind regards,
      Neil

      http://www.BASICProgramming.info

        #4
        Stan,
        Reading the second paragraph of your post, you seem to be looking to use regular expressions as an aid in translation between VB6 and PB.

        You might want to try Regular Expression builders like http://www.regexr.com/ and http://txt2re.com/ and http://regex101.com/ which can help you create the patterns you want.

        Regarding your comment on variable names and root classes:
        Hungarian notation was popular with people wanting VB code (not necessarily the coders themselves) as it was reputed to aid in the readability and maintainability of that code.
        For example, integer variables were prefixed with "int", longs with "lng", and strings with "str". Some controls and objects were prefixed as well to identify them: textboxes with "txt", command buttons with "cmd", combo boxes with "cbo", etc.
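Screening for these prefixes can be sketched roughly as follows. This is Python for illustration only; the lookup tables contain just the six prefixes mentioned in this post, and the function name `classify` plus the rule that a prefix must be followed by a capital letter are assumptions for the example.

```python
import re

# Only the prefixes named in the post; a real code base would need a longer table.
VARIABLE_PREFIXES = {"int": "Integer", "lng": "Long", "str": "String"}
CONTROL_PREFIXES = {"txt": "TextBox", "cmd": "CommandButton", "cbo": "ComboBox"}

# A prefix counts only when a capital letter follows it (strName, not string).
_PREFIX = re.compile(r"^([a-z]{3})(?=[A-Z])")

def classify(name: str):
    """Return the VB type or control implied by a Hungarian prefix, else None."""
    m = _PREFIX.match(name)
    if not m:
        return None
    prefix = m.group(1)
    return VARIABLE_PREFIXES.get(prefix) or CONTROL_PREFIXES.get(prefix)

print(classify("strName"), classify("cboState"), classify("string"))
```

Names that return `None` would still need their type resolved some other way, for example from the declaration itself.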

        The prefixing was not necessary but some people (quite a lot) thought that it was nice to have.

        I would suggest leaving variable names alone in translation unless forced to change because of reserved word conflict.

        I think that the current standard for Regular Expressions is contained in the ECMA-262 (ECMAScript) standard, somewhere around section 15 (http://www.ecma-international.org/ecma-262/5.1/), which is supported in MS JScript and MS VBScript (Windows Script 5.5).
        Kind regards,
        Neil

        http://www.BASICProgramming.info

          #5
          Originally posted by Michael Mattias
          ...
          >Can anyone recommend a simple (or simple-minded) method/system.

          Identify source code with comment "no direct translation possible."
          Yeah, I knew I should have left that parenthetical out.
          Do not go quiet into that good night,
          ... Rage, rage against the dark.

            #6
            Neil,

            Thanks for the reminder on Hungarian. I can screen for that before calling a couple FUNCTIONS and save some real time in the processing. Plan to leave varNames alone as you suggested -- just need to get things like type, scope, and syntax.

            Thanks for the reference links. I'll be looking at them in the morning. Plan to check the "VBSCRIPT.DLL" via COM browser.

            Thank you,
            Stan
            Do not go quiet into that good night,
            ... Rage, rage against the dark.

              #7
              >So there are three options. The easiest is to reference the VBScript COM object, thus leaving the translation to that of translating the object code.
              That may be the best option, as patterns and masks used are different, as are capabilities.

              e.g., PB regular expressions do not support the "repeat" option. In most RE implementations you can specify "five numeric digits" as
              Code:
                mask  = "[0-9]{5}"
              but PB does not offer that option so you have to use
              Code:
                mask = "[0-9][0-9][0-9][0-9][0-9]"
              There may be additional "not supported by PB RE options" which do NOT translate so easily.
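The expansion above can be automated for the simplest case. In dialects that support counted repetition, "five digits" is usually written `[0-9]{5}`. The sketch below (Python, for illustration only) rewrites a bracketed character class followed by an exact count into repeated copies of the class. The function name `expand_counts` is hypothetical, and the sketch deliberately ignores ranges such as `{2,4}`, escaped brackets, and counts applied to anything other than a character class.

```python
import re

# A bracketed character class followed by an exact count, e.g. "[0-9]{5}".
_COUNTED_CLASS = re.compile(r"(\[[^\]]+\])\{(\d+)\}")

def expand_counts(mask: str) -> str:
    """Replace '[class]{n}' with n copies of '[class]'; leave everything else untouched."""
    return _COUNTED_CLASS.sub(lambda m: m.group(1) * int(m.group(2)), mask)

print(expand_counts("[0-9]{5}"))
print(expand_counts("[A-Z]{2}-[0-9]{3}"))
```

Open-ended forms such as `{2,}` have no fixed-length equivalent, which is one reason some masks will not translate this easily.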

              And I think tackling dynamic masks is going to be extremely difficult.

              MCM
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              http://www.talsystems.com

                #8
                Thank you everyone for the input. I have decided to forgo using Regular Expressions because of the combined problems of complexity and my own imminent learning curve. Instead, I'll be using a much simpler home-made system based on my own understanding of the concepts. The end result will probably not conform to any standard other than my own. I think the end result will be easier to follow for anyone who wants to add functionality to the Translator.
                Do not go quiet into that good night,
                ... Rage, rage against the dark.

                  #9
                  >The end result will probably not conform to any standard other than my own

                  Oh, good! Yet another regular expression "standard!"
                  Michael Mattias
                  Tal Systems (retired)
                  Port Washington WI USA
                  http://www.talsystems.com

                    #10
                    Originally posted by Michael Mattias
                    >The end result will probably not conform to any standard other than my own

                    Oh, good! Yet another regular expression "standard!"

                    I see that you're impressed, but seriously, why build something that can handle rocket science in real time when all we really need is a dependable way to read syntax templates for one obsolete programming language? Not looking for a Nobel Prize here; just want to get the job done.

                    I'm not planning to publish it anyway except as part of the open source for the translator. When we get around to other languages the people who do that are free to modify mine or write their own.

                    Wow! Just figured out what to call it --- FUNCTION SyntaxTemplateReader(VBterm AS STRING) AS STRING
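For readers wondering what the first step of such a reader might look like, here is one possible sketch of splitting a syntax template into tokens. It is not the Translator's actual SyntaxTemplateReader. It is written in Python for illustration, the name `tokenize_template` is hypothetical, and treating the comma and a single dot as tokens is an assumption added to the charset listed in the first post.

```python
import re

# Order matters: try a run of dots first, then one delimiter, then a word.
_TOKEN = re.compile(r"\.{2,}|[(){}\[\]|<>,.]|[^\s(){}\[\]|<>,.]+")

def tokenize_template(template: str) -> list:
    """Split a syntax template into delimiters, ellipses, and words; spaces are dropped."""
    return _TOKEN.findall(template)

print(tokenize_template("MsgBox(prompt[, buttons])"))
```

Once the template is a flat token list, optional parts (`[...]`) and alternatives (`|`) can be handled by ordinary bracket matching rather than by a pattern language.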
                    Last edited by StanHelton; 1 Nov 2014, 06:02 PM. Reason: Add comment
                    Do not go quiet into that good night,
                    ... Rage, rage against the dark.

                      #11
                      I found Jose's RegExp VB wrapper to work pretty well on a single line... it doesn't do well with multiline strings, though (any string that contains CRLF).
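One generic workaround for an engine that only behaves on single lines is to split the text at line endings first and match each line separately. The sketch below shows the idea in Python. It says nothing about how the wrapper itself works, and the helper name `match_lines` is hypothetical.

```python
import re

def match_lines(pattern: str, text: str) -> list:
    """Return (line_number, matched_text) pairs, searching one line at a time."""
    compiled = re.compile(pattern)
    hits = []
    # splitlines() treats CRLF, CR, and LF all as line endings.
    for number, line in enumerate(text.splitlines(), start=1):
        m = compiled.search(line)
        if m:
            hits.append((number, m.group(0)))
    return hits

print(match_lines(r"[0-9]+", "abc 12\r\nno digits\r\nx7"))
```

This only helps when no single match needs to span a line break.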

                      Even if you think you don’t love mathematics,
                      mathematics loves you. Don’t believe me?
                      Solve the following for “i”.
                      9x – 7i > 3(3x – 7u)
