Text from *.chm

  • Text from *.chm

    I need to programmatically access a *.chm file, pull out the constituent parts of each item in it, and use them as strings within my program. But I have developed a mental block and can't even work out where to start, to the point where, when I search the fora, I'm no longer sure what I'm looking at.
    Does anyone have anything that does anything along this line, or can they point me in the right direction?
    TIA
    Rod
    In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

  • #2
    There's no need to parse out text from a compiled CHM file.

    To access pieces of text from a CHM, you can pass the HH_DISPLAY_TOPIC or HH_HELP_CONTEXT command as the third parameter in your calls to HtmlHelp.
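For anyone following along, here is a minimal sketch of that kind of call, written in Python with ctypes rather than PowerBASIC. The command values are the ones from htmlhelp.h; the helper name `show_topic` and its arguments are made up for illustration. Note that this only displays a topic in the help viewer; it does not hand the text back to the caller.

```python
import sys

# Command values as defined in htmlhelp.h.
HH_DISPLAY_TOPIC = 0x0000
HH_HELP_CONTEXT = 0x000F


def show_topic(chm_path, topic):
    """Ask the HTML Help viewer to display one topic (hypothetical helper)."""
    if sys.platform != "win32":
        raise OSError("HTML Help is only available on Windows")
    import ctypes

    # HtmlHelpW is exported by hhctrl.ocx.
    html_help = ctypes.WinDLL("hhctrl.ocx").HtmlHelpW
    html_help.argtypes = [ctypes.c_void_p, ctypes.c_wchar_p,
                          ctypes.c_uint, ctypes.c_size_t]
    html_help.restype = ctypes.c_void_p
    # "file.chm::/topic.htm" selects a topic inside the compiled file.
    return html_help(None, f"{chm_path}::/{topic}", HH_DISPLAY_TOPIC, 0)
```

HH_HELP_CONTEXT works the same way, except that the last argument is a numeric context ID instead of 0.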

    (Surely you can't mean you want text from a CHM file not of your own making... but if you do I'd check the copyright and permitted uses statement of said file.)

    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com

    • #3
      But I do have a need to parse the text, and of a file not of my own making.
      I am confident that since this is an educational program, making use of a file already purchased by the user of my program to further their use of the Vendor's product that the copyright issue is moot. I will not be substantially changing the text, just displaying in a slightly different manner. Nor will I be charging for the program. No extra copies of the file are necessary either.
      I do appreciate the thought, for I hadn't considered that aspect.
      Rod
      In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

      • #4
        A "Google" for "file format CHM" turned up this, among others...

        http://www.fileinfo.net/extension/chm
        Michael Mattias
        Tal Systems (retired)
        Port Washington WI USA
        [email protected]
        http://www.talsystems.com

        • #5
          On a hunch I fired up the COM browser...
          Code:
          ' Generated by: PowerBASIC COM Browser v.2.00.0058
          ' DateTime    : 12/7/2008 at 9:58 AM
          ' ------------------------------------------------
          ' Library Name: HHCTRLLib
          ' Library File: C:\WINDOWS\system32\hhctrl.ocx
          ' Description : HHCtrl 4.0 Type Library
          ' GUID : {ADB880A2-D8FF-11CF-9377-00AA003B7A11}
          ' LCID : 0
          ' Version : 4.0
          ...
          ..  {lots more}
          I'd think there might be something in there you can use.....

          MCM
          Michael Mattias
          Tal Systems (retired)
          Port Washington WI USA
          [email protected]
          http://www.talsystems.com

          • #6
            Yeah, there might be something in there that I can use, will have to use, and I just can't tell exactly how to go about it.

            Multi-spur sidetracking while experimenting.
            Good thing the completion date wasn't etched in stone.

            Many^many thanks! (That may look like sarcasm, but it's not.)
            Rod
            In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

            • #7
              I don't know how to use all that stuff either; however, when I first started playing around with ADO, I noticed the COM browser found a LOT of 'registered libraries' on my system (611).

              So I wuz thinkin'.... maybe there is some useful stuff in there and maybe I should try to familiarize myself with it.

              Far as I can tell, calling a property or method is pretty much the same as calling a sub or function, except for the call syntax.

              (The "isolation" offered by classes and instances is not much benefit to me... since I've always coded pretty well in that regard anyway).
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              [email protected]
              http://www.talsystems.com

              • #8
                >On a hunch I fired up the COM browser...
                Don't bother, that is a dead end.

                You will need to understand structured storage and how to use the methods of the IStorage interface.
                Do a search on "CHM Specification" or "ITSF CHM", and if after reading all that you are still willing
                to proceed, I can get you started.
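To give a feel for what that reading turns up, here is a small Python sketch that checks the "ITSF" signature and unpacks the start of the file header. The field layout is an assumption taken from the unofficial, reverse-engineered descriptions of the format (Microsoft has not published one), and the function name `parse_itsf_header` is made up for illustration. It only reads the header; it does not walk the directory or decompress any content.

```python
import struct
import uuid

# Assumed layout of the fixed part of the header (little-endian, 56 bytes):
# signature, version, header length, unknown, timestamp, language ID, 2 GUIDs.
ITSF_FIXED = struct.Struct("<4sIIIII16s16s")
# Assumed header section table: two (offset, length) pairs of 64-bit values.
ITSF_SECTIONS = struct.Struct("<QQQQ")


def parse_itsf_header(data):
    """Return a dict describing the start of a CHM file (hypothetical helper)."""
    needed = ITSF_FIXED.size + ITSF_SECTIONS.size
    if len(data) < needed:
        raise ValueError("too short to hold an ITSF header")
    (sig, version, header_len, _unknown, timestamp,
     lang_id, guid1, guid2) = ITSF_FIXED.unpack_from(data, 0)
    if sig != b"ITSF":
        raise ValueError("missing ITSF signature; probably not a CHM file")
    off0, len0, off1, len1 = ITSF_SECTIONS.unpack_from(data, ITSF_FIXED.size)
    return {
        "version": version,
        "header_length": header_len,
        "timestamp": timestamp,
        "language_id": lang_id,
        "guids": (uuid.UUID(bytes_le=guid1), uuid.UUID(bytes_le=guid2)),
        "sections": [(off0, len0), (off1, len1)],
    }
```

Typical use would be `parse_itsf_header(open(path, "rb").read(96))`, with `path` pointing at whatever CHM file is being inspected.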
                Dominic Mitchell
                Phoenix Visual Designer
                http://www.phnxthunder.com

                • #9
                  742 on my machine. Over 70 from HP alone, and Roxio has a large chunk too.

                  Thanks Dominic, I'll go looking.
                  Rod
                  In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

                  • #10
                    google chm decompiler

                    James

                    • #11
                      >Over 70 [registered COM libraries] from HP alone

                      I have about 150 from HP... and all I did was install two printers. However, you can't install an HP printer without installing "Manage my photo album" and "Do image editing" and "Drive a submarine under the North Pole with your hands tied behind your back (blindfolded)," too.

                      Sheesh, whatever happened to "here's the floppy disk containing your new printer driver?"

                      MCM
                      Michael Mattias
                      Tal Systems (retired)
                      Port Washington WI USA
                      [email protected]
                      http://www.talsystems.com

                      • #12
                        The whole url can be used with MSIE and MSIE (afaik) has an option to save as text:

                        So I placed:
                        mk:@MSITStore:C:\Program%20Files\PwrDev2\PwrDev.chm::/calls/VD_INI_ByteToHex.htm

                        into MSIE, used Save As .. txt, and it worked fine.
                        I think you can program that behaviour ??
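If the URL has to be built in code, the pattern in the example above is simple enough to sketch. This Python helper (the name `msitstore_url` is made up) just glues the pieces together and percent-encodes spaces; it assumes nothing beyond the shape of the URL shown in this post.

```python
from urllib.parse import quote


def msitstore_url(chm_path, topic):
    """Build an mk:@MSITStore URL for one topic inside a CHM file."""
    # Keep drive colons and path separators; encode spaces and the like.
    encoded_path = quote(chm_path, safe=":\\/")
    encoded_topic = quote(topic.lstrip("/"), safe="/")
    return "mk:@MSITStore:" + encoded_path + "::/" + encoded_topic


print(msitstore_url(r"C:\Program Files\PwrDev2\PwrDev.chm",
                    "calls/VD_INI_ByteToHex.htm"))
```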
                        hellobasic

                        • #13
                          Oh btw, the decompiler way may also work for you.
                          The HTML Help compiler can decompile these files for you.
                          Then you'll get htm files which may be easier for you to process with MSIE > save as..
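Once the topics are plain .htm files, pulling the text out does not need MSIE at all. As a rough sketch, assuming the decompiled topics are ordinary HTML, Python's standard html.parser can collect the visible text. The class and function names here are made up for illustration.

```python
from html.parser import HTMLParser


class TopicText(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-blank text outside script/style elements.
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def topic_to_text(html):
    """Return the text of one HTML topic, one text run per line."""
    parser = TopicText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Calling `topic_to_text(open(name, encoding="latin-1").read())` on each decompiled file would give strings a program can work with; the encoding is a guess and may need adjusting per file.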
                          hellobasic

                          • #14
                            >I have about 150 from HP... and all I did was install two printers. However, you can't install an HP printer without installing "Manage my photo album" and "Do image editing" and "Drive a submarine under the North Pole with your hands tied behind your back (blindfolded)," too.
                            >
                            >Sheesh, whatever happened to "here's the floppy disk containing your new printer driver?"

                            Perhaps it's time to coin a new term, somewhat akin to bloatware. My suggestion is "COM.BLOAT"

                            I'll give each of these scenarios/means a looksee, thank you very much for the suggestions, folks.

                            When I have found success, I'll report it.
                            Rod
                            In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

                            • #15
                              Or "COM.STIPATION"
                              Rgds, Dave

                              • #16
                                Well Dominic, it looks like you win.

                                I tried a decompiler, as jcfuller suggested. It does a job of sorts and gives me a parsable text file, but it may also infringe on copyrights, as Michael pointed out. It means creating a *.txt file out of the gist of a *.chm file and having the program I envision access that *.txt file. This is a complexity I think I should avoid.

                                The same pretty much applies to Edwin's suggestion, although I kept getting errors when attempting his method. I searched for solutions to the errors and I didn't find one that was obviously applicable.

                                First a couple of links:

                                http://com.it-berater.org/COM/struct...s/IStorage.htm
                                http://com.it-berater.org/COM/struct...es/IStream.htm

                                So now I got a lotta learnin to do. I suppose it's my fault that as someone that first learned to program in the 80's that I didn't pay particular attention to some of the insidious changes that were creeping into, or perhaps growing predominant in creating a help file.

                                Mumble....grumble...fumble...tumble...MUMBLE.
                                Last edited by Rodney Hicks; 10 Dec 2008, 04:56 AM. Reason: speilling
                                Rod
                                In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

                                • #17
                                  This from the Wikipedia on CHM:

                                  In 2002, Microsoft announced some security risks associated with the .CHM format, as well as some security bulletins and patches.[1] They have since announced their intentions not to develop the .CHM format further, and will be moving to a new generation of Windows Help called Microsoft Assistance Markup Language(MAML) in the Windows Vista operating system.(bracketed mine)
                                  And from the MAML page:
                                  The MAML authoring structure is divided into segments related to a type of content: conceptual, FAQ, glossary, procedure, reference, reusable content, task, troubleshooting, and tutorial.

                                  Three levels of transformation occur when a topic displays: structure, presentation, and rendering.
                                  So, if I go ahead and fiddle with the *.chm format (which is what is currently out there), how long before the *.chm format is no longer supported by software vendors? In other words, how much time do I have before I must do it all over again?

                                  Oh well. At least the principles/techniques I learn should last a week or two longer than the format.
                                  Rod
                                  In some future era, dark matter and dark energy will only be found in Astronomy's Dark Ages.

                                  • #18
                                    I used to work for BlueSky software (which later became eHelp), the makers of RoboHELP. I have been hearing promises of the CHM file format going away for over 10 years now. First it was going to be Help 2.0 and now they say it will be MAML. Eventually it will come, but we have been promised a new help format for so long now...
                                    Sincerely,

                                    Steve Rossell
                                    PowerBASIC Staff

                                    • #19
                                      Many times the CHM will include HTML files used to access its various parts. PB's help is that way (or at least you can get them). Therefore much, if not all, of the text can be found in these HTML files.
                                      Last edited by Fred Buffington; 10 Dec 2008, 05:00 PM.
                                      Client Writeup for the CPA

                                      buffs.proboards2.com

                                      Links Page

                                      • #20
                                        Content of chm files blocked by security patch

                                        The real problem with .chm files nowadays is that Microsoft updated XP with a security patch that locks the content when the HTML help file has been imported from a remote computer. Is this also the case when it is part of a downloaded or e-mailed setup package? Unfortunately, I could not test that.

                                        It's no real headache to unlock it via the file's properties pane (right mouse button), but the real pain is, of course, that the users of our brilliant applications are spoiled because everything always used to work flawlessly. They simply don't want to spend their costly time finding out how to unlock the content of just a simple help file.

                                        The screenshot attached is in Dutch. The left pane (year numbers in this case) is correctly displayed, but the right pane says: "Navigation to the webpage cancelled - Possible actions: re-supply the url". This action is, imo, the most stupid Microsoft (i.e. IE7) ever recommended. Supply an internet address? How? We try to cope with a compiled HTML-file here, which IE7 obviously does not recognize as such. Instead it "thinks" that a living human typed an incorrect url.

                                        And what's more: This is a file I created myself and then e-mailed it to myself, so it does not even come from a remote computer. It only made a short voyage along the internet.
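A hedged sketch for anyone who wants to do the unlocking from code rather than through the properties pane. As far as I know, the lock is an NTFS alternate data stream named Zone.Identifier attached to the file, and the Unblock button simply deletes it. That detail is an assumption on my part, not something tested for this post, and the function names are made up. It only means anything on Windows with NTFS.

```python
import os


def zone_stream_path(chm_path):
    """Name of the alternate data stream assumed to hold the 'blocked' mark."""
    return chm_path + ":Zone.Identifier"


def unblock(chm_path):
    """Delete the Zone.Identifier stream; return True if one was removed."""
    try:
        os.remove(zone_stream_path(chm_path))
    except FileNotFoundError:
        # No such stream (or no such file): nothing to unblock.
        return False
    return True
```

An installer could call `unblock()` on its own help file after copying it, so the user never sees the cancelled-navigation page.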
                                        Attached Files
                                        Last edited by Egbert Zijlema; 10 Dec 2008, 05:39 PM. Reason: In total 4 edits for additions and corrections

                                        Egbert Zijlema, journalist and programmer (zijlema at basicguru dot eu)
                                        http://zijlema.basicguru.eu
                                        *** Opinions expressed here are not necessarily untrue ***
