Small tip regarding html to text conversions

    A while ago I wrote a tool that enumerates the html/aspx files of a website and parses them for keywords.
    I used a web browser control (ATL) to load these HTML pages via the local web server.
    I used document.innertext to obtain the text and then parsed the words.

    Today I was told to skip all anchors (links). To solve that without parsing the whole HTML myself, outside the control, I read document.innerhtml, removed the anchors with some InStr-based string code, and simply assigned the result back to document.innerhtml.

    After that I read document.innertext again, and it gave me the updated text just fine.
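For anyone who wants to see the anchor-removal step spelled out, here is a minimal sketch of that kind of InStr-style search, written in Python only for illustration. The function name `strip_anchors` is made up for this example and is not the code from my tool; it also assumes anchors are not nested and that lowercasing the markup does not change its length.

```python
def strip_anchors(html: str) -> str:
    """Remove every <a ...>...</a> element, including its link text."""
    # Lowercased copy used only for searching, so <A HREF=...> is found too.
    # Assumption: lowercasing keeps the length unchanged (true for ASCII markup).
    lower = html.lower()
    out = []
    pos = 0
    while True:
        start = lower.find("<a", pos)
        if start == -1:
            break
        # Skip tags that merely begin with "a", such as <abbr> or <address>.
        nxt = lower[start + 2:start + 3]
        if nxt not in (">", " ", "\t", "\r", "\n"):
            out.append(html[pos:start + 2])
            pos = start + 2
            continue
        end = lower.find("</a>", start)
        if end == -1:
            break  # unclosed anchor: leave the rest of the markup untouched
        out.append(html[pos:start])  # keep everything before the anchor
        pos = end + len("</a>")      # continue after the closing tag
    out.append(html[pos:])
    return "".join(out)
```

The returned string is what would be assigned back to document.innerhtml before reading document.innertext again.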

    Just a simple tip for using the web control this way.
    The same trick can be used to load any other HTML you have and let the control parse it, instead of doing it in code.

    Btw, if you do have simple code for this, I'm still interested.
    I once saw a regular expression for that, but I could not find it again.
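As a rough idea of what such a regular expression could look like, here is a small Python sketch. It is not the expression I saw back then, and the names `ANCHOR_RE`, `TAG_RE` and `text_without_links` are made up for this example. Regular expressions only cope with reasonably well-formed HTML (no nested anchors, no `>` inside attribute values, no script or style blocks), so treat it as a quick-and-dirty alternative to the control, not a replacement for a real parser.

```python
import html
import re

# An <a ...>...</a> element; \b keeps <abbr> and <address> from matching.
ANCHOR_RE = re.compile(r"<a\b[^>]*>.*?</a\s*>", re.IGNORECASE | re.DOTALL)
# Any remaining tag, as a crude stand-in for what innertext does.
TAG_RE = re.compile(r"<[^>]+>")


def text_without_links(markup: str) -> str:
    """Drop anchors (with their link text), strip tags, return plain text."""
    no_links = ANCHOR_RE.sub("", markup)
    no_tags = TAG_RE.sub(" ", no_links)
    # Decode entities such as &amp; and collapse runs of whitespace.
    return " ".join(html.unescape(no_tags).split())
```

For example, `text_without_links('<p>Hello <a href="x">link</a> world &amp; more</p>')` gives `'Hello world & more'`.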