Getting a Web Page into a variable

  • Getting a Web Page into a variable

    I'm using the example in the help file. However, I get a "file not found" error when trying to retrieve a web page. After much head scratching, I realized it is probably because my web server uses 'virtual servers', that is, multiple domains on a single IP address. I'm no HTML expert, but the server apparently decides which website to serve from the host name embedded in the request.

    Here's the code I'm using. Does anyone know how to retrieve an HTML file like this?

    Code:

    local buffer$, site$
    local entire_page$, htmlfile$, link$
    local pos&, length&

    site$ = "www.powerbasic.com"
    file$ = "http://-

  • #2
    Try this:

    Code:
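    ' Only needed if your include files do not already declare it
    DECLARE FUNCTION URLDownloadToFile LIB "urlmon.dll" ALIAS "URLDownloadToFileA" _
        (BYVAL pCaller AS LONG, szURL AS ASCIIZ, szFileName AS ASCIIZ, _
         BYVAL dwReserved AS LONG, BYVAL lpfnCB AS LONG) AS LONG

    ' Downloads sURL to a temporary file and returns its contents as a string
    FUNCTION HtmlWebGet( BYVAL sURL AS STRING ) AS STRING
        LOCAL h$,f$,p&
        LOCAL szF AS ASCIIZ * 9        ' "temp.txt" plus the terminating NUL
        LOCAL szU AS ASCIIZ * 255      ' longer URLs are truncated
        szF="temp.txt"
        REPLACE " " WITH "" IN sURL
        IF LCASE$(LEFT$(sURL,4))<>"http" THEN sURL = "http://" + sURL
        szU = TRIM$(sURL)
        ' A return value of 0 (S_OK) means the download succeeded
        IF URLDownloadToFile(BYVAL 0&, szU, szF, BYVAL 0&, BYVAL 0&) = 0 THEN
            f$ = szF
            p& = FREEFILE
            OPEN f$ FOR BINARY SHARED AS #p&
                GET$ #p&, LOF(#p&), h$
            CLOSE #p&
            KILL f$                    ' remove the temporary file
            FUNCTION = h$
        END IF
    END FUNCTION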
    Added: don’t remember where I got this
    Thanks to the original author.

    ------------------


    [This message has been edited by Stan Durham (edited March 13, 2006).]



    • #3
      That works! However, I'm nervous about the temp files it creates. Is there a way to do this without the Temp files?

      ------------------



      • #4
        Yes, using Microsoft WinHTTP Services. Check which version you have
        installed so that you use the appropriate ProgID. The ProgID for version 5.0 is
        "WinHttp.WinHttpRequest.5"; the code below uses the version 5.1 ProgID,
        "WinHttp.WinHttpRequest.5.1". You can't use a version-independent ProgID
        because versions 5.0 and 5.1 can be installed side by side.
        Code:
        #COMPILE EXE
        #DIM ALL
        
        FUNCTION PBMAIN () AS LONG
        
           LOCAL oWHttp AS DISPATCH
           LOCAL vMethod AS VARIANT
           LOCAL vUrl AS VARIANT
           LOCAL vResponseText AS VARIANT
           
           ' Create an instance of the HTTP service
           SET oWHttp = NEW DISPATCH IN "WinHttp.WinHttpRequest.5.1"  ' <-- change if needed
           IF ISFALSE ISOBJECT(oWHttp) THEN EXIT FUNCTION
        
           ' Open an HTTP connection to an HTTP resource
           vMethod = "GET"
           vUrl = "http://www.powerbasic.com/"
           OBJECT CALL oWHttp.Open(vMethod, vUrl)
        
           ' Send an HTTP request to the HTTP server
           OBJECT CALL oWHttp.Send
        
           ' Get the response entity body as a string
           OBJECT GET oWHttp.ResponseText TO vResponseText
        
           MSGBOX VARIANT$(vResponseText)
        
           IF ISOBJECT(oWHttp) THEN SET oWHttp = NOTHING
        
        END FUNCTION

        ------------------
        Website: http://com.it-berater.org
        SED Editor, TypeLib Browser, Wrappers for ADO, DAO, ODBC, OLE DB, SQL-DMO, WebBrowser Control, MSHTML, HTML Editing, CDOEX, MSXML, WMI, MSAGENT, Flash Player, Task Scheduler, Accessibility, Structured Storage, WinHTTP, Microsoft ActiveX Controls (Data Binding, ADODC, Flex Grid, Hierarchical Flex Grid, Masked Edit Control, DataList, DataCombo, MAPI, INET, MCI, Winsock, Common Dialog, MSChart, Outlook View Control), and Microsoft Scripting Components.
        Forum: http://www.jose.it-berater.org/smfforum/index.php



        • #5
          This will work. Just check the flags passed to InternetOpen, because I was doing something with PKI and SSL and never reset them.

          Code:
          '------------------------------------------------------------------------------------------
          Function CCSWebGet(sUrl As String, sRetBuffer As String)Export As Long
               Local lResult As Dword
               Local szBrowserHandle As Asciiz * 255
               Local hInternet As Dword
               Local hUrl As Dword
               Local szUrl As Asciiz * 2048
               Local szTempBuffer As Asciiz * 4096
               Local dwContext As Dword 'Pointer to proc
               Local szBuffer As Asciiz * 255
               Local ErrType  As Long
               Local ErMsg    As Asciiz * 255
          '     Local hData As Dword
          '     Local szHeaders As Asciiz * 512
          '     Local dwError As Dword
          
          'Need to insert registry writing for this key, DWORD value:
          'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_HTTP_USERNAME_PASSWORD_DISABLE
          'tngstatus.exe at 0
          
               '%INTERNET_OPTION_CLIENT_CERT_CONTEXT 'for PKI
               If IsTrue InternetAttemptConnect(ByVal %NULL) Then
                   MsgBox "Cannot connect to internet",%MB_ICONSTOP,"Error"
                   Exit Function
               End If
               Dim IBUFFER As Local INTERNET_BUFFERS
               Pc& = SetPriorityClass(GetCurrentProcess(), %HIGH_PRIORITY_CLASS)
               szBrowserHandle = "TNGStatus Webserver monitoring tool Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
               szUrl = sUrl
               hInternet = InternetOpen(szBrowserHandle, %INTERNET_OPEN_TYPE_DIRECT, ByVal %Null, ByVal %Null, %INTERNET_FLAG_ASYNC)
          '     hInternet = InternetConnect(hInternetOpen&, zDomain, %INTERNET_DEFAULT_HTTP_PORT, "","HTTP/1.0", %INTERNET_SERVICE_HTTP, 0, 0)
          
               lResult = InternetGetLastResponseInfo(dwError???, szBuffer, SizeOf(szBuffer))
               'hInternet is the handle for the connection if false get out:
               If IsFalse hInternet Then WEBERR
               'Good to here, hInternet is positive number
          
               If Left$(LCase$(szUrl),5)="https" Then
          '         InternetCanonicalizeUrl szUrl, szUrl, SizeOf(szUrl), %ICU_BROWSER_MODE
                   hUrl = InternetOpenUrl(hInternet, _
                                          szUrl, _
                                          ByVal 0&, _
                                          0, _
                                          %SECURITY_INTERNET_MASK,_
                                          dwContext)
                   ErrType = Err
               Else
                   hUrl = InternetOpenUrl(hInternet, _
                                          szUrl, _
                                          ByVal 0&, _
                                          0, _
                                          %INTERNET_FLAGS_MASK,_
                                          dwContext)
                   ErrType = Err
                   If IsFalse hUrl Then
          '              lResult= InternetErrorDlg(GetDesktopWindow(),_
          '                                        hInternet,_
          '                                        %ERROR_INTERNET_INVALID_CA or ERROR_INTERNET_SEC_CERT_CN_INVALID or , _
          '                                        %FLAGS_ERROR_UI_FILTER_FOR_ERRORS Or %FLAGS_ERROR_UI_FLAGS_GENERATE_DATA _
          '                                        Or %FLAGS_ERROR_UI_FLAGS_CHANGE_OPTIONS,%NULL)
                   End If
               End If
               If IsFalse hUrl Then WEBERR
               ErrType = GetLastError()
               IBUFFER.dwStructSize = SizeOf(IBUFFER)
               IBUFFER.dwBufferLength= 4096
               IBUFFER.lpvBuffer = VarPtr(szTempBuffer)
          
               While InternetReadFileEx(hUrl, IBUFFER, %IRF_ASYNC, dwContext) <> %FALSE And IBUFFER.dwBufferLength <> 0
                    sRetBuffer = sRetBuffer & Left$(szTempBuffer, IBUFFER.dwBufferLength)
               Wend
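               If Len(sRetBuffer) > 0 Then
                   ' Success: clean up and return here so execution does not
                   ' fall through into the WEBERR error handler below
                   InternetCloseHandle hUrl
                   InternetCloseHandle hInternet
                   Pc& = SetPriorityClass(GetCurrentProcess(), %NORMAL_PRIORITY_CLASS)
                   Function = %TRUE
                   Exit Function
               End If
          WEBERR:
               If hUrl Then InternetCloseHandle hUrl   ' release the URL handle if one was opened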
               ErMsg = "Error connecting to " & $CrLf & $CrLf
               ErMsg = ErMsg & sUrl & $CrLf & $CrLf
               ErMsg = ErMsg & "Error: " & Format$(ErrType) & $CrLf
               ErMsg = ErMsg & "Description: " & InetErrorDescription(ErrType) & $CrLf
          
               lResult = InternetGetLastResponseInfo(ByVal VarPtr(ErrType), szBuffer, SizeOf(szBuffer))
               If IsTrue lResult Then
                   ErMsg = ErMsg & "ResponseInfo: " & szBuffer
               End If
          
               MessageBox ByVal 0,ErMsg, "WinInet Error", ByVal %MB_ICONSTOP
               InternetCloseHandle hInternet
               Pc& = SetPriorityClass(GetCurrentProcess(), %NORMAL_PRIORITY_CLASS)
          End Function
          '------------------------------------------------------------------------------------------
          ------------------
          Scott Turchin
          MCSE, MCP+I
          Computer Creations Software
          ----------------------
          Sometimes you give the world the best you got, and you get kicked in the teeth.
          Give the world the best you got anyway.
          - Ted Nugent (God, Guns, and Rock n' Roll)
          Scott Turchin
          MCSE, MCP+I
          http://www.tngbbs.com
          ----------------------
          True Karate-do is this: that in daily life, one's mind and body be trained and developed in a spirit of humility; and that in critical times, one be devoted utterly to the cause of justice. -Gichin Funakoshi



          • #6
            It sounds like your host is using host headers: the server reads the host name from the Host header of the HTTP GET request to decide which site to serve.

            In other words, you won't be able to request the site by IP address, or you'll get the default page. On Apache that may be nothing (or maybe an Apache welcome page); on IIS it's an "Under construction" page.



            [This message has been edited by Scott Turchin (edited March 14, 2006).]



            • #7
              Using the PB TCP functions can have some advantages.
              You can catch some tricky redirect pages this way.
              You can view the actual HTML of the URL without
              being automatically redirected.
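
              A minimal sketch of that idea, assuming you already have the raw response (status line, headers and body) in a string, for example from a TCP RECV loop. RedirectTarget is a hypothetical helper name, not a PB built-in. It only looks at the status code and the Location header, so you can decide for yourself whether to follow the redirect:

              ```powerbasic
              ' Hypothetical helper: returns the Location header of a 3xx response,
              ' or an empty string if the response is not a redirect.
              FUNCTION RedirectTarget(BYVAL sResponse AS STRING) AS STRING
                  LOCAL sHeaders AS STRING, sLine AS STRING
                  LOCAL i AS LONG, nStatus AS LONG

                  ' The header block ends at the first blank line
                  sHeaders = EXTRACT$(sResponse, $CRLF & $CRLF)

                  ' The status line looks like "HTTP/1.1 302 Found"
                  nStatus = VAL(PARSE$(PARSE$(sHeaders, $CRLF, 1), " ", 2))
                  IF nStatus < 300 OR nStatus > 399 THEN EXIT FUNCTION

                  ' Header names are case-insensitive, so compare in lower case
                  FOR i = 2 TO PARSECOUNT(sHeaders, $CRLF)
                      sLine = PARSE$(sHeaders, $CRLF, i)
                      IF LCASE$(LEFT$(sLine, 9)) = "location:" THEN
                          FUNCTION = TRIM$(MID$(sLine, 10))
                          EXIT FUNCTION
                      END IF
                  NEXT
              END FUNCTION
              ```

              If the function returns an empty string, the text after the first blank line of the response is the actual HTML of the page you asked for.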


              ------------------



              • #8
                I'd agree with Scott. I'd add a 'HOST: hostname.com' line in
                your request. I think it's supposed to be the second line...

                GET /index.html HTTP/1.1
                Host: google.com

                etc...
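
                Here is a rough sketch of such a request using the PB TCP statements, in the style of the help-file example. TcpWebGet is a hypothetical name, and the host and path in PBMAIN are only placeholders. It sends HTTP/1.0 rather than 1.1 so that the server closes the connection when it is done, which keeps the receive loop simple:

                ```powerbasic
                #COMPILE EXE
                #DIM ALL

                ' Hypothetical helper: fetch sPath from sHost over a raw TCP socket.
                FUNCTION TcpWebGet(BYVAL sHost AS STRING, BYVAL sPath AS STRING) AS STRING
                    LOCAL hSock AS LONG
                    LOCAL sBuf AS STRING, sPage AS STRING

                    hSock = FREEFILE
                    TCP OPEN "http" AT sHost AS #hSock TIMEOUT 60000
                    IF ERR THEN EXIT FUNCTION            ' could not connect

                    ' Request line first, then the headers, then a blank line
                    TCP PRINT #hSock, "GET " & sPath & " HTTP/1.0"
                    TCP PRINT #hSock, "Host: " & sHost   ' tells a shared server which site is wanted
                    TCP PRINT #hSock, "Accept: */*"
                    TCP PRINT #hSock, ""

                    ' Read until the server closes the connection or an error occurs
                    DO
                        TCP RECV #hSock, 4096, sBuf
                        sPage = sPage & sBuf
                    LOOP WHILE ISTRUE LEN(sBuf) AND ISFALSE ERR
                    TCP CLOSE #hSock

                    FUNCTION = sPage                     ' status line + headers + body
                END FUNCTION

                FUNCTION PBMAIN () AS LONG
                    LOCAL sPage AS STRING
                    sPage = TcpWebGet("www.powerbasic.com", "/")
                    ' The body starts after the first blank line
                    MSGBOX LEFT$(REMAIN$(sPage, $CRLF & $CRLF), 500)
                END FUNCTION
                ```

                The returned string still contains the status line and response headers, so check those before trusting the body.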

                Hope this helps!

                John

                ------------------
                LOCAL MyEMail AS STRING , MySkype AS STRING
                MyEmail = STRREVERSE$("letnitj") & CHR$(64) & STRREVERSE$("liamg") & CHR$(46) & STRREVERSE$("moc")
                MySkype = STRREVERSE$("adirolftj")
                LOCAL MyEMail AS STRING
                MyEmail = STRREVERSE$("53pmohtj") & CHR$(64) & STRREVERSE$("liamg") & CHR$(46) & STRREVERSE$("moc")



                • #9
                  If you use non-standard port numbers, you also want to make sure that you include them in the "Host" header field as well.
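
                  As a small illustration with the PB string functions (HostHeader is a hypothetical helper name, and treating 80 as the only default port is an assumption that covers plain HTTP only):

                  ```powerbasic
                  ' Hypothetical helper: build the Host header line for a given port.
                  FUNCTION HostHeader(BYVAL sHost AS STRING, BYVAL nPort AS LONG) AS STRING
                      IF nPort = 80 THEN
                          ' The default HTTP port can be left out
                          FUNCTION = "Host: " & sHost
                      ELSE
                          ' Any other port is appended after a colon
                          FUNCTION = "Host: " & sHost & ":" & FORMAT$(nPort)
                      END IF
                  END FUNCTION
                  ```

                  With the PB TCP statements you would connect with something like TCP OPEN PORT 8080 AT "www.example.com" AS #hSock and then send HostHeader("www.example.com", 8080) with TCP PRINT as one of the request headers.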

                  Just for fun, here's the SocketTools version of a WebGet function:

                  Code:
                  FUNCTION WebGetFile(strHostName AS STRING, strResource AS STRING, strBuffer AS STRING) AS LONG
                      DIM hClient AS LONG
                      DIM pszHostName AS ASCIIZ PTR
                      DIM pszResource AS ASCIIZ PTR
                      DIM bResult AS LONG
                  
                      bResult = %FALSE
                      strBuffer = ""
                      pszHostName = STRPTR(strHostName)
                      pszResource = STRPTR(strResource)
                      
                      hClient = HttpConnect(@pszHostName, _
                                            %HTTP_PORT_DEFAULT, _
                                            %HTTP_TIMEOUT, _
                                            %HTTP_OPTION_NONE, _
                                            %HTTP_VERSION_10)
                                            
                      IF hClient <> %INVALID_CLIENT THEN
                          DIM hgblBuffer AS DWORD
                          DIM lpBuffer AS ASCIIZ PTR
                          DIM dwLength AS DWORD
                          DIM nResult AS LONG
                  
                          nResult = HttpGetData(hClient, @pszResource, _
                                                BYREF hgblBuffer, dwLength, _
                                                %HTTP_TRANSFER_CONVERT)
                  
                          IF nResult <> %HTTP_ERROR THEN
                              lpBuffer = GlobalLock(hgblBuffer)
                              strBuffer = @lpBuffer
                              GlobalUnlock(hgblBuffer)
                              GlobalFree(hgblBuffer)
                              bResult = %TRUE
                          END IF
                          HttpDisconnect(hClient)
                      END IF
                  
                      FUNCTION = bResult
                  END FUNCTION
                  It'll return the contents of the file in the strBuffer variable. If you want automatic redirection you can specify that in the HttpGetData function. It'll also automagically convert text resources so that 'end of line' sequences are handled correctly for Windows.

                  ------------------
                  Mike Stefanik
                  www.catalyst.com
                  Catalyst Development Corporation
                  Mike Stefanik
                  sockettools.com

