WinHttpRequest Redirect to Secure Pages Fails?


  • #41
    . . . or messages that they were "blocking scraping".
    If they tell you "don't do that", then don't and you won't get blocked by them. Why should anyone here help do an unwelcome thing?
    Dale


    • #42
      Originally posted by Dale Yarker
      If they tell you "don't do that", then don't and you won't get blocked by them. Why should anyone here help do an unwelcome thing?
      I'm not doing anything illegal or unwelcome. These are RSS feeds for cooking blogs that are set up for the sole purpose of being downloaded by newsreaders. The whole idea is to share the pages with others. They're not saying "don't do that", they're saying "don't do it that way". So, I'm trying to find the best way to do it right.

      I only run into the security responses with InternetOpenURL. Other download methods work fine (though I've had other issues with those methods).
      Anthony Watson, Mountain Software
      www.mountainsoftware.com


      • #43
        Originally posted by Anthony Watson
        Mike,

        I was actually using InternetOpenURL until just recently. It worked well, but I was running into security issues with several web sites. I would often get "Access Denied" pages, or messages that they were "blocking scraping".

        It might be worth taking another look at...
        You probably need a suitable "User-Agent" HTTP header so that you don't look like a bot.

        See the lpszHeaders parameter of InternetOpenURL.
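
As a rough illustration of the idea (not the PowerBASIC/WinINet code discussed in this thread), here is a minimal Python sketch of sending a descriptive User-Agent with each request. The feed URL and agent string are made-up placeholders, and the helper that formats a raw header string assumes the usual "Name: value" plus CRLF layout for a headers parameter such as lpszHeaders.

```python
import urllib.request

# Hypothetical values for illustration only; neither appears in this thread.
FEED_URL = "https://example.com/feed.xml"
USER_AGENT = "ExampleFeedReader/1.0 (+https://example.com/contact)"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Return a GET request that identifies the client with a User-Agent header."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept": "application/rss+xml, application/xml;q=0.9, */*;q=0.8",
        },
    )


def raw_header_string(headers: dict) -> str:
    """Format headers as 'Name: value' lines, each ended by CRLF.

    Assumption: this is the layout a raw header-string parameter expects.
    """
    return "".join(f"{name}: {value}\r\n" for name, value in headers.items())


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download one page using the request built above."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch(FEED_URL) would send the request with the custom agent string. Whether a given site accepts it depends on that site's own rules.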
        Last edited by Stuart McLachlan; 23 Nov 2021, 06:29 PM.


        • #44
          ????????????????????????????
          Your words!
          Dale


          • #45
            Originally posted by Stuart McLachlan

            You probably need a suitable "User-Agent" HTTP header so that you don't look like a bot.

            See the lpszHeaders parameter of InternetOpenURL.
            Thanks for the info! I'll research it further when I get the chance.
            Anthony Watson, Mountain Software
            www.mountainsoftware.com


            • #46
              Originally posted by Dale Yarker
              ????????????????????????????
              Your words!
              My words? I don't understand what you're not understanding.
              Anthony Watson, Mountain Software
              www.mountainsoftware.com


              • #47
                I have the InternetOpenURL method working rather well now. It seems to work with every site I've tried it with, redirects when needed, and doesn't hang like URLDownloadToFile sometimes does (at least not that I've seen).

                Unfortunately, I discovered that some RSS sites are blocking the downloads, reporting a DDoS attack. That's odd, as the whole point of RSS is to download multiple articles from a web site. As a comment I saw on another site said, "humans don't read RSS feeds. Applications do". It's not like I'm hammering the site anyway: I only download the main RSS feed page, then the 5-15 articles on that feed from the last couple of weeks. That couldn't be more than 20 pages downloaded per site.

                Ironically, I can download a single page from any of the blocked sites without a problem, using the exact same download subroutine. I even tried purposely inserting a 20-30 second delay between each download and it still triggered the DDoS block. Weird.

                I read about similar issues both with other RSS readers and on sites that were trying to host RSS news feeds, so I'm certainly not alone.

                Unless I can find another solution, I'm just going to ignore the blocked downloads. It's really a problem on the hosting server anyway, and there are plenty of alternative RSS feeds that do work.
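
For anyone following along, the download pattern described above (fetch the feed, then each linked article with a pause in between, skipping any that are blocked) could be sketched roughly like this in Python. This is not my actual download subroutine: the function names, the 20-second default, and the assumption of a plain RSS 2.0 feed with item/link elements are all just for illustration.

```python
import time
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List


def item_links(feed_xml: bytes, limit: int = 15) -> List[str]:
    """Return up to `limit` article links from an RSS 2.0 feed (assumed layout)."""
    root = ET.fromstring(feed_xml)
    links = [item.findtext("link", default="").strip() for item in root.iter("item")]
    return [link for link in links if link][:limit]


def download_feed(
    feed_url: str,
    fetch: Callable[[str], bytes],
    delay_seconds: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch the feed page, then each article, pausing before every article.

    `fetch` is any function that downloads one URL and raises OSError
    (urllib's URLError and HTTPError are subclasses) when a download fails.
    """
    pages = {feed_url: fetch(feed_url)}
    for link in item_links(pages[feed_url]):
        sleep(delay_seconds)  # spacing between requests; may not prevent a block
        try:
            pages[link] = fetch(link)
        except OSError:
            continue  # ignore blocked or failed downloads and move on
    return pages
```

Passing the download routine in as `fetch` keeps the loop independent of whichever API actually does the transfer.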
                Anthony Watson, Mountain Software
                www.mountainsoftware.com
