Some article websites (I’m looking at msn.com right now, as an example) show the first page or so of article content and then have a “Continue Reading” button, which you must click to see the rest of the article. This seems ridiculous from a UX perspective: I know how to scroll down to continue reading, so why hide the text and make me click a button, then have me scroll? Why has this become a fairly common practice?

  • Spzi
    18
    4 months ago

    Just a guess: to prevent bots from scraping the full content?

    • @dual_sport_dork@lemmy.world
      9
      4 months ago

      Doubt it. My web analytics indicate that bots click on every single element on the page, whether it makes sense or not.

      For this reason it’s a good idea not to allow your site to generate any kind of circular self-referential loop that can be reached via navigation or clicking on things. Poorly coded bots will not realize that they’re driving themselves around in circles, and they will proceed to bombard your server with zillions of requests per second for the same thing over and over again.

      Likewise, if you have any user-initiated action that can generate an arbitrary result set (for example, adding an item or set of items to a quote or cart), it is imperative that you set an upper bound on the result length or request size (server side!), and ideally configure your server to temp-ban any client that attempts too many oversized requests in too short a time span. Because if you don’t, bad bots absolutely will eventually attempt to e.g. create a shopping cart with 99999999999999999 items in it. Or a search query with 4.7 gigabytes worth of keywords. Or whatever. Either because they’re coded by morons or, worse, because they’re coded by someone who wants to see if they can break your site by doing stuff like that.
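
The server-side cap and temp-ban described above could look roughly like the following. This is only a minimal sketch, not anyone's actual setup: the `OversizeGuard` class, the constants, and the specific limits are all hypothetical, and a real site would more likely enforce this in the web server or a shared store than in process memory.

```python
import time

# All limits below are hypothetical; tune them for your own site.
MAX_CART_ITEMS = 500       # upper bound on items in one cart request
MAX_VIOLATIONS = 3         # oversized requests tolerated per window
VIOLATION_WINDOW = 60.0    # seconds over which violations are counted
BAN_SECONDS = 600.0        # length of the temporary ban


class OversizeGuard:
    """Rejects oversized requests and temp-bans clients that keep sending them."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._violations = {}    # client_id -> timestamps of recent violations
        self._banned_until = {}  # client_id -> time at which the ban expires

    def is_banned(self, client_id):
        return self._clock() < self._banned_until.get(client_id, float("-inf"))

    def check(self, client_id, size, limit):
        """Return True if the request may proceed, False if it is rejected."""
        if self.is_banned(client_id):
            return False
        if size <= limit:
            return True
        # Oversized request: record it and drop violations older than the window.
        now = self._clock()
        recent = [t for t in self._violations.get(client_id, [])
                  if now - t < VIOLATION_WINDOW]
        recent.append(now)
        self._violations[client_id] = recent
        if len(recent) >= MAX_VIOLATIONS:
            self._banned_until[client_id] = now + BAN_SECONDS
        return False
```

A request handler would call something like `guard.check(client_ip, len(items), MAX_CART_ITEMS)` before building the cart and return an error when it gets `False`. The check runs on the server, so a bot that ignores any client-side validation is still stopped.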

      • @petrol_sniff_king@lemmy.blahaj.zone
        2
        4 months ago

        it’s a good idea not to allow your site to generate any kind of circular self-referential loop that can be achieved via navigation or clicking on things

        Don’t nearly all sites have a logo at the top that will take you back to the homepage? I’m not really following.

        My intuition is that the only safe solution is to rate limit requests; a poorly coded bot could definitely just be a while loop for the same URL ad infinitum.

        [e] Unless there’s something to this I’m not thinking about.
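
For what it's worth, the rate limiting suggested above can be sketched as a per-client sliding window. This is an illustration under assumptions, not a production design: `SlidingWindowLimiter` and its default limits are hypothetical names and numbers, and state is kept in memory for a single process.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests=20, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client_id -> timestamps of allowed requests

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; the caller would typically answer HTTP 429
        hits.append(now)
        return True
```

Because the limiter counts requests per client regardless of URL, it covers both cases from this thread: a bot chasing a navigation loop and a bot hammering one URL in a while loop get cut off the same way, while a logo link back to the homepage stays harmless.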