{"id":49,"date":"2023-08-22T15:12:16","date_gmt":"2023-08-22T15:12:16","guid":{"rendered":"https:\/\/leaddata.abovethefold.ch\/?page_id=49"},"modified":"2023-10-04T21:13:28","modified_gmt":"2023-10-04T21:13:28","slug":"prevent-crawling","status":"publish","type":"page","link":"https:\/\/backend.bizlist.ai\/prevent-crawling\/","title":{"rendered":"Preventing bizlistBot from Crawling Your Website"},"content":{"rendered":"\n
<p>At bizlist.ai, we respect every company's autonomy and privacy preferences. If you'd rather keep bizlistBot from crawling your website, you can instruct our crawler to stay away from your domain by editing your website's <code>robots.txt<\/code> file.<\/p>\n\n\n\n
<p>Here\u2019s your guide on how to keep bizlistBot at bay:<\/p>\n\n\n\n
<h3>What is <code>robots.txt<\/code>?<\/h3>\n\n\n\n
<p>The <code>robots.txt<\/code> file is a standard way for websites to tell web crawlers and other web robots which pages on the site should not be crawled. By specifying user-agents and directives, you can mark sections of your site (or all of it) as off-limits to particular bots.<\/p>\n\n\n\n
<h3>Steps to Block bizlistBot:<\/h3>\n\n\n\n
<ol>\n<li><strong>Locate or create your <code>robots.txt<\/code> file:<\/strong> Typically, this file is found in the root directory of your website. If one doesn't exist, create a plain text file named <code>robots.txt<\/code> there.<\/li>\n\n\n\n
<li>Add the following lines to your <code>robots.txt<\/code> file:<\/li>\n<\/ol>\n\n\n\n
<p><code>User-agent: bizlistBot<br>Disallow: \/<\/code><\/p>\n\n\n\n
<p>These lines tell bizlistBot to refrain from crawling any section of your website.<\/p>\n\n\n\n
<ol>\n<li>To block only specific parts of your site, adjust the <code>Disallow<\/code> directive. For instance, to prevent bizlistBot from accessing a directory named \"private\", use:<\/li>\n<\/ol>\n\n\n\n
<p><code>User-agent: bizlistBot<br>Disallow: \/private\/<\/code><\/p>\n\n\n\n
<ol>\n<li>If your site runs on a CMS, you may be able to edit <code>robots.txt<\/code> from within the CMS settings. For <strong>WordPress<\/strong> users, the <strong>Yoast SEO plugin<\/strong> provides an interface for editing your <code>robots.txt<\/code>. Yoast publishes a step-by-step guide on how to edit the robots.txt through Yoast SEO.<\/li>\n\n\n\n
<li>Save your <code>robots.txt<\/code> file and make sure it's uploaded to your website's root directory.<\/li>\n\n\n\n
<li>Verify that your <code>robots.txt<\/code> is correctly implemented by visiting <code>http:\/\/www.yourdomain.com\/robots.txt<\/code>. Replace \"yourdomain\" with your actual domain name.<\/li>\n<\/ol>\n\n\n\n
<p>While bizlistBot will honor the rules in your file, not all crawlers are as courteous. Review and update your robots.txt file regularly as your site\u2019s requirements change.<\/p>\n\n\n\n
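Beyond opening the file in a browser, you can check how a crawler would interpret the rules. The following is a minimal sketch, not part of bizlist.ai's tooling, that uses Python's standard `urllib.robotparser` module to evaluate the two rule sets from this guide. The helper name `is_allowed`, the second crawler name `SomeOtherBot`, and the example paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

BOT = "bizlistBot"  # user-agent token used in the rules from this guide

# The two rule sets from this guide, one directive per line.
BLOCK_ALL = """\
User-agent: bizlistBot
Disallow: /
"""

BLOCK_PRIVATE = """\
User-agent: bizlistBot
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = BOT) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration: it parses the text locally and
    performs no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    site = "http://www.yourdomain.com"  # placeholder domain from this guide
    checks = [
        ("block all, home page", BLOCK_ALL, site + "/", BOT),
        ("block all, other crawler", BLOCK_ALL, site + "/", "SomeOtherBot"),
        ("block /private/, private page", BLOCK_PRIVATE, site + "/private/report.html", BOT),
        ("block /private/, public page", BLOCK_PRIVATE, site + "/about", BOT),
    ]
    for label, rules, url, agent in checks:
        print(f"{label}: allowed={is_allowed(rules, url, agent)}")
```

To test the file your site actually serves, `RobotFileParser` can also load it over the network with `set_url(...)` followed by `read()`. The sketch parses local text so that it runs without network access.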