deadlink-crawler

[unmaintained] crawls a site to detect dead links

commit 8576e54b93f4c717c6b9c1e94fccc28d6230780e
parent 74728c39ce266ce4bf18270098064f918da480cd
Author: Stefan <stefan@eliteinformatiker.de>
Date:   Thu, 24 Jan 2013 14:17:11 +0100

added some readme info

Diffstat:
M README.md | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
@@ -1,4 +1,23 @@
-deadlink-crawler
+Deadlink crawler
 ================
 
-This is a small crawler searching your website for deadlinks.
\ No newline at end of file
+This is a small crawler that searches your website for dead links.
+
+You can use it by creating a new instance of the Crawler class and running the crawl. The class supports several options:
+
+```python
+# Begin crawling at example.com
+c = Crawler("http://example.com/")
+
+# Restrict crawling to your own domain
+c.set_url_restrict("http://example.com/.*")
+
+# Wait one second between requests to avoid putting too much
+# load on your website. For a single, non-distributed crawler
+# running on a personal machine with limited bandwidth, this
+# usually does not matter.
+c.set_wait_time(1)
+
+# Start the crawling process
+c.crawl()
+```
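For context, here is a minimal sketch of what a Crawler class with this interface might look like. Only the four calls used in the README above come from the diff; the breadth-first queue, urllib-based fetching, and regex link extraction are illustrative assumptions, not this project's actual implementation.

```python
# Sketch of a Crawler matching the README's usage example. The method
# names mirror the diff; everything else here is an assumption.
import re
import time
import urllib.request
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from collections import deque


class Crawler:
    def __init__(self, start_url):
        self.start_url = start_url
        self.url_pattern = None  # regex limiting which URLs are followed
        self.wait_time = 0       # seconds to sleep between requests

    def set_url_restrict(self, pattern):
        self.url_pattern = re.compile(pattern)

    def set_wait_time(self, seconds):
        self.wait_time = seconds

    def crawl(self):
        queue = deque([self.start_url])
        seen = {self.start_url}
        while queue:
            url = queue.popleft()
            try:
                with urllib.request.urlopen(url) as response:
                    html = response.read().decode("utf-8", errors="replace")
            except (HTTPError, URLError) as err:
                # Request failed (4xx/5xx or network error): report dead link.
                print("DEAD: %s (%s)" % (url, err))
                continue
            # Naive href extraction; a real crawler would use an HTML parser.
            for href in re.findall(r'href="([^"]+)"', html):
                link = urljoin(url, href)
                if link in seen:
                    continue
                seen.add(link)
                # Only follow links matching the optional URL restriction.
                if self.url_pattern is None or self.url_pattern.match(link):
                    queue.append(link)
            time.sleep(self.wait_time)
```

The URL restriction keeps the breadth-first traversal from wandering off your domain, while the wait time throttles requests against the target site.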