
## WHAT-IS

`Scrapy`: An open source and collaborative framework for extracting the data  
you need from websites in a fast, simple, yet extensible way.

This image is based on `debian:jessie` and is only 278.6 MB.  
You can create a Scrapy (v0.24.6) project on top of this image.

## HOW-TO

```
$ docker run --name scrapy -it vimagick/scrapy
>>> scrapy startproject demo
>>> cd demo
>>> scrapy genspider example example.com
>>> scrapy edit example
>>> scrapy crawl example
```

## TODO-LIST

- [x] build [libxml2][1]/[libxslt][2] from source
- [x] add [scrapy_bash_completion][3] script

[1]: http://www.xmlsoft.org/downloads.html
[2]: http://git.gnome.org/browse/libxslt/
[3]: https://github.com/scrapy/scrapy/raw/master/extras/scrapy_bash_completion