mirror of https://github.com/lorien/awesome-web-scraping.git synced 2024-11-24 08:32:19 +02:00

Merge pull request #34 from gingerhot/master

add Golang to the list
Gregory Petukhov 2017-05-14 21:58:52 +06:00 committed by GitHub
commit 5f4efe9147
2 changed files with 123 additions and 0 deletions


@@ -11,6 +11,7 @@ The list of tools, programming libraries and APIs used in web-scraping.
* [PHP](http://github.com/lorien/web-scraping/blob/master/php.md)
* [Ruby](http://github.com/lorien/web-scraping/blob/master/ruby.md)
* [JavaScript](http://github.com/lorien/web-scraping/blob/master/javascript.md)
* [Golang](http://github.com/lorien/web-scraping/blob/master/golang.md)
* Feel free to add your favourite language. Use [new_language_template.md](http://github.com/lorien/web-scraping/blob/master/new_language_template.md) as a starting point.
* [Proxy Services](http://github.com/lorien/web-scraping/blob/master/proxy_services.md)
* [Web Services](http://github.com/lorien/web-scraping/blob/master/web_services.md)

golang.md Normal file

@@ -0,0 +1,122 @@
# Golang Web Scraping
This list contains Golang libraries related to web scraping and data processing.
* [Golang Web Scraping](#golang-web-scraping)
* [Network](#network)
* [Web-scraping Frameworks](#web-scraping-frameworks)
* [HTML/XML Parsing](#htmlxml-parsing)
* [Text processing](#text-processing)
* [Specific Formats Processing](#specific-formats-processing)
* [Natural Language Processing](#natural-language-processing)
* [Browser automation and emulation](#browser-automation-and-emulation)
* [Multiprocessing](#multiprocessing)
* [Queue](#queue)
* [Email](#email)
* [URL and Network Address Manipulation](#url-and-network-address-manipulation)
* [Web Content Extracting](#web-content-extracting)
* [Asynchronous](#asynchronous)
* [WebSocket](#websocket)
* [DNS Resolving](#dns-resolving)
* [Computer Vision](#computer-vision)
* [Proxy Server](#proxy-server)
* [Other Golang Lists](#other-golang-lists)
## Network
* General
* [net](https://golang.org/pkg/net/)
* [net/http](https://golang.org/pkg/net/http/)
* Asynchronous
* [goroutine](https://tour.golang.org/concurrency/1)
## Web-Scraping Frameworks
* Full Featured Crawlers
* [Pholcus](https://github.com/henrylee2cn/pholcus)
* [go_spider](https://github.com/hu17889/go_spider)
* [ants-go](https://github.com/wcong/ants-go)
* Other
* TODO
## HTML/XML Parsing
* [encoding/xml](https://golang.org/pkg/encoding/xml/)
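
A common XML task in scraping is reading a sitemap. The sketch below uses `encoding/xml` to pull the `<loc>` entries out of a sitemap-like document. The `urlSet` type, the `parseSitemap` helper, and the sample data are hypothetical and exist only for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// urlSet mirrors only the parts of a sitemap this example needs:
// a list of <url> elements, each holding a <loc>.
type urlSet struct {
	URLs []struct {
		Loc string `xml:"loc"`
	} `xml:"url"`
}

// parseSitemap (hypothetical helper) returns every <loc> value in document order.
func parseSitemap(data []byte) ([]string, error) {
	var set urlSet
	if err := xml.Unmarshal(data, &set); err != nil {
		return nil, err
	}
	locs := make([]string, 0, len(set.URLs))
	for _, u := range set.URLs {
		locs = append(locs, u.Loc)
	}
	return locs, nil
}

func main() {
	sample := []byte(`<urlset>
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>`)

	locs, err := parseSitemap(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, loc := range locs {
		fmt.Println(loc)
	}
}
```

Note that `encoding/xml` expects well-formed XML, so it is not a good fit for typical real-world HTML pages.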
## Text Processing
*Libraries for parsing and manipulating plain text.*
* General
* [regexp](https://golang.org/pkg/regexp/)
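
For quick, throwaway extraction, `regexp` can pull simple patterns out of fetched text. The sketch below collects double-quoted `href` values. The pattern and the `extractHrefs` helper are assumptions for illustration, and a regular expression like this is fragile on real HTML, where a proper parser is the safer choice.

```go
package main

import (
	"fmt"
	"regexp"
)

// hrefRe matches href="..." and captures the quoted value.
// It deliberately ignores single-quoted and unquoted attributes.
var hrefRe = regexp.MustCompile(`href="([^"]+)"`)

// extractHrefs (hypothetical helper) returns the captured href values in order of appearance.
func extractHrefs(text string) []string {
	var out []string
	for _, m := range hrefRe.FindAllStringSubmatch(text, -1) {
		out = append(out, m[1]) // m[0] is the whole match, m[1] the capture group
	}
	return out
}

func main() {
	page := `<a href="/a">A</a> <a href="https://example.com/b">B</a>`
	for _, h := range extractHrefs(page) {
		fmt.Println(h)
	}
}
```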
## Specific Formats Processing
*Libraries for parsing and manipulating specific text formats.*
* General
* [encoding/json](https://golang.org/pkg/encoding/json/)
* Something
* TODO
## Natural Language Processing
*Libraries for working with human languages.*
* TODO
## Browser automation and emulation
* TODO
## Multiprocessing
* TODO
## Asynchronous
*Libraries for asynchronous network programming.*
* TODO
## Queue
* [NSQ](https://github.com/nsqio/nsq)
* [NATS](https://github.com/nats-io/go-nats)
## Email
*Libraries for parsing email.*
* TODO
## URL and Network Address Manipulation
*Libraries for parsing/modifying URLs and network addresses.*
* URL
* [net/url](https://golang.org/pkg/net/url/)
* Network Address
* TODO
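
Scraped links are often relative, and `net/url` can resolve them against the URL of the page they came from. This is a minimal sketch; the `resolveLink` helper and the example URLs are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink (hypothetical helper) turns a possibly relative href into an
// absolute URL, using the page it was found on as the base.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	abs, err := resolveLink("https://example.com/docs/index.html", "../img/logo.png?v=2")
	if err != nil {
		fmt.Println("resolve error:", err)
		return
	}
	fmt.Println(abs) // prints https://example.com/img/logo.png?v=2
}
```

Links that are already absolute pass through `ResolveReference` unchanged, so the same helper can be applied to every link on a page.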
## Web Content Extracting
*Libraries for extracting web content.*
* Text and Metadata from HTML pages
* [x/net/html](https://golang.org/x/net/html)
## WebSocket
*Libraries for working with WebSocket.*
* [gorilla/websocket](https://github.com/gorilla/websocket)
## DNS Resolving
* [net](https://golang.org/pkg/net/)
* [miekg/dns](https://github.com/miekg/dns)
## Computer Vision
* TODO
## Proxy Server
* TODO
## Other Golang lists
* TODO