mirror of https://github.com/MontFerret/ferret.git synced 2024-12-12 11:15:14 +02:00

Grammar Check

Prateek Chaplot 2018-10-05 14:47:22 +05:30
parent 84f7f36b9e
commit 01a8929592


@@ -3,15 +3,15 @@
![ferret](https://raw.githubusercontent.com/MontFerret/ferret/master/assets/intro.jpg)
## What is it?
-```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like ui testing, machine learning and analytics.
-Having it's own declarative language, ```ferret``` abstracts away technical details and complexity of the underlying technologies, helping to focus on the data itself.
+```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like UI testing, machine learning and analytics.
+Having its own declarative language, ```ferret``` abstracts away technical details and complexity of the underlying technologies, helping to focus on the data itself.
It's extremely portable, extensible and fast.
## Show me some code
The following example demonstrates the use of dynamic pages.
First of all, we load the main Google Search page, type search criteria into an input box and then click a search button.
The click action triggers a redirect, so we wait till its end.
-Once the page gets loaded, we iterate over all elements in search results and assign output to a variable.
+Once the page gets loaded, we iterate over all elements in search results and assign the output to a variable.
The final for loop filters out empty elements that might be because of inaccurate use of selectors.
```aql
@@ -49,14 +49,14 @@ RETURN (
Nowadays data is everything and who owns data - owns the world.
I have worked on multiple data-driven projects where data was an essential part of a system and I realized how cumbersome writing tons of scrapers is.
After some time looking for a tool that would let me to not write a code, but just express what data I need, decided to come up with my own solution.
-```ferret``` project is an ambitious initiative trying to bring universal platform for writing scrapers without any hassle.
+```ferret``` project is an ambitious initiative trying to bring the universal platform for writing scrapers without any hassle.
## Inspiration
FQL (Ferret Query Language) is heavily inspired by [AQL](https://www.arangodb.com/) (ArangoDB Query Language).
But due to the domain specifics, there are some differences in how things work.
## WIP
-Be aware, the the project is under heavy development. There is no documentation and some things may change in the final release.
+Be aware, that the project is under heavy development. There is no documentation and some things may change in the final release.
For query syntax, you may go to [ArangoDB web site](https://docs.arangodb.com/3.3/AQL/index.html) and use AQL docs as docs for FQL - since they are identical.
@@ -107,7 +107,7 @@ Please use `Ctrl-D` to exit this program.
```
-**Note:** symbol ```%``` is used to start and end multi line queries. You also can use heredoc format.
+**Note:** symbol ```%``` is used to start and end multi-line queries. You also can use the heredoc format.
If you want to execute a query stored in a file, just pass a file name:
@@ -126,7 +126,7 @@ ferret < ./docs/examples/static-page.fql
### Browser mode
-By default, ``ferret`` loads HTML pages via http protocol, because it's faster.
+By default, ``ferret`` loads HTML pages via HTTP protocol, because it's faster.
But nowadays, there are more and more websites rendered with JavaScript, and therefore, this 'old school' approach does not really work.
For such cases, you may fetch documents using Chrome or Chromium via Chrome DevTools protocol (aka CDP).
First, you need to make sure that you launched Chrome with ```remote-debugging-port=9222``` flag.
@@ -345,7 +345,7 @@ func getStrings() ([]string, error) {
}
```
-On top of that, you can completely turn off standard library, by passing the following option:
+On top of that, you can completely turn off the standard library, bypassing the following option:
```go
comp := compiler.New(compiler.WithoutStdlib())