mirror of
https://github.com/MontFerret/ferret.git
synced 2024-12-12 11:15:14 +02:00
Grammar Check
parent 84f7f36b9e
commit 01a8929592
16
README.md
@@ -3,15 +3,15 @@
 ![ferret](https://raw.githubusercontent.com/MontFerret/ferret/master/assets/intro.jpg)
 
 ## What is it?
-```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like ui testing, machine learning and analytics.
-Having it's own declarative language, ```ferret``` abstracts away technical details and complexity of the underlying technologies, helping to focus on the data itself.
+```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like UI testing, machine learning and analytics.
+Having its own declarative language, ```ferret``` abstracts away technical details and complexity of the underlying technologies, helping to focus on the data itself.
 It's extremely portable, extensible and fast.
 
 ## Show me some code
 The following example demonstrates the use of dynamic pages.
 First of all, we load the main Google Search page, type search criteria into an input box and then click a search button.
 The click action triggers a redirect, so we wait till its end.
-Once the page gets loaded, we iterate over all elements in search results and assign output to a variable.
+Once the page gets loaded, we iterate over all elements in search results and assign the output to a variable.
 The final for loop filters out empty elements that might be because of inaccurate use of selectors.
 
 ```aql
@@ -49,14 +49,14 @@ RETURN (
 Nowadays data is everything and who owns data - owns the world.
 I have worked on multiple data-driven projects where data was an essential part of a system and I realized how cumbersome writing tons of scrapers is.
 After some time looking for a tool that would let me to not write a code, but just express what data I need, decided to come up with my own solution.
-```ferret``` project is an ambitious initiative trying to bring universal platform for writing scrapers without any hassle.
+```ferret``` project is an ambitious initiative trying to bring the universal platform for writing scrapers without any hassle.
 
 ## Inspiration
 FQL (Ferret Query Language) is heavily inspired by [AQL](https://www.arangodb.com/) (ArangoDB Query Language).
 But due to the domain specifics, there are some differences in how things work.
 
 ## WIP
-Be aware, the the project is under heavy development. There is no documentation and some things may change in the final release.
+Be aware, that the project is under heavy development. There is no documentation and some things may change in the final release.
 For query syntax, you may go to [ArangoDB web site](https://docs.arangodb.com/3.3/AQL/index.html) and use AQL docs as docs for FQL - since they are identical.
 
 
@@ -107,7 +107,7 @@ Please use `Ctrl-D` to exit this program.
 
 ```
 
-**Note:** symbol ```%``` is used to start and end multi line queries. You also can use heredoc format.
+**Note:** symbol ```%``` is used to start and end multi-line queries. You also can use the heredoc format.
 
 If you want to execute a query stored in a file, just pass a file name:
 
@@ -126,7 +126,7 @@ ferret < ./docs/examples/static-page.fql
 
 ### Browser mode
 
-By default, ``ferret`` loads HTML pages via http protocol, because it's faster.
+By default, ``ferret`` loads HTML pages via HTTP protocol, because it's faster.
 But nowadays, there are more and more websites rendered with JavaScript, and therefore, this 'old school' approach does not really work.
 For such cases, you may fetch documents using Chrome or Chromium via Chrome DevTools protocol (aka CDP).
 First, you need to make sure that you launched Chrome with ```remote-debugging-port=9222``` flag.
@@ -345,7 +345,7 @@ func getStrings() ([]string, error) {
 }
 ```
 
-On top of that, you can completely turn off standard library, by passing the following option:
+On top of that, you can completely turn off the standard library, bypassing the following option:
 
 ```go
 comp := compiler.New(compiler.WithoutStdlib())
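The last hunk ends on `compiler.New(compiler.WithoutStdlib())`, a call shaped like Go's functional-options pattern. The sketch below shows how such an option could switch off a default function set. It is a self-contained illustration and not Ferret's actual implementation: `Options`, `Compiler`, `HasFunction` and the two placeholder function names are assumptions made for the example. Only the names `New` and `WithoutStdlib` are taken from the README line.

```go
package main

import "fmt"

// Options is a hypothetical settings struct; it is not Ferret's real type.
type Options struct {
	NoStdlib bool
}

// Option mutates Options. This is the functional-options pattern.
type Option func(*Options)

// WithoutStdlib borrows its name from the README; the body is an assumption.
func WithoutStdlib() Option {
	return func(o *Options) { o.NoStdlib = true }
}

// Compiler is a stand-in that only tracks which function names are registered.
type Compiler struct {
	funcs map[string]bool
}

// New applies every option, then registers the placeholder standard
// library unless an option disabled it.
func New(setters ...Option) *Compiler {
	opts := &Options{}
	for _, set := range setters {
		set(opts)
	}
	c := &Compiler{funcs: map[string]bool{}}
	if !opts.NoStdlib {
		// Placeholder names chosen for illustration only.
		for _, name := range []string{"LENGTH", "CONCAT"} {
			c.funcs[name] = true
		}
	}
	return c
}

// HasFunction reports whether a function name is registered.
func (c *Compiler) HasFunction(name string) bool {
	return c.funcs[name]
}

func main() {
	fmt.Println(New().HasFunction("LENGTH"))                // true
	fmt.Println(New(WithoutStdlib()).HasFunction("LENGTH")) // false
}
```

With this shape, callers that pass no options keep the default behaviour, and new switches can be added later without changing the signature of `New`.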