mirror of
https://github.com/kellyjonbrazil/jc.git
synced 2025-07-15 01:24:29 +02:00
doc update
This commit is contained in:
117
docs/parsers/clf_s.md
Normal file
117
docs/parsers/clf_s.md
Normal file
@ -0,0 +1,117 @@
|
||||
[Home](https://kellyjonbrazil.github.io/jc/)
|
||||
<a id="jc.parsers.clf_s"></a>
|
||||
|
||||
# jc.parsers.clf\_s
|
||||
|
||||
jc - JSON Convert Common Log Format file streaming parser
|
||||
|
||||
> This streaming parser outputs JSON Lines (cli) or returns an Iterable of
|
||||
> Dictionaries (module)
|
||||
|
||||
This parser will handle the Common Log Format standard as specified at
|
||||
https://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format.
|
||||
|
||||
Combined Log Format is also supported. (Referer and User Agent fields added)
|
||||
|
||||
Extra fields may be present and will be enclosed in the `extra` field as
|
||||
a single string.
|
||||
|
||||
If a log line cannot be parsed, an object with an `unparsable` field will
|
||||
be present with a value of the original line.
|
||||
|
||||
The `epoch` calculated timestamp field is naive. (i.e. based on the
|
||||
local time of the system the parser is run on)
|
||||
|
||||
The `epoch_utc` calculated timestamp field is timezone-aware and is
|
||||
only available if the timezone field is UTC.
|
||||
|
||||
Usage (cli):
|
||||
|
||||
$ cat file.log | jc --clf-s
|
||||
|
||||
Usage (module):
|
||||
|
||||
import jc
|
||||
|
||||
result = jc.parse('clf_s', common_log_file_output.splitlines())
|
||||
for item in result:
|
||||
# do something
|
||||
|
||||
Schema:
|
||||
|
||||
Empty strings and `-` values are converted to `null`/`None`.
|
||||
|
||||
{
|
||||
"host": string,
|
||||
"ident": string,
|
||||
"authuser": string,
|
||||
"date": string,
|
||||
"day": integer,
|
||||
"month": string,
|
||||
"year": integer,
|
||||
"hour": integer,
|
||||
"minute": integer,
|
||||
"second": integer,
|
||||
"tz": string,
|
||||
"request": string,
|
||||
"request_method": string,
|
||||
"request_url": string,
|
||||
"request_version": string,
|
||||
"status": integer,
|
||||
"bytes": integer,
|
||||
"referer": string,
|
||||
"user_agent": string,
|
||||
"extra": string,
|
||||
"epoch": integer, # [0]
|
||||
"epoch_utc": integer, # [1]
|
||||
"unparsable": string # [2]
|
||||
}
|
||||
|
||||
[0] naive timestamp
|
||||
[1] timezone-aware timestamp. Only available if timezone field is UTC
|
||||
[2] exists if the line was not able to be parsed
|
||||
|
||||
Examples:
|
||||
|
||||
$ cat file.log | jc --clf-s
|
||||
{"host":"127.0.0.1","ident":"user-identifier","authuser":"frank","...}
|
||||
{"host":"1.1.1.2","ident":null,"authuser":null,"date":"11/Nov/2016...}
|
||||
...
|
||||
|
||||
$ cat file.log | jc --clf-s -r
|
||||
{"host":"127.0.0.1","ident":"user-identifier","authuser":"frank","...}
|
||||
{"host":"1.1.1.2","ident":"-","authuser":"-","date":"11/Nov/2016:0...}
|
||||
...
|
||||
|
||||
<a id="jc.parsers.clf_s.parse"></a>
|
||||
|
||||
### parse
|
||||
|
||||
```python
|
||||
@add_jc_meta
|
||||
def parse(data: Iterable[str],
|
||||
raw: bool = False,
|
||||
quiet: bool = False,
|
||||
ignore_exceptions: bool = False) -> StreamingOutputType
|
||||
```
|
||||
|
||||
Main text parsing generator function. Returns an iterable object.
|
||||
|
||||
Parameters:
|
||||
|
||||
data: (iterable) line-based text data to parse
|
||||
(e.g. sys.stdin or str.splitlines())
|
||||
|
||||
raw: (boolean) unprocessed output if True
|
||||
quiet: (boolean) suppress warning messages if True
|
||||
ignore_exceptions: (boolean) ignore parsing exceptions if True
|
||||
|
||||
|
||||
Returns:
|
||||
|
||||
Iterable of Dictionaries
|
||||
|
||||
### Parser Information
|
||||
Compatibility: linux, darwin, cygwin, win32, aix, freebsd
|
||||
|
||||
Version 1.0 by Kelly Brazil (kellyjonbrazil@gmail.com)
|
Reference in New Issue
Block a user