mirror of https://github.com/kellyjonbrazil/jc.git synced 2025-07-09 01:05:53 +02:00

add support for unparsable lines

Kelly Brazil
2022-11-21 12:09:19 -08:00
parent 60f1e79b2f
commit 26f8803b23
4 changed files with 20 additions and 1 deletion


@@ -13,6 +13,9 @@ Combined Log Format is also supported. (Referer and User Agent fields added)
 Extra fields may be present and will be enclosed in the `extra` field as
 a single string.
+If a log line cannot be parsed, an object with an `unparsable` field will
+be present with a value of the original line.
 The `epoch` calculated timestamp field is naive. (i.e. based on the
 local time of the system the parser is run on)
@@ -56,11 +59,13 @@ Empty strings and `-` values are converted to `null`/`None`.
     "extra": string,
     "epoch": integer, # [0]
     "epoch_utc": integer # [1]
+    "unparsable": string # [2]
   }
 ]
 [0] naive timestamp
 [1] timezone-aware timestamp. Only available if timezone field is UTC
+[2] exists if the line was not able to be parsed
 Examples:
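
The difference between the naive `epoch` field and the timezone-aware `epoch_utc` field described above can be illustrated with a short standard-library sketch. This is not the parser's own code: the function name `clf_epochs` and the format string are assumptions made for illustration, and the sketch simply mirrors the documented behavior (the naive value is interpreted in the local timezone of the machine, and the aware value is only produced when the logged offset is UTC).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumed layout of the CLF date field, e.g. "12/Jan/1996:20:37:55 +0000".
CLF_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def clf_epochs(date_field: str) -> Tuple[int, Optional[int]]:
    """Return (naive_epoch, utc_epoch_or_None) for a CLF date field.

    Hypothetical helper, not part of the parser shown in this diff.
    """
    aware = datetime.strptime(date_field, CLF_TIME_FORMAT)

    # Naive: discard the logged offset, so the wall-clock time is
    # interpreted in the local timezone of the machine running this code.
    naive_epoch = int(aware.replace(tzinfo=None).timestamp())

    # Aware: only reported when the logged offset is exactly UTC.
    if aware.utcoffset() == timedelta(0):
        utc_epoch: Optional[int] = int(aware.timestamp())
    else:
        utc_epoch = None

    return naive_epoch, utc_epoch


if __name__ == "__main__":
    print(clf_epochs("12/Jan/1996:20:37:55 +0000"))
    print(clf_epochs("11/Nov/2016:14:23:37 +0100"))
```

The first value varies with the system timezone, which is why the documentation calls it naive; the second is stable across machines but absent for non-UTC offsets.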


@@ -8,6 +8,9 @@ Combined Log Format is also supported. (Referer and User Agent fields added)
 Extra fields may be present and will be enclosed in the `extra` field as
 a single string.
+If a log line cannot be parsed, an object with an `unparsable` field will
+be present with a value of the original line.
 The `epoch` calculated timestamp field is naive. (i.e. based on the
 local time of the system the parser is run on)
@@ -51,11 +54,13 @@ Empty strings and `-` values are converted to `null`/`None`.
     "extra": string,
     "epoch": integer, # [0]
     "epoch_utc": integer # [1]
+    "unparsable": string # [2]
   }
 ]
 [0] naive timestamp
 [1] timezone-aware timestamp. Only available if timezone field is UTC
+[2] exists if the line was not able to be parsed
 Examples:
@@ -189,4 +194,9 @@ def parse(
                 raw_output.append(output_line)
+            else:
+                raw_output.append(
+                    {"unparsable": line}
+                )
     return raw_output if raw else _process(raw_output)
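
The fallback added in this hunk follows a common match-or-record pattern: try the line against a pattern, and if it fails, keep the original text under a dedicated key instead of dropping it. The sketch below shows that pattern in isolation. The regular expression `LINE_RE` and the function `parse_lines` are simplified stand-ins invented for this example; they are not the pattern or function the parser actually uses.

```python
import re
from typing import Dict, List, Optional

# Simplified, illustrative pattern for a Common Log Format line.
# This is an assumption for the sketch, not the parser's real regex.
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}|-) (?P<bytes>\d+|-)(?: (?P<extra>.*))?$'
)


def parse_lines(data: str) -> List[Dict[str, Optional[str]]]:
    """Parse each non-empty line, keeping unmatched lines verbatim."""
    results: List[Dict[str, Optional[str]]] = []
    for line in filter(None, data.splitlines()):
        match = LINE_RE.match(line)
        if match:
            # Matched: keep the named groups as the record.
            results.append(match.groupdict())
        else:
            # Not matched: preserve the original line so nothing is lost.
            results.append({"unparsable": line})
    return results


if __name__ == "__main__":
    sample = (
        'tarpon.gulf.net - - [12/Jan/1996:20:37:55 +0000] '
        '"GET index.htm HTTP/1.0" 200 215\n'
        'unparsable line\n'
    )
    for record in parse_lines(sample):
        print(record)
```

Recording the raw line keeps the output list aligned one-to-one with the non-empty input lines, which makes it easier to trace a bad entry back to its position in the log.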

File diff suppressed because one or more lines are too long


@@ -4,12 +4,16 @@
 127.0.0.1 - - [11/Nov/2016:14:23:37 +0100] "GET /uno dos HTTP/1.0" 404 298 "-" "-" - 385111 1.1.1.1
 1.1.1.1 - - [11/Nov/2016:00:00:11 +0100] "GET /icc HTTP/1.1" 302 - "-" "XXX XXX XXX" - 6160 11.1.1.1
 1.1.1.1 - - [11/Nov/2016:00:00:11 +0100] "GET /icc/ HTTP/1.1" 302 - "-" "XXX XXX XXX" - 2981 1.1.1.1
+unparsable line
 tarpon.gulf.net - - [12/Jan/1996:20:37:55 +0000] "GET index.htm HTTP/1.0" 200 215
 tarpon.gulf.net - - [12/Jan/1996:20:37:56 +0000] "POST products.htm HTTP/1.0" 200 215
 tarpon.gulf.net - - [12/Jan/1996:20:37:57 +0000] "PUT sales.htm HTTP/1.0" 200 215
 tarpon.gulf.net - - [12/Jan/1996:20:37:58 +0000] "GET /images/log.gif HTTP/1.0" 200 215
 tarpon.gulf.net - - [12/Jan/1996:20:37:59 +0000] "GET /buttons/form.gif HTTP/1.0" 200 215
 66.249.66.1 - - [01/Jan/2017:09:00:00 +0000] "GET /contact.html HTTP/1.1" 200 250
+another unparsable line
 66.249.66.1 - - [01/Jan/2017:09:00:00 +0000] "GET /contact.html HTTP/1.1" 200 250 "http://www.example.com/" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
 jay.bird.com - fred [25/Dec/1998:17:45:35 +0000] "GET /~sret1/ HTTP/1.0" 200 1243
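
With input like the fixture above, parsed output will interleave normal entries with `unparsable` entries, so downstream code may want to separate the two. A minimal sketch follows, assuming each entry is a plain dictionary and that an unparsable entry carries only the `unparsable` key as documented in this commit. The helper name `split_unparsable` and the sample entries are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List, Tuple

Entry = Dict[str, Any]


def split_unparsable(entries: List[Entry]) -> Tuple[List[Entry], List[str]]:
    """Split entries into (parsed entries, original unparsable lines).

    Hypothetical helper; the order of each output list follows the input.
    """
    parsed: List[Entry] = []
    rejected: List[str] = []
    for entry in entries:
        if "unparsable" in entry:
            rejected.append(entry["unparsable"])
        else:
            parsed.append(entry)
    return parsed, rejected


if __name__ == "__main__":
    # Hand-written entries shaped like the documented schema (illustrative only).
    entries = [
        {"host": "tarpon.gulf.net", "status": 200, "bytes": 215},
        {"unparsable": "unparsable line"},
        {"host": "jay.bird.com", "status": 200, "bytes": 1243},
    ]
    good, bad = split_unparsable(entries)
    print(len(good), "parsed;", len(bad), "unparsable:", bad)
```

Checking for the key rather than for missing fields keeps the filter independent of which optional fields a valid entry happens to contain.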