mirror of
https://github.com/kellyjonbrazil/jc.git
synced 2025-06-23 00:29:59 +02:00
doc update for streaming CSV parser
This commit is contained in:
@ -115,6 +115,7 @@ The JSON output can be compact (default) or pretty formatted with the `-p` optio
|
|||||||
- `--crontab` enables the `crontab` command and file parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/crontab))
|
- `--crontab` enables the `crontab` command and file parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/crontab))
|
||||||
- `--crontab-u` enables the `crontab` file parser with user support ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/crontab_u))
|
- `--crontab-u` enables the `crontab` file parser with user support ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/crontab_u))
|
||||||
- `--csv` enables the CSV file parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/csv))
|
- `--csv` enables the CSV file parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/csv))
|
||||||
|
- `--csv-s` enables the CSV file streaming parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/csv_s))
|
||||||
- `--date` enables the `date` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/date))
|
- `--date` enables the `date` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/date))
|
||||||
- `--df` enables the `df` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/df))
|
- `--df` enables the `df` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/df))
|
||||||
- `--dig` enables the `dig` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/dig))
|
- `--dig` enables the `dig` command parser ([documentation](https://kellyjonbrazil.github.io/jc/docs/parsers/dig))
|
||||||
|
75
docs/parsers/csv_s.md
Normal file
75
docs/parsers/csv_s.md
Normal file
@ -0,0 +1,75 @@
|
|||||||
|
[Home](https://kellyjonbrazil.github.io/jc/)
|
||||||
|
|
||||||
|
# jc.parsers.csv_s
|
||||||
|
jc - JSON CLI output utility `csv` file streaming parser
|
||||||
|
|
||||||
|
The `csv` streaming parser will attempt to automatically detect the delimiter character. If the delimiter cannot be detected it will default to comma. The first row of the file must be a header row.
|
||||||
|
|
||||||
|
Note: The first 100 rows are read into memory to enable delimiter detection, then the rest of the rows are loaded lazily.
|
||||||
|
|
||||||
|
Usage (cli):
|
||||||
|
|
||||||
|
$ cat file.csv | jc --csv-s
|
||||||
|
|
||||||
|
Usage (module):
|
||||||
|
|
||||||
|
import jc.parsers.csv_s
|
||||||
|
result = jc.parsers.csv_s.parse(csv_output)
|
||||||
|
|
||||||
|
Schema:
|
||||||
|
|
||||||
|
csv file converted to a Dictionary: https://docs.python.org/3/library/csv.html
|
||||||
|
|
||||||
|
{
|
||||||
|
"column_name1": string,
|
||||||
|
"column_name2": string
|
||||||
|
}
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
$ cat homes.csv
|
||||||
|
"Sell", "List", "Living", "Rooms", "Beds", "Baths", "Age", "Acres", "Taxes"
|
||||||
|
142, 160, 28, 10, 5, 3, 60, 0.28, 3167
|
||||||
|
175, 180, 18, 8, 4, 1, 12, 0.43, 4033
|
||||||
|
129, 132, 13, 6, 3, 1, 41, 0.33, 1471
|
||||||
|
...
|
||||||
|
|
||||||
|
$ cat homes.csv | jc --csv-s
|
||||||
|
{"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5","Baths":"3","Age":"60","Acres":"0.28","Taxes":"3167"}
|
||||||
|
{"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4","Baths":"1","Age":"12","Acres":"0.43","Taxes":"4033"}
|
||||||
|
{"Sell":"129","List":"132","Living":"13","Rooms":"6","Beds":"3","Baths":"1","Age":"41","Acres":"0.33","Taxes":"1471"}
|
||||||
|
...
|
||||||
|
|
||||||
|
|
||||||
|
## info
|
||||||
|
```python
|
||||||
|
info()
|
||||||
|
```
|
||||||
|
Provides parser metadata (version, author, etc.)
|
||||||
|
|
||||||
|
## parse
|
||||||
|
```python
|
||||||
|
parse(data, raw=False, quiet=False, ignore_exceptions=False)
|
||||||
|
```
|
||||||
|
|
||||||
|
Main text parsing generator function. Returns an iterator object.
|
||||||
|
|
||||||
|
Parameters:
|
||||||
|
|
||||||
|
data: (iterable) line-based text data to parse (e.g. sys.stdin or str.splitlines())
|
||||||
|
raw: (boolean) output preprocessed JSON if True
|
||||||
|
quiet: (boolean) suppress warning messages if True
|
||||||
|
ignore_exceptions: (boolean) ignore parsing exceptions if True
|
||||||
|
|
||||||
|
Yields:
|
||||||
|
|
||||||
|
Dictionary. Raw or processed structured data.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
|
||||||
|
Iterator object
|
||||||
|
|
||||||
|
## Parser Information
|
||||||
|
Compatibility: linux, darwin, cygwin, win32, aix, freebsd
|
||||||
|
|
||||||
|
Version 1.0 by Kelly Brazil (kellyjonbrazil@gmail.com)
|
@ -1,12 +1,12 @@
|
|||||||
"""jc - JSON CLI output utility `csv` file streaming parser
|
"""jc - JSON CLI output utility `csv` file streaming parser
|
||||||
|
|
||||||
The `csv` parser will attempt to automatically detect the delimiter character. If the delimiter cannot be detected it will default to comma. The first row of the file must be a header row.
|
The `csv` streaming parser will attempt to automatically detect the delimiter character. If the delimiter cannot be detected it will default to comma. The first row of the file must be a header row.
|
||||||
|
|
||||||
Note: The first 100 rows are read into memory to enable delimiter detection. Then the rest of the rows are loaded lazily.
|
Note: The first 100 rows are read into memory to enable delimiter detection, then the rest of the rows are loaded lazily.
|
||||||
|
|
||||||
Usage (cli):
|
Usage (cli):
|
||||||
|
|
||||||
$ cat file.csv | jc --csv
|
$ cat file.csv | jc --csv-s
|
||||||
|
|
||||||
Usage (module):
|
Usage (module):
|
||||||
|
|
||||||
@ -31,7 +31,7 @@ Examples:
|
|||||||
129, 132, 13, 6, 3, 1, 41, 0.33, 1471
|
129, 132, 13, 6, 3, 1, 41, 0.33, 1471
|
||||||
...
|
...
|
||||||
|
|
||||||
$ cat homes.csv | jc --csv_s
|
$ cat homes.csv | jc --csv-s
|
||||||
{"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5","Baths":"3","Age":"60","Acres":"0.28","Taxes":"3167"}
|
{"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5","Baths":"3","Age":"60","Acres":"0.28","Taxes":"3167"}
|
||||||
{"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4","Baths":"1","Age":"12","Acres":"0.43","Taxes":"4033"}
|
{"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4","Baths":"1","Age":"12","Acres":"0.43","Taxes":"4033"}
|
||||||
{"Sell":"129","List":"132","Living":"13","Rooms":"6","Beds":"3","Baths":"1","Age":"41","Acres":"0.33","Taxes":"1471"}
|
{"Sell":"129","List":"132","Living":"13","Rooms":"6","Beds":"3","Baths":"1","Age":"41","Acres":"0.33","Taxes":"1471"}
|
||||||
@ -75,48 +75,6 @@ def _process(proc_data):
|
|||||||
return proc_data
|
return proc_data
|
||||||
|
|
||||||
|
|
||||||
def old_parse(data, raw=False, quiet=False):
|
|
||||||
"""
|
|
||||||
Main text parsing function
|
|
||||||
|
|
||||||
Parameters:
|
|
||||||
|
|
||||||
data: (string) text data to parse
|
|
||||||
raw: (boolean) output preprocessed JSON if True
|
|
||||||
quiet: (boolean) suppress warning messages if True
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
|
|
||||||
List of Dictionaries. Raw or processed structured data.
|
|
||||||
"""
|
|
||||||
if not quiet:
|
|
||||||
jc.utils.compatibility(__name__, info.compatible)
|
|
||||||
|
|
||||||
raw_output = []
|
|
||||||
cleandata = data.splitlines()
|
|
||||||
|
|
||||||
# Clear any blank lines
|
|
||||||
cleandata = list(filter(None, cleandata))
|
|
||||||
|
|
||||||
if jc.utils.has_data(data):
|
|
||||||
|
|
||||||
dialect = None
|
|
||||||
try:
|
|
||||||
dialect = csv.Sniffer().sniff(data[:1024])
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
reader = csv.DictReader(cleandata, dialect=dialect)
|
|
||||||
|
|
||||||
for row in reader:
|
|
||||||
raw_output.append(row)
|
|
||||||
|
|
||||||
if raw:
|
|
||||||
return raw_output
|
|
||||||
else:
|
|
||||||
return _process(raw_output)
|
|
||||||
|
|
||||||
|
|
||||||
def parse(data, raw=False, quiet=False, ignore_exceptions=False):
|
def parse(data, raw=False, quiet=False, ignore_exceptions=False):
|
||||||
"""
|
"""
|
||||||
Main text parsing generator function. Returns an iterator object.
|
Main text parsing generator function. Returns an iterator object.
|
||||||
@ -153,7 +111,7 @@ def parse(data, raw=False, quiet=False, ignore_exceptions=False):
|
|||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
|
|
||||||
# chain `temp_list` and `data` together to lazy load all of the CSV data
|
# chain `temp_list` and `data` together to lazy load the rest of the CSV data
|
||||||
new_data = itertools.chain(temp_list, data)
|
new_data = itertools.chain(temp_list, data)
|
||||||
reader = csv.DictReader(new_data, dialect=dialect)
|
reader = csv.DictReader(new_data, dialect=dialect)
|
||||||
|
|
||||||
|
7
man/jc.1
7
man/jc.1
@ -1,4 +1,4 @@
|
|||||||
.TH jc 1 2021-10-23 1.17.1 "JSON CLI output utility"
|
.TH jc 1 2021-10-24 1.17.1 "JSON CLI output utility"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
jc \- JSONifies the output of many CLI tools and file-types
|
jc \- JSONifies the output of many CLI tools and file-types
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
@ -62,6 +62,11 @@ Parsers:
|
|||||||
\fB--csv\fP
|
\fB--csv\fP
|
||||||
CSV file parser
|
CSV file parser
|
||||||
|
|
||||||
|
.TP
|
||||||
|
.B
|
||||||
|
\fB--csv-s\fP
|
||||||
|
CSV file streaming parser
|
||||||
|
|
||||||
.TP
|
.TP
|
||||||
.B
|
.B
|
||||||
\fB--date\fP
|
\fB--date\fP
|
||||||
|
Reference in New Issue
Block a user