mirror of https://github.com/kellyjonbrazil/jc.git synced 2025-06-17 00:07:37 +02:00
jc/docs/parsers/csv_s.md
2022-03-05 12:15:14 -08:00


jc.parsers.csv_s

jc - JSON Convert csv file streaming parser

This streaming parser outputs JSON Lines

The csv streaming parser attempts to detect the delimiter character automatically. If the delimiter cannot be detected, it defaults to a comma. The first row of the file must be a header row.

Note: The first 100 rows are read into memory to enable delimiter detection, then the rest of the rows are loaded lazily.
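The buffering behavior described in the note can be illustrated with the standard library alone. The following is a minimal sketch, not jc's actual implementation: the function name `iter_csv_rows` and the `sniff_rows` parameter are hypothetical, and it assumes `csv.Sniffer` is an acceptable stand-in for the delimiter detection.

```python
import csv
from itertools import chain, islice


def iter_csv_rows(lines, sniff_rows=100):
    """Yield one dict per data row, buffering only the first `sniff_rows` lines.

    Hypothetical helper for illustration; not part of jc.
    """
    lines = iter(lines)
    # Buffer the first lines so the delimiter can be guessed from a sample.
    head = list(islice(lines, sniff_rows))
    try:
        dialect = csv.Sniffer().sniff('\n'.join(head))
    except csv.Error:
        # Detection failed: fall back to the comma-delimited default.
        dialect = csv.excel
    # Chain the buffered lines with the untouched remainder of the iterator,
    # so rows after the sample are still pulled lazily.
    reader = csv.DictReader(chain(head, lines), dialect=dialect)
    for row in reader:
        yield dict(row)


for row in iter_csv_rows(["a;b", "1;2", "3;4"]):
    print(row)
```

Because only the sample is held in a list, memory use stays bounded by the sample size regardless of how long the input is.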

Usage (cli):

$ cat file.csv | jc --csv-s

Usage (module):

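import jc
# csv_output is a string holding the CSV text, header row first
# result is an iterable object (generator)
result = jc.parse('csv_s', csv_output.splitlines())
for item in result:
    print(item)  # do something with each row dictionary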

or

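import jc.parsers.csv_s
# csv_output is a string holding the CSV text, header row first
# result is an iterable object (generator)
result = jc.parsers.csv_s.parse(csv_output.splitlines())
for item in result:
    print(item)  # do something with each row dictionary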

Schema:

csv file converted to a Dictionary:
https://docs.python.org/3/library/csv.html

{
  "column_name1":     string,
  "column_name2":     string,

  # below object only exists if using -qq or ignore_exceptions=True

  "_jc_meta":
    {
      "success":      boolean,     # false if error parsing
      "error":        string,      # exists if "success" is false
      "line":         string       # exists if "success" is false
    }
}
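When `ignore_exceptions=True` is used, a consumer will likely want to separate rows that parsed from rows that did not. The sketch below works on plain dictionaries shaped like the schema above. The helper name `split_results` and the sample data are hypothetical and are not real jc output. Treating a row without `_jc_meta` as successful is an assumption of this sketch.

```python
def split_results(items):
    """Separate parsed rows from rows that failed to parse.

    Hypothetical helper for illustration; not part of jc.
    """
    good, bad = [], []
    for item in items:
        meta = item.get('_jc_meta', {})
        # Assumption: rows without "_jc_meta" are treated as successful.
        if meta.get('success', True):
            good.append(item)
        else:
            # "line" and "error" exist only when "success" is false.
            bad.append((meta.get('line'), meta.get('error')))
    return good, bad


# Made-up sample shaped like the schema (not real jc output)
sample = [
    {'column_name1': 'a', 'column_name2': 'b', '_jc_meta': {'success': True}},
    {'_jc_meta': {'success': False, 'error': 'example error', 'line': 'bad,line'}},
]
good, bad = split_results(sample)
print(len(good), bad)
```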

Examples:

$ cat homes.csv
"Sell", "List", "Living", "Rooms", "Beds", "Baths", "Age", "Acres"...
142, 160, 28, 10, 5, 3,  60, 0.28,  3167
175, 180, 18,  8, 4, 1,  12, 0.43,  4033
129, 132, 13,  6, 3, 1,  41, 0.33,  1471
...

$ cat homes.csv | jc --csv-s
{"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5"...}
{"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4"...}
{"Sell":"129","List":"132","Living":"13","Rooms":"6","Beds":"3"...}
...
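As the output above shows, every value is emitted as a string. If numeric types are needed downstream, the JSON Lines can be converted after parsing. This is a minimal sketch under stated assumptions: `to_number` is a hypothetical helper, and the sample text contains only the fields visible in the truncated example above.

```python
import json

# JSON Lines limited to the fields visible in the truncated example output
jsonl = '''{"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5"}
{"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4"}'''


def to_number(value):
    """Return an int or float when the string is numeric, else the original."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


# Each line is an independent JSON document, so parse line by line.
rows = [
    {key: to_number(val) for key, val in json.loads(line).items()}
    for line in jsonl.splitlines()
]
print(rows[0])
```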

parse

@add_jc_meta
def parse(data, raw=False, quiet=False, ignore_exceptions=False)

Main text parsing generator function. Returns an iterator object.

Parameters:

data:              (iterable)  line-based text data to parse
                               (e.g. sys.stdin or str.splitlines())

raw:               (boolean)   unprocessed output if True
quiet:             (boolean)   suppress warning messages if True
ignore_exceptions: (boolean)   ignore parsing exceptions if True

Yields:

Dictionary. Raw or processed structured data.

Returns:

Iterator object (generator)

Parser Information

Compatibility: linux, darwin, cygwin, win32, aix, freebsd

Version 1.3 by Kelly Brazil (kellyjonbrazil@gmail.com)