# Table of Contents

* [jc.parsers.universal](#jc.parsers.universal)
  * [simple\_table\_parse](#jc.parsers.universal.simple_table_parse)
  * [sparse\_table\_parse](#jc.parsers.universal.sparse_table_parse)

<a id="jc.parsers.universal"></a>
# jc.parsers.universal

jc - JSON Convert universal parsers

<a id="jc.parsers.universal.simple_table_parse"></a>
### simple\_table\_parse

```python
def simple_table_parse(data: List[str]) -> List[Dict]
```

Parse simple tables. There should be no blank cells. The last column
may contain data with spaces.

Example Table:

    col1      col2      col3      col4      col5
    apple     orange    pear      banana    my favorite fruits
    carrot    squash    celery    spinach   my favorite veggies
    chicken   beef      pork      eggs      my favorite proteins

    [{'col1': 'apple', 'col2': 'orange', 'col3': 'pear', 'col4':
    'banana', 'col5': 'my favorite fruits'}, {'col1': 'carrot', 'col2':
    'squash', 'col3': 'celery', 'col4': 'spinach', 'col5':
    'my favorite veggies'}, {'col1': 'chicken', 'col2': 'beef', 'col3':
    'pork', 'col4': 'eggs', 'col5': 'my favorite proteins'}]
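
The splitting rule described above can be sketched in a few lines. This is an illustrative stand-in written for this page, not jc's implementation, and the name `simple_table_sketch` is hypothetical. It assumes columns are separated by runs of whitespace and that only the last column contains spaces.

```python
from typing import Dict, List


def simple_table_sketch(data: List[str]) -> List[Dict[str, str]]:
    """Illustrative stand-in only; not the jc implementation."""
    headers = data[0].split()
    rows = []
    for line in data[1:]:
        # Split on whitespace at most len(headers) - 1 times so that the
        # last column keeps any spaces it contains.
        cells = line.strip().split(maxsplit=len(headers) - 1)
        rows.append(dict(zip(headers, cells)))
    return rows


table = """\
col1     col2     col3     col4     col5
apple    orange   pear     banana   my favorite fruits
carrot   squash   celery   spinach  my favorite veggies
chicken  beef     pork     eggs     my favorite proteins
"""

result = simple_table_sketch(table.splitlines())
print(result[0])
```

Because the split count is capped by the number of headers, a blank cell in any earlier column would shift the remaining values to the left, which is why this kind of table must have no blank cells.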

Parameters:

    data:   (list)   Text data to parse that has been split into lines
                     via .splitlines(). Item 0 must be the header row.
                     Any spaces in header names should be changed to
                     underscore '_'. You should also ensure headers are
                     lowercase by using .lower().

                     Also, ensure there are no blank lines (list items)
                     in the data.

Returns:

    List of Dictionaries
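
A minimal sketch of the input preparation listed under Parameters, assuming `df`-style output in which one header name (`Mounted on`) contains a space. The helper `prepare_simple_table` and its `renames` argument are hypothetical and not part of jc; only the commented-out call at the end refers to the function documented here.

```python
from typing import Dict, List


def prepare_simple_table(text: str, renames: Dict[str, str]) -> List[str]:
    """Hypothetical helper: get raw text ready for simple_table_parse."""
    # Split into lines and drop blank ones.
    lines = [line for line in text.splitlines() if line.strip()]
    # Lowercase the header row, then join multi-word names with '_'.
    header = lines[0].lower()
    for old, new in renames.items():
        header = header.replace(old, new)
    lines[0] = header
    return lines


raw = """Filesystem Size Used Avail Use% Mounted on

/dev/sda1 50G 20G 30G 40% /
tmpfs 2G 0 2G 0% /dev/shm
"""

lines = prepare_simple_table(raw, {"mounted on": "mounted_on"})
print(lines[0])

# With jc installed, the prepared list would then be passed on:
# import jc.parsers.universal
# rows = jc.parsers.universal.simple_table_parse(lines)
```

The rename keys are written in lowercase because the header is lowercased before the replacements run.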

<a id="jc.parsers.universal.sparse_table_parse"></a>

### sparse\_table\_parse

```python
def sparse_table_parse(data: List[str], delim: str = '\u2063') -> List[Dict]
```

Parse tables with missing column data or with spaces in column data.
Blank cells are converted to None in the resulting dictionary. Data
elements must line up within column boundaries.

Example Table:

    col1         col2      col3      col4         col5
    apple        orange              fuzzy peach  my favorite fruits
    green beans            celery    spinach      my favorite veggies
    chicken      beef                brown eggs   my favorite proteins

    [{'col1': 'apple', 'col2': 'orange', 'col3': None, 'col4':
    'fuzzy peach', 'col5': 'my favorite fruits'}, {'col1':
    'green beans', 'col2': None, 'col3': 'celery', 'col4': 'spinach',
    'col5': 'my favorite veggies'}, {'col1': 'chicken', 'col2': 'beef',
    'col3': None, 'col4': 'brown eggs', 'col5': 'my favorite proteins'}]
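
To illustrate what "line up within column boundaries" means, here is a rough sketch that slices each row at the offsets where the header names start. It is a simplified stand-in, not jc's implementation (which works with the `delim` argument), and `sparse_table_sketch` is a hypothetical name. The table is built with fixed-width formatting so the assumed column widths are explicit.

```python
import re
from typing import Dict, List, Optional


def sparse_table_sketch(data: List[str]) -> List[Dict[str, Optional[str]]]:
    """Illustrative stand-in only; not the jc implementation."""
    # Column names and the offset at which each one starts in the header.
    matches = list(re.finditer(r"\S+", data[0]))
    names = [m.group() for m in matches]
    starts = [m.start() for m in matches]
    # A column ends where the next begins; the last runs to end of line.
    ends = starts[1:] + [None]

    rows = []
    for line in data[1:]:
        row = {}
        for name, start, end in zip(names, starts, ends):
            cell = line[start:end].strip()
            row[name] = cell or None  # blank cell becomes None
        rows.append(row)
    return rows


# Assumed column widths: 13, 10, 10, 13, then the rest of the line.
fmt = "{:<13}{:<10}{:<10}{:<13}{}".format
table = [
    fmt("col1", "col2", "col3", "col4", "col5"),
    fmt("apple", "orange", "", "fuzzy peach", "my favorite fruits"),
    fmt("green beans", "", "celery", "spinach", "my favorite veggies"),
    fmt("chicken", "beef", "", "brown eggs", "my favorite proteins"),
]

result = sparse_table_sketch(table)
print(result[1])
```

Since cells are located by position rather than by counting whitespace-separated tokens, values such as `green beans` stay intact and empty positions can be reported as `None`.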

Parameters:

    data:   (list)   Text data to parse that has been split into lines
                     via .splitlines(). Item 0 must be the header row.
                     Any spaces in header names should be changed to
                     underscore '_'. You should also ensure headers are
                     lowercase by using .lower(). Do not change the
                     position of header names as the positions are used
                     to find the data.

                     Also, ensure there are no blank lines (list items)
                     in the data.

    delim:  (string) Delimiter to use. By default `\u2063`
                     (invisible separator) is used since it is unlikely
                     to ever be seen in terminal output. You can change
                     this for troubleshooting purposes or if there is a
                     delimiter conflict with your data.

Returns:

    List of Dictionaries
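
The two cautions under Parameters (keep header positions fixed, and watch for delimiter conflicts) can be sketched as follows. Both helpers, `normalize_header` and `delim_conflicts`, are hypothetical and not part of jc, and the sample data is invented for illustration. Only the commented-out call refers to the function documented here.

```python
from typing import List

DEFAULT_DELIM = "\u2063"  # default shown in the sparse_table_parse signature


def normalize_header(header: str, multiword: List[str]) -> str:
    """Hypothetical helper: lowercase the header and join multi-word names
    with '_'. Both changes are one-for-one character swaps, so every header
    name stays at its original position."""
    header = header.lower()
    for name in multiword:
        lowered = name.lower()
        header = header.replace(lowered, lowered.replace(" ", "_"))
    return header


def delim_conflicts(lines: List[str], delim: str = DEFAULT_DELIM) -> bool:
    """Hypothetical check: True if the delimiter already occurs in the data."""
    return any(delim in line for line in lines)


raw = [
    "Filesystem  Size  Mounted on",
    "/dev/sda1   50G   /",
    "tmpfs             /dev/shm",
]

lines = [line for line in raw if line.strip()]
lines[0] = normalize_header(lines[0], ["Mounted on"])

print(lines[0])
print(delim_conflicts(lines))

# With jc installed, and only if a conflict were found, another delimiter
# could be supplied (the alternative character here is an arbitrary choice):
# import jc.parsers.universal
# rows = jc.parsers.universal.sparse_table_parse(lines, delim="\u241f")
```

Replacing a space with an underscore, rather than removing it, is what keeps the header the same length and the columns aligned with the data below it.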