CLI: papyri/__init__.py
Use <node>.kind to get the type of a node.
>>> title_node
<Node kind=title, start_point=(1, 0), end_point=(1, 5)>
>>> adornment_node
<Node kind="adornment", start_point=(2, 0), end_point=(2, 5)>
With this approach the type of each object is only known at runtime, so every consumer has to check node kinds dynamically. Static typing is not possible, which means mypy cannot verify the code.
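To make the trade-off concrete, here is a minimal sketch. The classes below (KindNode, Title, Adornment) are hypothetical stand-ins, not papyri or tree-sitter types. With one generic node class carrying a kind string, a wrong kind is only caught when the code runs; with one class per kind, a checker such as mypy can reject the wrong argument before execution.

```python
from dataclasses import dataclass


@dataclass
class KindNode:
    """Hypothetical generic parser node identified by a string."""
    kind: str
    text: str


def title_text_dynamic(node: KindNode) -> str:
    # A type checker only sees KindNode, so a wrong kind is caught at runtime.
    if node.kind != "title":
        raise TypeError(f"expected a title node, got {node.kind!r}")
    return node.text


@dataclass
class Title:
    """Hypothetical typed node: the class itself encodes the kind."""
    text: str


@dataclass
class Adornment:
    text: str


def title_text_static(node: Title) -> str:
    # Passing an Adornment here would be flagged by a static checker.
    return node.text


title = KindNode(kind="title", text="NumPy")
adornment = KindNode(kind="adornment", text="=====")
dynamic_result = title_text_dynamic(title)
static_result = title_text_static(Title("NumPy"))
```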
We can either modify ts.py, or create a ts2.py that does not emit the AST in take2.py and instead emits a new AST in, let's say, myst.py, so that we can then replace things progressively.
gen command
Generate documentation for a given package. The first item should be the root package to import; if subpackages need to be analyzed but are not accessible from the root, pass them as extra arguments.
if examples:
    g.collect_examples_out()
if api:
    g.collect_api_docs(target_module_name)
if narrative:
    g.collect_narrative_docs()
The gen command does the parsing using tree-sitter and returns the objects from take2.py.
papyri gen examples/numpy.toml: the gen command takes a TOML configuration file to generate documentation for a given package. The output is written to the ~/.papyri/data directory.
gen_main: main entry point to generate docbundle files.
collect_api_docs:
_get_collector constructs a depth-first search collector that will try to find all the objects it can. For numpy, collected had 2667 items, for example:
('numpy', <module 'numpy' ..>)
('numpy.distutils', <module 'numpy.distutils'..>)
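The depth-first collection can be sketched roughly as follows. This is not papyri's collector; the function names are hypothetical, and the standard-library json package stands in for numpy so the example runs anywhere. It walks modules and classes that belong to the root package and records each object under a qualified name.

```python
import inspect
import json  # stand-in package; the notes above use numpy
from types import ModuleType
from typing import Any, Dict


def _belongs_to(obj: Any, root_name: str) -> bool:
    """True if obj is defined inside the root package."""
    owner = obj.__name__ if inspect.ismodule(obj) else getattr(obj, "__module__", None)
    return isinstance(owner, str) and (
        owner == root_name or owner.startswith(root_name + ".")
    )


def collect(root: ModuleType) -> Dict[str, Any]:
    """Depth-first walk from root, keyed by qualified name."""
    root_name = root.__name__
    collected: Dict[str, Any] = {}
    seen = set()
    stack = [(root_name, root)]
    while stack:
        qa, obj = stack.pop()  # LIFO pop gives depth-first order
        if id(obj) in seen:
            continue  # the same object can be reachable under several names
        seen.add(id(obj))
        collected[qa] = obj
        if not (inspect.ismodule(obj) or inspect.isclass(obj)):
            continue  # functions and methods are leaves
        for name, child in list(vars(obj).items()):
            if name.startswith("_"):
                continue
            is_api = (
                inspect.ismodule(child)
                or inspect.isclass(child)
                or inspect.isroutine(child)
            )
            if is_api and _belongs_to(child, root_name):
                stack.append((f"{qa}.{name}", child))
    return collected


collected = collect(json)
print(len(collected), "items collected")
```

Because an object is recorded only under the first name it is reached by, re-exported objects appear once; a real collector may need a more careful notion of the canonical qualified name.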
helper_1 is called on the fully qualified name (qa) and target_item, as in the module, for all the collected items. It returns the following three items:
item_docstring: docstring of the module.
arbitrary: list of papyri.take2.Section. Each section has a title and the items inside that section; this is basically a section in the documentation.
api_object: papyri.gen.APIObjectInfo, a structured object that contains all the information about the parsed documentation. In fact, api_object.parsed is equal to arbitrary.
prepare_doc_for_one_object: gets documentation information for one Python object. It returns the following:
~/.papyri/data/numpy_1.23.4/module/<collected_item.json>
>>> tree
<tree_sitter.Tree object at 0x1090cf3f0>
>>> tree.root_node
<Node kind=document, start_point=(0, 0), end_point=(104, 0)>
>>> tree.root_node.children[0]
<Node kind=section, start_point=(1, 0), end_point=(2, 5)>
>>> tree.root_node.children[0].text
b'NumPy\n====='
>>> tree.root_node.children[1].text
b'Provides\n 1. An array object of arbitrary homogeneous items\n 2. Fast mathematical operations over arrays\n 3. Linear Algebra, Fourier Transforms, Random Number Generation'
>>> tree.root_node.children[2].text
b'How to use the documentation\n----------------------------'
>>> tree.root_node.children[3].text
b'Documentation is available in two forms: docstrings provided\nwith the code, and a loose standing reference guide, available from\n`the NumPy homepage <https://numpy.org>`_.'
>>> tree.root_node.children[0].children
[<Node kind=title, start_point=(1, 0), end_point=(1, 5)>, <Node kind="adornment", start_point=(2, 0), end_point=(2, 5)>]
>>> tree.root_node.children[0].children[0]
<Node kind=title, start_point=(1, 0), end_point=(1, 5)>
>>> tree.root_node.children[0].children[0].children
[<Node kind="text", start_point=(1, 0), end_point=(1, 5)>]
>>> tree.root_node.children[0].children[0].children[0]
<Node kind="text", start_point=(1, 0), end_point=(1, 5)>
>>> tree.root_node.children[0].children[0].children[0].children
[]
>>> tree.root_node.children[0].children[0].children[0].text
b'NumPy'
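The manual drilling through children[0] in the session above can be replaced by a recursive walk. The sketch below uses a hypothetical FakeNode class that only mimics the kind, text and children attributes seen in the session; it is not the tree-sitter Node type.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class FakeNode:
    """Hypothetical stand-in with the attributes used in the session above."""
    kind: str
    text: bytes
    children: List["FakeNode"] = field(default_factory=list)


def walk(node: FakeNode, depth: int = 0) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (depth, kind, text) for node and all descendants, parents first."""
    yield depth, node.kind, node.text
    for child in node.children:
        yield from walk(child, depth + 1)


# Rebuild the first section from the session: section -> title -> text, plus adornment.
text = FakeNode("text", b"NumPy")
title = FakeNode("title", b"NumPy", [text])
adornment = FakeNode("adornment", b"=====")
section = FakeNode("section", b"NumPy\n=====", [title, adornment])

outline = [(depth, kind) for depth, kind, _ in walk(section)]
for depth, kind, raw in walk(section):
    print("  " * depth + kind, raw)
```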
Parsing flow (ts.py and take2.py):
tree-sitter parses the text, and the root is wrapped in a Node object.
The Node object is then passed to TSVisitor.
The visit_document method of TSVisitor is called, which eventually calls the visit method, which visits all the children.
Each child has a tree-sitter type (c.type), which we call kind. For each tree-sitter type we have defined a method in the TSVisitor class named visit_{kind}.
visit calls the matching visit_{kind}, which parses the children and returns the respective object from take2.py.
nest_sections then puts things under Section nodes.
ingest command
Example: papyri ingest ~/.papyri/data/numpy_1.23.4
Given paths to a docbundle folder, ingest it into the known libraries.
Each document is encoded with encoder.encode(doc_blob) and stored under the ~/.papyri/ingest directory, alongside a database saved as papyri.db.
papyri.db
The papyri.db database contains the following tables:
main.destinations
main.documents
main.links
This is managed by the graphstore module (a class abstraction over the filesystem to store documents in a graph-like structure).
destinations
id | package | version | category | identifier |
---|---|---|---|---|
1 | numpy | 1.23.4 | module | numpy.ndarray |
2 | current-module | current-version | to-resolve | ogrid |
3 | builtins | * | module | builtins.tuple |
4 | numpy | 1.23.4 | module | numpy.indices |
5 | current-module | current-version | to-resolve | mgrid |
6 | numpy | * | module | numpy.ndarray.reshape |
documents
id | package | version | category | identifier |
---|---|---|---|---|
1 | numpy | 1.23.4 | assets | fig-numpy.kaiser-1-ce19905e.png |
2 | numpy | 1.23.4 | assets | fig-numpy.histogram2d-0-3819e7bf.png |
30 | numpy | 1.23.4 | module | numpy.polynomial.hermite.hermfit |
31 | numpy | 1.23.4 | module | numpy.lib.function_base._i0_dispatcher |
32 | numpy | 1.23.4 | module | numpy.lib.index_tricks.MGridClass |
links
id | source | dest | metadata |
---|---|---|---|
1 | 29 | 1 | debug |
2 | 29 | 2 | debug |
3 | 29 | 3 | debug |
4 | 29 | 4 | debug |
5 | 29 | 5 | debug |
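To see how the three tables fit together, the sketch below rebuilds a few of the rows above in an in-memory SQLite database and joins them. The column types, and the reading of links.source as a documents id and links.dest as a destinations id, are assumptions based on the column names, not the actual graphstore schema. Document 29 is referenced by the links but not listed above, so it gets placeholder values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE destinations (
        id INTEGER PRIMARY KEY, package TEXT, version TEXT,
        category TEXT, identifier TEXT
    );
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY, package TEXT, version TEXT,
        category TEXT, identifier TEXT
    );
    CREATE TABLE links (
        id INTEGER PRIMARY KEY, source INTEGER, dest INTEGER, metadata TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO destinations VALUES (?, ?, ?, ?, ?)",
    [
        (1, "numpy", "1.23.4", "module", "numpy.ndarray"),
        (2, "current-module", "current-version", "to-resolve", "ogrid"),
        (3, "builtins", "*", "module", "builtins.tuple"),
    ],
)
# Placeholder row: the real fields of document 29 are not shown in the notes.
conn.execute(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?)",
    (29, None, None, None, "<document 29>"),
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?)",
    [(1, 29, 1, "debug"), (2, 29, 2, "debug"), (3, 29, 3, "debug")],
)

# For each link, show which document points at which destination.
rows = conn.execute(
    """
    SELECT d.identifier, t.package, t.identifier
    FROM links AS l
    JOIN documents AS d ON d.id = l.source
    JOIN destinations AS t ON t.id = l.dest
    ORDER BY l.id
    """
).fetchall()

# Destinations still marked as belonging to "current-module" are unresolved.
unresolved = [ident for _, pkg, ident in rows if pkg == "current-module"]
print(rows)
print(unresolved)
```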
render command
Example: papyri render
This does static rendering of all the given files.
A template (html.tpl.j2) is used to render the HTML.
Output path: ~/.papyri/html/p/numpy/1.23.4/api/<qa>.html
DocBlob Attributes: (Understanding one of the Nodes)
>>> doc_blob.content.keys()
dict_keys(['Attributes', 'Extended Summary', 'Methods', 'Notes', 'Other Parameters', 'Parameters', 'Raises', 'Receives', 'Returns', 'Summary', 'Warnings', 'Warns', 'Yields'])
>>> returns = doc_blob.content['Returns']
>>> type(returns)
<class 'papyri.take2.Section'>
>>> type(returns.children[0])
<class 'papyri.take2.Parameters'>
>>> parameters = returns.children[0]
>>> type(parameters.children[0])
<class 'papyri.take2.Param'>
>>> param = parameters.children[0]
>>> type(param.children[0])
<class 'papyri.take2.Paragraph'>
>>> paragraph = param.children[0]
>>> type(paragraph.children[0])
<class 'papyri.take2.Words'>
>>> words = paragraph.children[0]
>>> words.value
'Chebyshev coefficients ordered from low to high. If '
[Returns - Section]
|
V
[Parameters]
|
V
[Param]
|
V
[Paragraph]
|
V
[Words]
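The chain in the diagram can also be recovered programmatically by following the first child at every level. The classes below are hypothetical stand-ins that reuse the names from the session; they are not the papyri.take2 classes.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Holder:
    """Hypothetical base for nodes that hold children."""
    children: List[Any] = field(default_factory=list)


class Section(Holder): ...
class Parameters(Holder): ...
class Param(Holder): ...
class Paragraph(Holder): ...


@dataclass
class Words:
    value: str


def first_child_chain(node: Any) -> List[str]:
    """Follow children[0] down to a leaf, recording each class name."""
    chain = []
    while True:
        chain.append(type(node).__name__)
        children = getattr(node, "children", None)
        if not children:
            return chain
        node = children[0]


returns = Section(
    [Parameters([Param([Paragraph([Words("Chebyshev coefficients ordered from low to high. If ")])])])]
)
chain = first_child_chain(returns)
print(" -> ".join(chain))
```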
serve command
Example: papyri serve
This serves the rendered HTML files.
myst-spec is in development; any structures or features present in the JSON schema may change at any time without notice.
Q1: What's rst.so?
pth = str(Path(__file__).parent / "rst.so")
RST = Language(pth, "rst")
parser = Parser()
parser.set_language(RST)
Q2: To replace the current AST with MyST, would we need to find an equivalent in the MyST spec for (almost) every item in the current AST, with some manual additions?
Understand how papyri works.
Add a myst.py (or take3.py) that returns the MyST AST after tree-sitter parsing.
Trying to replace Word/Words with Text from the MyST AST
The Words node in the current AST is different from the Text node in the MyST AST. A Word is a single word (several are merged into one Words node), whereas Text is a continuous block of words.
We need to figure out a way for that single element to pass all the assertions during the construction of the tree, for example:
# Ref: papyri/tree.py:366
# c is Myst Text and Node is the one defined in take2.py
# Whereas c is an instance of the Node defined in myst_ast.py
assert isinstance(c, Node), c
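The failure mode can be reproduced in isolation. In the sketch below, Take2Node and MystNode are hypothetical stand-ins for the two unrelated Node base classes; the relaxed check is only one possible transition step, not what papyri does.

```python
class Take2Node:
    """Hypothetical stand-in for the Node base class in take2.py."""


class Words(Take2Node):
    def __init__(self, value: str) -> None:
        self.value = value


class MystNode:
    """Hypothetical stand-in for the Node base class in myst_ast.py."""


class MText(MystNode):
    def __init__(self, value: str) -> None:
        self.value = value


def check_strict(c: object) -> bool:
    # Same shape as the assertion above: only one hierarchy is accepted.
    return isinstance(c, Take2Node)


def check_relaxed(c: object) -> bool:
    # Accept either base class while both ASTs coexist.
    return isinstance(c, (Take2Node, MystNode))


old = Words("hello")
new = MText("hello")
print(check_strict(old), check_strict(new), check_relaxed(new))
```

An alternative would be a shared base class for both hierarchies, which keeps the assertion unchanged at the cost of coupling the two modules.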
Trying it on numpy.distutils
papyri gen examples/numpy.toml --only numpy.distutils
[<Section:
|children: [<Paragraph:
| |children: [An enhanced distutils, providing support for Fortran compilers, for BLAS, LAPACK and other common libraries for numerical computing, and more.]
| |>, <Paragraph:
| |children: [Public submodules are: ]
| |>, <BlockVerbatim '47'>, <Paragraph:
| |children: [For details, please see the , *Packaging*, and , *NumPy Distutils User Guide*, sections of the NumPy Reference Guide.]
| |>, <Paragraph:
| |children: [For configuring the preference for and location of libraries like BLAS and LAPACK, and for setting include paths and similar build options, please see , <Verbatim ``site.cfg.example``>, in the root of the NumPy repository or sdist.]
| |>]
|title: None
|level: 0
|target: None
|>]
# nss - above mentioned structure
>>> nss[0].children[0].children[0]
An enhanced distutils, providing support for Fortran compilers, for BLAS, LAPACK and other common libraries for numerical computing, and more.
>>> type(nss[0].children[0].children[0])
<class 'papyri.take2.Words'>
>>> nss
[<Section:
|children: [<Paragraph:
| |children: [<MText:
| | |value: 'An enhanced distutils, providing support for Fortran compilers, for BLAS, LAPACK and other common libraries for numerical computing, and more.'
| | |>]
| |>, <Paragraph:
| |children: [<MText:
| | |value: 'Public submodules are '
| | |>]
| |>, <BlockVerbatim '47'>, <Paragraph:
| |children: [<MText:
| | |value: 'For details, please see the '
| | |>, *Packaging*, <MText:
| | |value: ' and '
| | |>, *NumPy Distutils User Guide*, <MText:
| | |value: ' sections of the NumPy Reference Guide.'
| | |>]
| |>, <Paragraph:
| |children: [<MText:
| | |value: 'For configuring the preference for and location of libraries like BLAS and LAPACK, and for setting include paths and similar build options, please see '
| | |>, <Verbatim ``site.cfg.example``>, <MText:
| | |value: ' in the root of the NumPy repository or sdist.'
| | |>]
| |>]
|title: None
|level: 0
|target: None
|>
]
>>> type(root)
<class 'papyri.ts.Node'>
tsv.visit_document(root)
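The call above enters a visitor that dispatches on node kind through methods named visit_{kind}. A minimal sketch of that pattern follows; FakeNode and SketchVisitor are hypothetical names, and the tuples returned here stand in for the real AST objects.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FakeNode:
    """Hypothetical parser node with a kind string and children."""
    kind: str
    text: str = ""
    children: List["FakeNode"] = field(default_factory=list)


class SketchVisitor:
    def visit(self, node: FakeNode) -> List[Any]:
        """Visit every child via the method named after its kind."""
        results: List[Any] = []
        for child in node.children:
            method = getattr(self, f"visit_{child.kind}", None)
            if method is None:
                raise NotImplementedError(f"no handler for kind {child.kind!r}")
            results.extend(method(child))
        return results

    def visit_document(self, node: FakeNode) -> List[Any]:
        return self.visit(node)

    def visit_paragraph(self, node: FakeNode) -> List[Any]:
        return [("Paragraph", self.visit(node))]

    def visit_text(self, node: FakeNode) -> List[Any]:
        return [("Text", node.text)]


root = FakeNode(
    "document",
    children=[
        FakeNode("paragraph", children=[FakeNode("text", "Public submodules are: ")]),
    ],
)
parsed = SketchVisitor().visit_document(root)
print(parsed)
```

Raising on an unknown kind makes missing handlers visible immediately, which is useful while handlers are being migrated one at a time.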
Word(s) are compressed into a Words object in the visit_paragraph function.
.. plot::
   :format: png

   import matplotlib.pyplot as plt
   …
Once parsed by tree-sitter:
Directive:
  children: #< list of N elements.
    Options:
      format: png
    Code:
      Text:
        value "import matplotlib.pyplot as plt."
.. plot::

   import matplotlib.pyplot as plt
   …
Once parsed by tree-sitter:
Directive:
  children: #< list of N elements.
    Code:
      Text:
        value "import matplotlib.pyplot as plt."
You don't know whether the first child is an Options node or not.
With named fields, that is explicit:
Directive:
  Options: Option or None
  Code: Words.
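A minimal sketch of such a fixed-field layout, assuming a hypothetical Directive dataclass (the field names and the plain dict for options are illustrative choices, not papyri's definitions):

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Directive:
    """Hypothetical layout: options is always present, possibly None."""
    name: str
    options: Optional[Dict[str, str]]
    code: str


with_options = Directive("plot", {"format": "png"}, "import matplotlib.pyplot as plt")
without_options = Directive("plot", None, "import matplotlib.pyplot as plt")


def output_format(d: Directive, default: str = "default") -> str:
    # No need to inspect children[0] to find out whether options exist.
    if d.options is None:
        return default
    return d.options.get("format", default)


print(output_format(with_options), output_format(without_options))
```

Because options has a declared type of Optional, a static checker forces callers to handle the None case, which a positional children list cannot express.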
Test with the papyri gen examples/numpy.toml --only numpy.distutils call.
Replace BlockVerbatim with Code?
To Replace/Remove remaining nodes:
Verbatim
Directive
Link
Math
BlockMath
SubstitutionDef
SubstitutionRef
Target
Unimplemented
Comment
Fig
RefInfo
ListItem
Signature
NumpydocExample
NumpydocSeeAlso
NumpydocSignature
Section
Parameters
Param
Token
Code3
CodeLine
Code2
GenToken
Code
BlockQuote
Transition
Paragraph
Admonition
TocTree
BlockDirective
BlockVerbatim
Options
FieldList
FieldListItem
DefList
DefListItem
SeeAlsoItem