Searchability

The Second Pillar of Open Source Research

Searchability means that research must be fully machine-readable, structurally explicit, and parseable.

1. Machine-Readable Structure

Documents must be encoded in a structured canonical format: the Pandoc JSON AST, which is the JSON serialization of Pandoc's abstract syntax tree. All sections, tables, figures, citations, and equations must be explicitly defined as elements of that structure.
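
As a minimal sketch of what explicit structure makes possible, the Python snippet below walks a small hand-written document laid out like a Pandoc JSON AST and lists its section headings. The sample document and the helper names (inline_text, list_headers) are illustrative assumptions. A real AST would come from running pandoc with JSON output on a source file.

```python
import json

# Hand-written sample in the Pandoc JSON AST layout (illustrative, not real output).
SAMPLE_AST = json.loads("""
{
  "pandoc-api-version": [1, 23, 1],
  "meta": {},
  "blocks": [
    {"t": "Header", "c": [1, ["searchability", [], []],
                          [{"t": "Str", "c": "Searchability"}]]},
    {"t": "Para", "c": [{"t": "Str", "c": "Research"}, {"t": "Space"},
                        {"t": "Str", "c": "text."}]},
    {"t": "Header", "c": [2, ["structured-objects", [], []],
                          [{"t": "Str", "c": "Structured"}, {"t": "Space"},
                           {"t": "Str", "c": "Objects"}]]}
  ]
}
""")


def inline_text(inlines):
    """Flatten Str and Space inlines to plain text (other inline types are skipped)."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def list_headers(ast):
    """Return (level, identifier, title) for every top-level Header block."""
    headers = []
    for block in ast["blocks"]:
        if block["t"] == "Header":
            # A Header's content is [level, [id, classes, key-value pairs], inlines].
            level, (identifier, _classes, _attrs), inlines = block["c"]
            headers.append((level, identifier, inline_text(inlines)))
    return headers


for level, identifier, title in list_headers(SAMPLE_AST):
    print(f"{'#' * level} {title}  [id: {identifier}]")
```

Because every heading is a typed node with its own identifier, the outline is read directly from the data rather than guessed from font sizes or layout.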

2. Structured Objects

Tables and images must not exist only as visual renderings; they must also exist as structured data.

  • Tables encoded natively or imported from CSV
  • Figures generated from data where possible
  • SVG preferred over raster images
  • All objects assigned unique identifiers
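
The first and last points above could be sketched as follows: a table imported from CSV becomes a structured object that carries its own unique identifier. The CSV content, the identifier "tbl-tree-counts", and the function name table_from_csv are hypothetical and chosen only for illustration.

```python
import csv
import io

# Illustrative CSV content; in practice this would be read from a data file.
CSV_TEXT = """species,count
oak,12
pine,30
"""


def table_from_csv(text, table_id, caption):
    """Parse CSV text into a structured table object with a unique identifier."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return {
        "id": table_id,                # unique identifier for cross-referencing
        "caption": caption,
        "columns": reader.fieldnames,  # column names taken from the CSV header row
        "rows": rows,                  # each row is a dict keyed by column name
    }


table = table_from_csv(CSV_TEXT, "tbl-tree-counts", "Tree counts by species")
print(table["columns"])
print(table["rows"][1]["count"])
```

Note that the csv module returns every cell as a string; converting numeric columns is left out of this sketch.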

3. JSON Descriptions

All tables, figures, and multimedia elements must include structured JSON descriptions whose level of detail is proportional to the complexity of the object. These descriptions enable:

  • Text-to-speech systems
  • Knowledge graph extraction
  • Advanced indexing
  • Computational scholarship
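
To make this concrete, here is one possible shape for such a description, together with a small function that turns it into a sentence a text-to-speech system could read. The schema is an assumption: the field names (object_id, summary, axes, source_data), the sample values, and the file path are all hypothetical, since no particular schema is prescribed here.

```python
import json

# Hypothetical description schema; field names and values are illustrative only.
description = {
    "object_id": "fig-growth-curve",
    "type": "figure",
    "summary": "Line chart of sample size over time.",
    "axes": {
        "x": {"label": "Year", "unit": "calendar year"},
        "y": {"label": "Sample size", "unit": "participants"},
    },
    "source_data": "data/growth.csv",  # hypothetical path to the underlying data
}


def spoken_summary(desc):
    """Build a sentence that a text-to-speech system could read aloud."""
    x, y = desc["axes"]["x"], desc["axes"]["y"]
    return (
        f"{desc['type'].capitalize()} {desc['object_id']}: {desc['summary']} "
        f"The x-axis shows {x['label']}; the y-axis shows {y['label']}."
    )


print(json.dumps(description, indent=2))
print(spoken_summary(description))
```

A simple table might need only an identifier and a one-line summary, while a multi-panel figure would carry more fields; that is the sense in which the description scales with complexity.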

4. Object-Level Addressability

Each structural unit of a document must be individually identifiable and referenceable. This allows:

  • Precise quotation
  • Programmatic extraction
  • Version comparison
  • Deep scholarly analysis
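
Programmatic extraction and version comparison could be sketched as below, assuming each structural unit is a JSON-like object with an "id" field. The function names (index_by_id, fingerprint, changed_ids) and the two sample versions are hypothetical. Hashing a canonical serialization is one simple way to detect changes, not the only one.

```python
import hashlib
import json


def index_by_id(objects):
    """Map each object's identifier to the object, rejecting duplicates."""
    index = {}
    for obj in objects:
        if obj["id"] in index:
            raise ValueError(f"duplicate identifier: {obj['id']}")
        index[obj["id"]] = obj
    return index


def fingerprint(obj):
    """Hash a canonical JSON serialization so versions can be compared."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_ids(old_objects, new_objects):
    """Return identifiers whose content differs or that exist in only one version."""
    old, new = index_by_id(old_objects), index_by_id(new_objects)
    return sorted(
        key for key in old.keys() | new.keys()
        if key not in old or key not in new
        or fingerprint(old[key]) != fingerprint(new[key])
    )


# Two illustrative versions of the same document's objects.
v1 = [{"id": "sec-intro", "text": "Draft."}, {"id": "tbl-1", "rows": [[1, 2]]}]
v2 = [{"id": "sec-intro", "text": "Revised."}, {"id": "tbl-1", "rows": [[1, 2]]}]

print(index_by_id(v2)["sec-intro"]["text"])  # programmatic extraction by identifier
print(changed_ids(v1, v2))                   # version comparison at object level
```

Because the comparison runs per identifier, a revision to one section is reported without flagging the unchanged table next to it.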

Searchability ensures that research is not merely readable by humans but also computationally interpretable.

Plain text is king

Plain text is the gold standard