QuantLens

Why JSONL

Streamable

Each line is a self‑contained JSON record, so you can consume one record at a time without loading the entire file. Ideal for large‑scale training and ETL.

Integrity & Versioning

Each pack ships with a JSON Schema, a rule set, a SHA‑256 manifest, and an optional Merkle root.

Training‑Ready

Stable schema versions and contracts ensure reproducible pipelines and minimal parsing overhead.

Samples

SEC Earnings (JSONL)
sec_earnings_sample.jsonl
3 records • transcripts + metrics
Patent Citations (JSONL)
patent_citations_sample.jsonl
3 records • citation pairs

Read JSONL

Python

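import json

# Stream the file line by line so only one record is in memory at a time.
with open('sec_earnings_sample.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, e.g. an empty line at end of file
        rec = json.loads(line)
        # use rec["transcript"], rec["metrics"], ...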

Node.js

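import fs from 'node:fs'
import readline from 'node:readline'
// Top-level await requires an ES module (.mjs or "type": "module").
const rl = readline.createInterface({
  input: fs.createReadStream('sec_earnings_sample.jsonl', { encoding: 'utf8' }),
  crlfDelay: Infinity, // treat \r\n as a single line break
})
for await (const line of rl) {
  if (!line.trim()) continue // skip blank lines
  const rec = JSON.parse(line)
  // use rec.transcript, rec.metrics, ...
}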

Schema & Integrity

Each pack publishes a JSON Schema and a rule set. Integrity is verified with a SHA‑256 digest per file and an optional Merkle manifest.

Schema (SEC Earnings)
schema.json
JSON Schema
Expectations (SEC Earnings)
expectations.json
Rule set
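
As an illustration of the integrity checks described above, the sketch below computes a per-file SHA‑256 digest and folds a list of digests into a Merkle root. The manifest layout and the exact Merkle construction are not specified here, so the pairing rule in the code (SHA‑256 over the concatenated child digests, duplicating the last node on odd levels) is an assumption, and the function names are hypothetical.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def merkle_root(hex_digests: list[str]) -> str:
    """Fold leaf digests pairwise into a single root digest.

    Assumed construction (not taken from the pack format):
    parent = SHA-256(left || right) over raw digest bytes, with the last
    node duplicated whenever a level has an odd number of nodes.
    """
    if not hex_digests:
        raise ValueError("need at least one digest")
    level = [bytes.fromhex(d) for d in hex_digests]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

To verify a download, compare sha256_file(path) with the digest listed for that file. If a pack's Merkle manifest uses a different leaf order or pairing rule, merkle_root would need to be adapted to match it.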