BDS

API Reference

Python API Reference

Reference for Battery Data Standard Python APIs for reading, converting, validating, auditing, diagnostics, reports, and export targets.

Python API Reference

Stable high-level entry points are exported from battery_data_standard and from the short alias package bds.

import battery_data_standard as bds
# or
import bds

The public API returns Polars dataframes and report objects. Report objects provide to_dict() and to_json() methods for serialization.

Time-Series Conversion

read

read(
    path,
    cycler=None,
    profile=None,
    strict=True,
    keep_raw=False,
    current_sign="charge-positive",
    repair_policy="warn",
    time_sampling_policy="repair",
    time_sampling_interval_s=None,
    time_sampling_interpolation="linear",
    time_sampling_tolerance=0.1,
    detection_threshold=0.1,
    current_sign_check="none",
    sheet=None,
)

Reads a supported cycler export into a normalized Polars dataframe.

Use cycler="auto" or cycler=None for automatic detection. Use an explicit cycler id such as neware, arbin, maccor, biologic, novonix, basytec, landt, or generic when the source system is known.

read() returns a lower-level normalized dataframe for package internals and advanced users. For public BDS handoff, display, or saved files, convert it to the export template:

from battery_data_standard.export import to_export_frame
 
export_df = to_export_frame(df)

For real experimental datasets, preserve the raw instrument current sign and repair documented time-axis issues with:

df = bds.read(
    path,
    cycler="auto",
    current_sign="preserve",
    repair_policy="repair",
)

current_sign="preserve" keeps the source file's current sign convention. repair_policy="repair" applies documented repairs such as shifting elapsed test time to start at zero.

time_sampling_policy="repair" checks regular test_time_s sampling and inserts missing samples only when a fixed interval is detected or supplied. Use time_sampling_policy="warn" to report missing time points without insertion, or set time_sampling_interval_s=1, 2, 10, or another protocol interval. The default interpolation method is time_sampling_interpolation="linear".

current_sign_check="adjacent" runs an optional conservative O(n) sanity check that compares adjacent voltage changes with the current direction when current_sign is charge-positive or discharge-positive. Use current_sign_check="none" to skip this check. The default is none so large files and heuristic-free workflows do not pay the extra scan cost unless they ask for it.

read_with_report

read_with_report(path, ...)

Returns (dataframe, ConversionReport). This is the recommended entry point for automated pipelines that need conversion warnings, provenance, adapter metadata, and validation details.

convert

convert(
    input_path,
    output_path,
    format="csv",
    cycler=None,
    profile=None,
    metadata=None,
    strict=True,
    keep_raw=False,
    current_sign="charge-positive",
    repair_policy="warn",
    time_sampling_policy="repair",
    time_sampling_interval_s=None,
    time_sampling_interpolation="linear",
    time_sampling_tolerance=0.1,
    detection_threshold=0.1,
    current_sign_check="none",
    report_path=None,
    report_formats=None,
    write_sidecars=False,
    sheet=None,
    target="bds",
)

Converts a supported time-series export and writes CSV or Parquet output. The function returns ConversionReport.

format must be csv or parquet. Use report_path="auto" for the standard user workflow: the converted data file is written together with JSON and PDF reports named from the output stem, for example normalized.bds.report.json and normalized.bds.report.pdf.

Use report_formats=("html", "xlsx") with report_path="auto" to add review formats while keeping the default JSON and PDF reports. Passing a single report filename writes the format implied by the suffix, such as report.json, report.html, report.xlsx, or report.pdf. If write_sidecars=True, report and metadata sidecars are written next to the output.

target selects the output schema preset. The default bds target writes the standard normalized table. Other targets write staging tables for downstream tools:

Target Output shape
bds Standard BDS export columns such as Test Time (s), Voltage (V), and Current (A).
bdf Legacy BDF-style export with slash-unit column names; not a formal conformance certificate.
duckdb Same standard table, recommended with Parquet.
polars Same standard table, recommended with Parquet.
battery-archive Same standard table for archive-style packaging, recommended with Parquet.
cellpy cellpy-like lower-case staging columns.
beep BEEP-like lower-case staging columns.
pybamm Drive-cycle staging table with time_s and current_a.
pyprobe Diagnostic staging table with time_s, voltage_v, current_a, and optional cycle/step fields.

Available targets are discoverable with:

bds.list_export_targets()

EIS Conversion

read_eis

read_eis(path, sheet=None)

Reads an EIS file into the standardized EIS table.

convert_eis

convert_eis(input_path, output_path, format="csv", sheet=None)

Converts an EIS file and writes standardized CSV or Parquet output.

validate_eis

validate_eis(dataframe)

Validates an in-memory standardized EIS dataframe and returns ValidationReport.

Batch Conversion

batch_convert

batch_convert(
    input_dir,
    output_dir,
    recursive=False,
    manifest_path=None,
    fail_fast=False,
    format="csv",
    cycler="auto",
    profile=None,
    metadata=None,
    strict=True,
    keep_raw=False,
    current_sign="charge-positive",
    repair_policy="warn",
    time_sampling_policy="repair",
    time_sampling_interval_s=None,
    time_sampling_interpolation="linear",
    time_sampling_tolerance=0.1,
    detection_threshold=0.1,
    current_sign_check="none",
    write_sidecars=False,
    sheet=None,
    excel_sheets="auto",
    target="bds",
)

Converts a directory, a single file, or a supported archive. The function returns a list of per-file records. If manifest_path is provided, records are also written as JSONL.

excel_sheets controls workbook handling in batch mode:

Value Behavior
auto Let the adapter select the relevant sheet or sheet group.
first Process only the first workbook sheet.
all Process each workbook sheet independently.
name Process the sheet passed with sheet; sheet is required.

Archives are expanded into temporary storage. Supported archive suffixes are .zip, .tar, .tar.gz, and .tgz.

Intake Audit

doctor

report = bds.doctor("raw_export.csv", cycler="auto")

Returns DoctorReport for one file without writing converted data. The report focuses on troubleshooting: data kind, adapter candidates, selected adapter, missing required columns, validation issues, warnings, unmapped columns, suspicious headers, suggested next steps, and the minimum anonymized fixture checklist.

The equivalent CLI is:

bds doctor raw_export.csv
bds doctor raw_export.csv --json

explain

report = bds.explain(
    "raw_export.csv",
    cycler="auto",
    current_sign="preserve",
    repair_policy="warn",
)

Returns ExplainReport for one file without writing converted data. The report includes data-kind detection, adapter candidates, selected adapter, confidence, sheet, source columns, canonical/export column mapping, unit transforms, current-sign evidence, repair policy, validation issues, warnings, unmapped columns, and a recommended next action.

The equivalent CLI is:

bds explain raw_export.csv --text
bds explain raw_export.csv --json report.json --html report.html --xlsx report.xlsx

Python callers can write the same formatted diagnostic reports:

bds.write_explain_reports(report, "reports", formats=("json", "html", "xlsx"))

PDF output is also supported.

audit

from battery_data_standard.audit import audit
 
report = audit(
    "raw_exports",
    recursive=True,
    json_path="audit.json",
    html_path="audit.html",
)

Audits raw files without writing converted data outputs. The report includes file-level status, data kind, cycler, detection confidence, quality score, quality grade, missing required columns, unit conversions, repair operations, current-sign evidence, duplicate timestamps, non-monotonic time, suspicious flat voltage/current checks, cycle/step anomaly checks, and errors.

Directory audit skips obvious helper files such as README files, manifests, metadata/report sidecars, labels, summaries, and procedure files. Optional columns are reported under completeness; missing optional columns are not quality-score penalties.

The equivalent CLI is:

bds audit raw_exports --recursive --json audit.json --html audit.html

audit_file

from battery_data_standard.audit import audit_file
 
record = audit_file("raw_export.csv")

Returns one AuditRecord for a single file.

Detection and Metadata

detect

detect(path)

Returns DetectionResult with the selected cycler, confidence score, reason, candidate list, and path.

detect_kind

detect_kind(path, sheet=None)

Returns DataKindResult for operational routing. Possible kinds include timeseries, eis, unsupported, and unknown.

list_supported_formats

list_supported_formats()

Returns adapter metadata including cycler id, display name, support tier, evidence tier, extensions, unsupported extensions, and adapter version.

group_neware_files

group_neware_files(paths)

Groups NEWARE record exports by file content when a single test is split across multiple files.

convert_neware_groups

convert_neware_groups(paths, output_dir, ...)

Converts grouped NEWARE record exports into one output per grouped test.

Validation

validate

validate(dataframe, schema_version=..., strict=True)

Validates an in-memory normalized dataframe and returns ValidationReport.

validate_file

from battery_data_standard.api import validate_file
 
validate_file(path, schema_version=..., strict=True)

Validates an existing normalized CSV, Excel, or Parquet file on disk. This helper is available from battery_data_standard.api.

Reports

ConversionReport

ConversionReport includes:

  • input_path and output_path;
  • cycler, adapter_version, support_tier, evidence_tier, and detection_confidence;
  • schema_version, rows, and columns;
  • validation, a ValidationReport;
  • warnings, provenance, and metadata;
  • source details such as encoding, delimiter, header_row, sheet_name, and raw_rows;
  • repair_operations and unmapped_columns;
  • current_sign.

Time-sampling findings are stored in metadata["time_sampling"] when the time-series path is used. The record includes policy, expected interval, interval confidence, missing sample count, gap locations, interpolation method, and inserted row count when repair is applied.

Current-sign and step/cycle sanity findings are stored in metadata["current_sign_sanity"], metadata["current_sign_confidence"], metadata["semantic_sources"], and metadata["step_cycle_semantics"]. Temperature semantic findings are stored in metadata["temperature_semantics"] and metadata["temperature_semantics_confidence"]. These records are conservative diagnostics; they warn about trust-affecting ambiguity but do not automatically change scientific semantics.

ValidationReport

ValidationReport includes:

  • valid;
  • schema_version;
  • rows;
  • columns;
  • issues.

Each issue includes level, code, message, and an optional column. Production pipelines should branch on valid and issue code values rather than parsing free-text messages.

DetectionResult

DetectionResult includes:

  • cycler;
  • confidence;
  • reason;
  • candidates;
  • path.

ExplainReport

ExplainReport includes:

  • status;
  • data_kind;
  • detection;
  • selected_adapter and confidence;
  • sheet;
  • source_columns, canonical_columns, and export_columns;
  • column_mapping and unit_transforms;
  • current_sign and current_sign_evidence;
  • repair_policy;
  • validation, warnings, unmapped_columns, and time_sampling;
  • recommended_next_action;
  • error_type and error when diagnostics cannot complete conversion.

Batch Records

batch_convert() and bds batch use these record semantics:

Status Record type Meaning
ok converted A time-series or EIS file was converted.
unsupported skipped The file was identified as unsupported or non-raw helper content.
error error Conversion was attempted and failed.

Common fields include:

Field Meaning
input_path Path used by the converter. For archive members this is the temporary extracted path.
output_path Written output path for converted records; null for skipped records.
archive_path Source archive path when the record came from an archive.
archive_member Original member name inside an archive.
sheet_name Workbook sheet used for the record, or null.
data_kind Detected operational kind.
kind_confidence Confidence score from detect_kind().
kind_reason Human-readable reason from detect_kind().
record_type converted, skipped, or error.

Converted time-series records include serialized ConversionReport fields. EIS records include validation details, row count, and columns. Skipped records include skip_reason. Error records include error_type and error.

Exceptions

Public API callers should catch BatteryDataStandardError or one of its subclasses:

  • DetectionError
  • AmbiguousDetectionError
  • UnsupportedFormatError
  • UnsupportedFeatureError
  • ConversionError
  • FileIOError
  • ValidationFailed

ValidationFailed carries the validation report that caused strict validation to fail.