API Reference
Python API Reference
Reference for Battery Data Standard Python APIs for reading, converting, validating, auditing, diagnostics, reports, and export targets.
Python API Reference
Stable high-level entry points are exported from battery_data_standard and from
the short alias package bds.
import battery_data_standard as bds
# or
import bdsThe public API returns Polars dataframes and report objects. Report objects
provide to_dict() and to_json() methods for serialization.
Time-Series Conversion
read
read(
path,
cycler=None,
profile=None,
strict=True,
keep_raw=False,
current_sign="charge-positive",
repair_policy="warn",
time_sampling_policy="repair",
time_sampling_interval_s=None,
time_sampling_interpolation="linear",
time_sampling_tolerance=0.1,
detection_threshold=0.1,
current_sign_check="none",
sheet=None,
)Reads a supported cycler export into a normalized Polars dataframe.
Use cycler="auto" or cycler=None for automatic detection. Use an explicit
cycler id such as neware, arbin, maccor, biologic, novonix,
basytec, landt, or generic when the source system is known.
read() returns a lower-level normalized dataframe for package internals and
advanced users. For public BDS handoff, display, or saved files, convert it to
the export template:
from battery_data_standard.export import to_export_frame
export_df = to_export_frame(df)For real experimental datasets, preserve the raw instrument current sign and repair documented time-axis issues with:
df = bds.read(
path,
cycler="auto",
current_sign="preserve",
repair_policy="repair",
)current_sign="preserve" keeps the source file's current sign convention.
repair_policy="repair" applies documented repairs such as shifting elapsed
test time to start at zero.
time_sampling_policy="repair" checks regular test_time_s sampling and
inserts missing samples only when a fixed interval is detected or supplied. Use
time_sampling_policy="warn" to report missing time points without insertion,
or set time_sampling_interval_s=1, 2, 10, or another protocol interval.
The default interpolation method is time_sampling_interpolation="linear".
current_sign_check="adjacent" runs an optional conservative O(n) sanity check that
compares adjacent voltage changes with the current direction when
current_sign is charge-positive or discharge-positive. Use
current_sign_check="none" to skip this check. The default is none so large
files and heuristic-free workflows do not pay the extra scan cost unless they
ask for it.
read_with_report
read_with_report(path, ...)Returns (dataframe, ConversionReport). This is the recommended entry point for
automated pipelines that need conversion warnings, provenance, adapter metadata,
and validation details.
convert
convert(
input_path,
output_path,
format="csv",
cycler=None,
profile=None,
metadata=None,
strict=True,
keep_raw=False,
current_sign="charge-positive",
repair_policy="warn",
time_sampling_policy="repair",
time_sampling_interval_s=None,
time_sampling_interpolation="linear",
time_sampling_tolerance=0.1,
detection_threshold=0.1,
current_sign_check="none",
report_path=None,
report_formats=None,
write_sidecars=False,
sheet=None,
target="bds",
)Converts a supported time-series export and writes CSV or Parquet output. The
function returns ConversionReport.
format must be csv or parquet. Use report_path="auto" for the standard
user workflow: the converted data file is written together with JSON and PDF
reports named from the output stem, for example
normalized.bds.report.json and normalized.bds.report.pdf.
Use report_formats=("html", "xlsx") with report_path="auto" to add review
formats while keeping the default JSON and PDF reports. Passing a single report
filename writes the format implied by the suffix, such as report.json,
report.html, report.xlsx, or report.pdf. If write_sidecars=True, report
and metadata sidecars are written next to the output.
target selects the output schema preset. The default bds target writes the
standard normalized table. Other targets write staging tables for downstream
tools:
| Target | Output shape |
|---|---|
bds |
Standard BDS export columns such as Test Time (s), Voltage (V), and Current (A). |
bdf |
Legacy BDF-style export with slash-unit column names; not a formal conformance certificate. |
duckdb |
Same standard table, recommended with Parquet. |
polars |
Same standard table, recommended with Parquet. |
battery-archive |
Same standard table for archive-style packaging, recommended with Parquet. |
cellpy |
cellpy-like lower-case staging columns. |
beep |
BEEP-like lower-case staging columns. |
pybamm |
Drive-cycle staging table with time_s and current_a. |
pyprobe |
Diagnostic staging table with time_s, voltage_v, current_a, and optional cycle/step fields. |
Available targets are discoverable with:
bds.list_export_targets()EIS Conversion
read_eis
read_eis(path, sheet=None)Reads an EIS file into the standardized EIS table.
convert_eis
convert_eis(input_path, output_path, format="csv", sheet=None)Converts an EIS file and writes standardized CSV or Parquet output.
validate_eis
validate_eis(dataframe)Validates an in-memory standardized EIS dataframe and returns
ValidationReport.
Batch Conversion
batch_convert
batch_convert(
input_dir,
output_dir,
recursive=False,
manifest_path=None,
fail_fast=False,
format="csv",
cycler="auto",
profile=None,
metadata=None,
strict=True,
keep_raw=False,
current_sign="charge-positive",
repair_policy="warn",
time_sampling_policy="repair",
time_sampling_interval_s=None,
time_sampling_interpolation="linear",
time_sampling_tolerance=0.1,
detection_threshold=0.1,
current_sign_check="none",
write_sidecars=False,
sheet=None,
excel_sheets="auto",
target="bds",
)Converts a directory, a single file, or a supported archive. The function returns
a list of per-file records. If manifest_path is provided, records are also
written as JSONL.
excel_sheets controls workbook handling in batch mode:
| Value | Behavior |
|---|---|
auto |
Let the adapter select the relevant sheet or sheet group. |
first |
Process only the first workbook sheet. |
all |
Process each workbook sheet independently. |
name |
Process the sheet passed with sheet; sheet is required. |
Archives are expanded into temporary storage. Supported archive suffixes are
.zip, .tar, .tar.gz, and .tgz.
Intake Audit
doctor
report = bds.doctor("raw_export.csv", cycler="auto")Returns DoctorReport for one file without writing converted data. The report
focuses on troubleshooting: data kind, adapter candidates, selected adapter,
missing required columns, validation issues, warnings, unmapped columns,
suspicious headers, suggested next steps, and the minimum anonymized fixture
checklist.
The equivalent CLI is:
bds doctor raw_export.csv
bds doctor raw_export.csv --jsonexplain
report = bds.explain(
"raw_export.csv",
cycler="auto",
current_sign="preserve",
repair_policy="warn",
)Returns ExplainReport for one file without writing converted data. The report
includes data-kind detection, adapter candidates, selected adapter, confidence,
sheet, source columns, canonical/export column mapping, unit transforms,
current-sign evidence, repair policy, validation issues, warnings, unmapped
columns, and a recommended next action.
The equivalent CLI is:
bds explain raw_export.csv --text
bds explain raw_export.csv --json report.json --html report.html --xlsx report.xlsxPython callers can write the same formatted diagnostic reports:
bds.write_explain_reports(report, "reports", formats=("json", "html", "xlsx"))PDF output is also supported.
audit
from battery_data_standard.audit import audit
report = audit(
"raw_exports",
recursive=True,
json_path="audit.json",
html_path="audit.html",
)Audits raw files without writing converted data outputs. The report includes file-level status, data kind, cycler, detection confidence, quality score, quality grade, missing required columns, unit conversions, repair operations, current-sign evidence, duplicate timestamps, non-monotonic time, suspicious flat voltage/current checks, cycle/step anomaly checks, and errors.
Directory audit skips obvious helper files such as README files, manifests,
metadata/report sidecars, labels, summaries, and procedure files. Optional
columns are reported under completeness; missing optional columns are not
quality-score penalties.
The equivalent CLI is:
bds audit raw_exports --recursive --json audit.json --html audit.htmlaudit_file
from battery_data_standard.audit import audit_file
record = audit_file("raw_export.csv")Returns one AuditRecord for a single file.
Detection and Metadata
detect
detect(path)Returns DetectionResult with the selected cycler, confidence score, reason,
candidate list, and path.
detect_kind
detect_kind(path, sheet=None)Returns DataKindResult for operational routing. Possible kinds include
timeseries, eis, unsupported, and unknown.
list_supported_formats
list_supported_formats()Returns adapter metadata including cycler id, display name, support tier, evidence tier, extensions, unsupported extensions, and adapter version.
group_neware_files
group_neware_files(paths)Groups NEWARE record exports by file content when a single test is split across multiple files.
convert_neware_groups
convert_neware_groups(paths, output_dir, ...)Converts grouped NEWARE record exports into one output per grouped test.
Validation
validate
validate(dataframe, schema_version=..., strict=True)Validates an in-memory normalized dataframe and returns ValidationReport.
validate_file
from battery_data_standard.api import validate_file
validate_file(path, schema_version=..., strict=True)Validates an existing normalized CSV, Excel, or Parquet file on disk. This helper
is available from battery_data_standard.api.
Reports
ConversionReport
ConversionReport includes:
input_pathandoutput_path;cycler,adapter_version,support_tier,evidence_tier, anddetection_confidence;schema_version,rows, andcolumns;validation, aValidationReport;warnings,provenance, andmetadata;- source details such as
encoding,delimiter,header_row,sheet_name, andraw_rows; repair_operationsandunmapped_columns;current_sign.
Time-sampling findings are stored in metadata["time_sampling"] when the
time-series path is used. The record includes policy, expected interval,
interval confidence, missing sample count, gap locations, interpolation method,
and inserted row count when repair is applied.
Current-sign and step/cycle sanity findings are stored in
metadata["current_sign_sanity"], metadata["current_sign_confidence"],
metadata["semantic_sources"], and metadata["step_cycle_semantics"].
Temperature semantic findings are stored in metadata["temperature_semantics"]
and metadata["temperature_semantics_confidence"]. These
records are conservative diagnostics; they warn about trust-affecting ambiguity
but do not automatically change scientific semantics.
ValidationReport
ValidationReport includes:
valid;schema_version;rows;columns;issues.
Each issue includes level, code, message, and an optional column.
Production pipelines should branch on valid and issue code values rather
than parsing free-text messages.
DetectionResult
DetectionResult includes:
cycler;confidence;reason;candidates;path.
ExplainReport
ExplainReport includes:
status;data_kind;detection;selected_adapterandconfidence;sheet;source_columns,canonical_columns, andexport_columns;column_mappingandunit_transforms;current_signandcurrent_sign_evidence;repair_policy;validation,warnings,unmapped_columns, andtime_sampling;recommended_next_action;error_typeanderrorwhen diagnostics cannot complete conversion.
Batch Records
batch_convert() and bds batch use these record semantics:
| Status | Record type | Meaning |
|---|---|---|
ok |
converted |
A time-series or EIS file was converted. |
unsupported |
skipped |
The file was identified as unsupported or non-raw helper content. |
error |
error |
Conversion was attempted and failed. |
Common fields include:
| Field | Meaning |
|---|---|
input_path |
Path used by the converter. For archive members this is the temporary extracted path. |
output_path |
Written output path for converted records; null for skipped records. |
archive_path |
Source archive path when the record came from an archive. |
archive_member |
Original member name inside an archive. |
sheet_name |
Workbook sheet used for the record, or null. |
data_kind |
Detected operational kind. |
kind_confidence |
Confidence score from detect_kind(). |
kind_reason |
Human-readable reason from detect_kind(). |
record_type |
converted, skipped, or error. |
Converted time-series records include serialized ConversionReport fields. EIS records
include validation details, row count, and columns. Skipped records include
skip_reason. Error records include error_type and error.
Exceptions
Public API callers should catch BatteryDataStandardError or one of its
subclasses:
DetectionErrorAmbiguousDetectionErrorUnsupportedFormatErrorUnsupportedFeatureErrorConversionErrorFileIOErrorValidationFailed
ValidationFailed carries the validation report that caused strict validation to
fail.