Command-Line Interface
FairML Datasets provides a command-line interface (CLI) for common operations on fairness datasets. This page documents the available commands and their options.
Overview
The CLI is accessible via the fairml_datasets module:
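A minimal sketch of the entry point, assuming the top-level command accepts the same --help flag that the subcommands below document. The snippet prints the command and runs it only when the package is importable, so it is safe to paste into any shell.

```shell
# Assumption: the top-level command supports --help like its subcommands.
cmd="python -m fairml_datasets --help"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```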
Available Commands
python -m fairml_datasets
Command-line interface for the fairml datasets package.
Usage:
Options:
export-citations
Export dataset citations as a .bib file.
This command collects all citations from either all datasets or the specified datasets and exports them to a .bib file, ensuring duplicate citations are only included once.
Usage:
Options:
-o, --output TEXT  Output file for the citations in .bib format.
--ids TEXT  Comma-separated list of dataset IDs to export citations for. If not provided, exports citations for all datasets.
--help  Show this message and exit.
export-datasets
Export datasets as files.
Usage:
Options:
-s, --stage [downloaded|loaded|prepared|binarized|transformed|split]  At which stage of processing to export the data.
--id TEXT  Export only a single dataset.
--inclue-large-datasets  Include large datasets in the export.
--include-usage-info  Whether to also export information regarding the role of different columns, e.g. which ones are features, sensitive and target.
--help  Show this message and exit.
metadata
Generate and save metadata for the datasets.
Usage:
Options:
-f, --file TEXT  Which file to write the metadata to, the ending will determine the format (csv and json supported).
--id TEXT  Generate metadata for only a single dataset.
--inclue-large-datasets  Include large datasets in the metadata generation (only used if descriptives are computed).
--type [annotations|descriptives|all]  Type of metadata to generate (annotations, descriptives, or both).
--help  Show this message and exit.
Examples
Generating Metadata
Generate and save metadata for all datasets:
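One way this could look, using only the --file option listed for the metadata command. The file name metadata.csv is a placeholder; per the option description, the .csv ending selects CSV output.

```shell
# metadata.csv is a placeholder output name, not a documented default.
cmd="python -m fairml_datasets metadata --file metadata.csv"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```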
Export metadata in JSON format:
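A sketch of the JSON variant, relying on the documented behavior that the file ending determines the format. The name metadata.json is a placeholder.

```shell
# The .json ending is what selects JSON output; the name itself is arbitrary.
cmd="python -m fairml_datasets metadata --file metadata.json"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```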
Generate metadata for a specific dataset:
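A possible invocation that combines --id with --file. The value your_dataset_id is a hypothetical placeholder, since this page does not list dataset IDs; replace it with a real one.

```shell
# DATASET_ID is a placeholder; substitute an ID that exists in the package.
DATASET_ID="your_dataset_id"
cmd="python -m fairml_datasets metadata --id $DATASET_ID --file metadata.csv"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```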
Exporting Datasets
Export all datasets in prepared format:
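A minimal sketch, assuming "prepared format" corresponds to the prepared choice of the --stage option documented for export-datasets.

```shell
# "prepared" is one of the documented --stage choices.
cmd="python -m fairml_datasets export-datasets --stage prepared"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```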
Export a specific dataset with train/test/validation splits:
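A sketch under the assumption that the split stage is the one that produces the train/test/validation splits. The dataset ID is again a hypothetical placeholder.

```shell
# DATASET_ID is a placeholder; "split" is one of the documented --stage choices.
DATASET_ID="your_dataset_id"
cmd="python -m fairml_datasets export-datasets --id $DATASET_ID --stage split"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```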
Include usage information:
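One way to request the column-role information is to add --include-usage-info. Passing --stage prepared alongside it is an illustrative choice here, not a documented requirement.

```shell
# --include-usage-info also exports which columns are features, sensitive and target.
cmd="python -m fairml_datasets export-datasets --stage prepared --include-usage-info"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```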
Exporting Citations
Export citations for all datasets:
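A minimal sketch using the --output option of export-citations. The file name citations.bib is a placeholder; omitting --ids exports citations for all datasets, as the option description states.

```shell
# citations.bib is a placeholder output name.
cmd="python -m fairml_datasets export-citations --output citations.bib"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```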
Export citations for specific datasets:
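A sketch of the restricted export, passing a comma-separated list to --ids. The IDs first_dataset_id and second_dataset_id are hypothetical placeholders.

```shell
# IDS holds placeholder dataset IDs, comma-separated with no spaces.
IDS="first_dataset_id,second_dataset_id"
cmd="python -m fairml_datasets export-citations --ids $IDS --output citations.bib"
echo "$cmd"
# Execute only if the package is installed in the current environment.
if python -c "import fairml_datasets" 2>/dev/null; then $cmd; fi
```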