datatractor/yard

Datatractor Yard: Metadata Extractor Registry

A place to develop and discuss the Datatractor Yard (formerly the MaRDA Extractors WG registry). The idea is to collect various file formats used in materials science and chemistry, describe them with metadata, and provide links to software projects that can parse them.

We hope that providing this data through a web API will help users discover new extractors more easily, and will allow metadata standards to be developed for extractor output so that shared schemas can spread throughout the field.

The state of the main branch is deployed to https://yard.datatractor.org/, with API docs (and built-in client) accessible at https://yard.datatractor.org/redoc.
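
As a rough illustration of querying the deployed registry from a script, the sketch below builds request URLs and fetches JSON using only the Python standard library. The `api/<resource>` path layout and the resource names `extractors` and `filetypes` are assumptions made for this example and are not stated in this README; please check them against the API docs at https://yard.datatractor.org/redoc before relying on them. The helper names `build_url` and `fetch_json` are hypothetical.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen

# Deployment URL taken from the README text above.
BASE_URL = "https://yard.datatractor.org/"


def build_url(resource, entry_id=None, base_url=BASE_URL):
    """Build a registry URL of the form <base>/api/<resource>[/<entry_id>].

    ASSUMPTION: the "api/<resource>" layout is illustrative only;
    confirm the real routes in the /redoc documentation.
    """
    path = "api/" + quote(resource, safe="")
    if entry_id is not None:
        # Percent-encode the ID so spaces or slashes cannot alter the path.
        path += "/" + quote(entry_id, safe="")
    return urljoin(base_url, path)


def fetch_json(url, timeout=10.0):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (requires network access, so it is left commented out):
# entries = fetch_json(build_url("extractors"))
# print(json.dumps(entries, indent=2)[:500])
```

Keeping URL construction separate from the network call means the routes can be corrected in one place once they have been verified against the published API docs.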

For more information, see the paper:

Datatractor: Metadata, automation, and registries for extractor interoperability in the chemical and materials sciences
Matthew L. Evans, Gian-Marco Rignanese, David Elbert & Peter Kraus
MRS Bulletin, 50, pp838–845 (2025) DOI: 10.1557/s43577-025-00925-8
(preprint arXiv:2410.18839)

Contributing

You are welcome to contribute file type and extractor entries to this registry by opening a pull request. Please see the contributing guidelines for detailed steps. After you submit a pull request, the data will be validated, and it will be added to the deployed database once the pull request is merged.

Development

Instructions for developing the registry itself can be found in the development guide.

Registry Maintainers