Contributing
Thank you for your interest in contributing to the sgb-data-validator project! We welcome contributions from the community.
Please note we have a code of conduct. Please follow it in all your interactions with the project.
Getting Started
Prerequisites
- Python 3.13 or higher
- uv for dependency management
- Git for version control
Setting Up Your Development Environment
Fork the repository on GitHub
Clone your fork locally:
git clone https://github.com/YOUR_USERNAME/sgb-data-validator.git
cd sgb-data-validator
Install dependencies using uv:
pip install uv
uv sync
Set up your environment variables:
cp example.env .env
# Edit .env with your configuration
Verify your setup by running tests:
uv run python -m pytest test/
Development Workflow
Before Starting Work
- Check for existing issues: Search the issue tracker to see whether someone is already working on the same problem
- Create or comment on an issue: Discuss your proposed changes before starting work
- Create a feature branch: Use descriptive branch names like feature/add-validation or fix/iconclass-bug
Making Changes
- Write clear, focused commits: Each commit should represent a single logical change
- Follow the code style:
  - To check for issues, run: uv run ruff check .
  - To format code, run: uv run ruff format .
- Add tests: Ensure your changes are covered by tests
- Update documentation: Update README.md, docstrings, and other docs as needed
Code Style Guidelines
- Follow PEP 8 conventions
- Use type hints for function parameters and return values
- Write descriptive docstrings for modules, classes, and functions
- Keep functions focused and modular
- Use meaningful variable and function names
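As an illustration of these guidelines, here is a minimal sketch of a small, focused helper with type hints and a docstring. The function name `normalize_language_code` and its behavior are hypothetical and are not taken from the project's source.

```python
"""Hypothetical helper module, used only to illustrate the style guidelines."""


def normalize_language_code(code: str) -> str:
    """Return a language code in lowercase with surrounding whitespace removed.

    Args:
        code: A two-letter language code such as "DE" or " en ".

    Returns:
        The cleaned, lowercase code.

    Raises:
        ValueError: If the cleaned code is not exactly two ASCII letters.
    """
    cleaned = code.strip().lower()
    if len(cleaned) != 2 or not (cleaned.isascii() and cleaned.isalpha()):
        raise ValueError(f"Not a two-letter language code: {code!r}")
    return cleaned
```

The function does one thing, names its input and output types, and documents the error case, so a reviewer can check its behavior without reading the callers.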
Testing
Run the test suite before submitting:
# Run all tests
uv run python -m pytest test/ -v
# Run specific test file
uv run python test/test_validation.py
# Run with coverage
uv run python -m pytest test/ --cov=src --cov-report=html
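If you are adding a new test, the following sketch shows one possible shape for a pytest-style test module. The helper `has_language_code_shape` and the test names are hypothetical and exist only for this example; real tests should import and exercise the project's own code.

```python
# Hypothetical example: neither the helper nor the test names come from the project.


def has_language_code_shape(code: str) -> bool:
    """Return True if `code` looks like a two-letter lowercase language code."""
    return len(code) == 2 and code.isascii() and code.isalpha() and code.islower()


def test_accepts_two_letter_lowercase_codes() -> None:
    # pytest collects functions whose names start with "test_".
    assert has_language_code_shape("de")
    assert has_language_code_shape("en")


def test_rejects_malformed_codes() -> None:
    # Cover the failure cases as well as the happy path.
    assert not has_language_code_shape("DE")
    assert not has_language_code_shape("deu")
    assert not has_language_code_shape("")
```

Each test function checks one behavior, which keeps failures easy to interpret when the suite runs.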
Running the Linter and Formatter
# Check for style issues
uv run ruff check .
# Auto-format code
uv run ruff format .
# Check and fix in one command
uv run ruff check . --fix
Pull Request Process
Update your branch: Ensure your branch is up to date with the main branch:
git fetch upstream
git rebase upstream/main
Run tests and linters: Verify everything passes:
uv run ruff check .
uv run ruff format .
uv run python -m pytest test/
Update documentation:
- Update README.md if you’ve changed functionality
- Update CHANGELOG.md with a brief description of your changes
- Update docstrings and code comments
Create a pull request:
- Write a clear title summarizing the change
- Provide a detailed description of what changed and why
- Reference any related issues (e.g., “Fixes #123”)
- Include examples or screenshots if applicable
Respond to feedback: Be responsive to review comments and make requested changes promptly
Versioning: We use SemVer for versioning. Maintainers will handle version bumps during the release process.
Repository Structure
Understanding the repository layout:
sgb-data-validator/
├── src/                        # Source code
│   ├── models.py               # Pydantic data models
│   ├── vocabularies.py         # Controlled vocabulary loader
│   ├── iconclass.py            # Iconclass notation validator
│   ├── profiling.py            # Data profiling utilities
│   └── api.py                  # API client
├── test/                       # Test suite
│   ├── test_validation.py      # Validation tests
│   ├── test_iconclass.py       # Iconclass tests
│   └── ...                     # Other test files
├── data/                       # Data files
│   └── raw/                    # Raw input data
│       └── vocabularies.json   # Controlled vocabularies
├── examples/                   # Usage examples
│   ├── api_usage.py            # API examples
│   └── iconclass_usage.py      # Iconclass examples
├── validate.py                 # Main validation script
├── main.py                     # Alternative entry point
├── README.md                   # Main documentation
├── IMPLEMENTATION.md           # Implementation details
├── CONTRIBUTING.md             # This file
└── pyproject.toml              # Project dependencies
Types of Contributions
Reporting Bugs
- Use the issue tracker
- Include detailed steps to reproduce
- Provide error messages, logs, and system information
- Mention the version you’re using
Suggesting Enhancements
- Check if the feature has already been suggested
- Clearly describe the feature and its use case
- Explain why it would be useful to the project
- Provide examples or mockups if applicable
Improving Documentation
- Fix typos and clarify unclear sections
- Add examples and tutorials
- Improve code comments and docstrings
- Keep documentation in sync with code changes
Writing Code
- Bug fixes
- New features
- Performance improvements
- Test coverage improvements
- Code refactoring
Commit Message Guidelines
Write clear, concise commit messages:
- Use the imperative mood (“Add feature” not “Added feature”)
- Keep the first line under 72 characters
- Reference issues and pull requests when applicable
- Provide additional context in the commit body if needed
Example:
Add ISO 639-1 language validation

Implements validation of language codes against the ISO 639-1 standard.
Also updates the README with usage examples.

Fixes #42
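The mechanical parts of these guidelines can be checked automatically. The sketch below is a hypothetical helper, not part of the project: `check_commit_message` is an assumed name, and the blank-line check reflects the general Git convention rather than a rule stated in this guide. It does not attempt to check the imperative mood.

```python
def check_commit_message(message: str, max_subject_length: int = 72) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.strip().splitlines()
    if not lines:
        return ["Commit message is empty."]

    problems: list[str] = []
    subject = lines[0]
    # The guidelines ask for a first line under 72 characters.
    if len(subject) >= max_subject_length:
        problems.append(
            f"Subject line has {len(subject)} characters; "
            f"keep it under {max_subject_length}."
        )
    # Git convention (an assumption here): a blank line separates subject and body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("Add a blank line between the subject and the body.")
    return problems
```

A function like this could be called from a local commit hook, although whether the project uses such hooks is not stated here.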
Questions or Need Help?
- Open an issue for questions
- Tag maintainers (@maehr) if you need guidance
- Be patient and respectful when seeking help
License
By contributing, you agree that your contributions will be licensed under the same licenses as the project: