Contributing to Kakashi
๐ Welcome Contributors!โ
Thank you for your interest in contributing to Kakashi! This guide will help you understand the codebase, development workflow, and how to make meaningful contributions.
๐๏ธ Understanding the Codebaseโ
Project Structureโ
kakashi/
โโโ kakashi/ # Main package
โ โโโ __init__.py # Public API exports
โ โโโ core/ # Core implementation
โ โโโ logger.py # Main Logger and AsyncLogger classes
โ โโโ records.py # LogRecord, LogContext, LogLevel
โ โโโ config.py # Configuration system
โ โโโ pipeline.py # Pipeline processing components
โโโ performance_tests/ # Performance validation suite
โ โโโ validate_performance.py
โโโ documentation/ # Docusaurus documentation site
โโโ tests/ # Test suite
โโโ pyproject.toml # Package configuration
โโโ README.md # Project overview
Core Architecture Overviewโ
Kakashi is built around these key principles:
- Performance-First: Every design decision prioritizes performance
- Thread Safety: Zero contention through thread-local storage
- Memory Efficiency: Minimal allocations and buffer pooling
- Clean API: Simple, intuitive interface for developers
๐ง Development Setupโ
Prerequisitesโ
- Python 3.7+
- Git
- Virtual environment tool (venv, virtualenv, or conda)
Local Developmentโ
-
Clone the repository:
git clone https://github.com/IntegerAlex/kakashi.git
cd kakashi -
Create virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate -
Install development dependencies:
pip install -e .[dev]
๐งช Testing Strategyโ
Test Categoriesโ
Unit Testsโ
Test individual components in isolation:
import pytest
from kakashi.core.logger import Logger
class TestLogger:
def test_logger_creation(self):
"""Test basic logger creation."""
logger = Logger("test.logger")
assert logger.name == "test.logger"
assert logger._min_level == 20 # INFO level
def test_level_filtering(self):
"""Test level filtering works correctly."""
logger = Logger("test", min_level=30) # WARNING and above
# DEBUG should be filtered out
logger._log(10, "debug message") # No output expected
# WARNING should pass through
logger._log(30, "warning message") # Should be processed
Performance Testsโ
Test performance characteristics:
import pytest
import time
from concurrent.futures import ThreadPoolExecutor
@pytest.mark.performance
class TestLoggerPerformance:
def test_single_thread_throughput(self):
"""Test single-thread logging throughput."""
logger = Logger("perf_test")
# Warm up
for _ in range(1000):
logger._log(20, "warm up message")
# Benchmark
start_time = time.time()
num_logs = 100000
for i in range(num_logs):
logger._log(20, f"benchmark message {i}")
elapsed = time.time() - start_time
throughput = num_logs / elapsed
# Assert minimum throughput
assert throughput > 50000, f"Throughput {throughput:.0f} logs/sec too low"
๐ Contributing Workflowโ
Before Startingโ
- Open an issue to discuss the feature or bug fix
- Check existing code for similar functionality
- Review architecture to understand design patterns
- Write tests first (TDD approach recommended)
Feature Development Processโ
-
Create feature branch:
git checkout -b feature/my-feature -
Write failing tests for the new functionality
-
Implement the feature following existing patterns
-
Make tests pass with minimal code changes
-
Refactor and optimize while keeping tests green
-
Add documentation and examples
-
Submit pull request with clear description
๐ Code Style & Standardsโ
Python Style Guideโ
- PEP 8 compliance enforced by
blackandflake8 - Type hints required for all public APIs
- Docstrings required for all public functions and classes
- Line length: 88 characters (black default)
Code Formattingโ
def create_high_performance_logger(
name: str,
min_level: int = 20,
batch_size: int = 100
) -> Logger:
"""Create a high-performance logger instance.
Args:
name: Logger name (typically __name__)
min_level: Minimum log level (default: INFO)
batch_size: Batch size for I/O optimization
Returns:
Configured Logger instance
"""
return Logger(name, min_level, batch_size)
๐ Debugging & Profilingโ
Performance Profilingโ
import cProfile
import pstats
from kakashi.core.logger import Logger
def profile_logging_performance():
"""Profile logging performance."""
logger = Logger("profile_test")
# Profile logging operations
profiler = cProfile.Profile()
profiler.enable()
# Perform logging operations
for i in range(10000):
logger._log(20, f"test message {i}")
profiler.disable()
# Analyze results
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10) # Top 10 functions
Memory Profilingโ
import tracemalloc
import gc
from kakashi.core.logger import Logger
def profile_memory_usage():
"""Profile memory usage during logging."""
tracemalloc.start()
logger = Logger("memory_test")
# Take snapshot before logging
snapshot1 = tracemalloc.take_snapshot()
# Perform logging operations
for i in range(1000):
logger._log(20, f"memory test message {i}")
# Force garbage collection
gc.collect()
# Take snapshot after logging
snapshot2 = tracemalloc.take_snapshot()
# Analyze memory usage
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("Top memory allocations:")
for stat in top_stats[:10]:
print(stat)
๐ฏ Common Contribution Areasโ
Performance Improvementsโ
-
Hot Path Optimization
- Reduce CPU cycles in critical paths
- Optimize memory allocations
- Improve cache locality
-
Concurrency Enhancements
- Better thread scaling
- Lock-free algorithms
- Improved batch processing
-
Memory Optimization
- Buffer pooling strategies
- Object reuse patterns
- Garbage collection optimization
Feature Additionsโ
-
New Output Formats
- JSON formatters
- Custom serialization
- Template-based formatting
-
Additional Sinks
- Network logging
- Database logging
- Cloud service integration
๐ Pull Request Guidelinesโ
PR Checklistโ
- Tests pass: All existing tests continue to pass
- New tests: New functionality has comprehensive tests
- Documentation: Public APIs are documented
- Type hints: All new code has proper type annotations
- Performance: No significant performance regressions
- Backwards compatibility: Changes don't break existing APIs
PR Description Templateโ
## Summary
Brief description of the changes.
## Changes
- List of specific changes made
- Include any breaking changes
## Testing
- Description of tests added
- Performance impact (if any)
## Documentation
- Link to updated documentation
- Examples of new functionality
## Checklist
- [ ] Tests pass
- [ ] Documentation updated
- [ ] No breaking changes (or clearly documented)
- [ ] Performance tested
๐ Release Processโ
Version Numberingโ
Kakashi follows Semantic Versioning:
- MAJOR: Breaking changes
- MINOR: New features, backwards compatible
- PATCH: Bug fixes, backwards compatible
Release Checklistโ
- Update version in
pyproject.tomland__init__.py - Update CHANGELOG.md with release notes
- Run full test suite including performance tests
- Build and test package:
python -m build && pip install dist/*.whl - Create release tag:
git tag v0.2.0 - Push to PyPI:
twine upload dist/* - Create GitHub release with release notes
๐ค Communityโ
Getting Helpโ
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: Questions and general discussion
- Email: Direct contact for security issues
Code of Conductโ
We follow the Contributor Covenant code of conduct. Please be respectful and inclusive in all interactions.
๐ง Development Tools & Workflowโ
Pre-commit Hooksโ
Install pre-commit hooks to ensure code quality:
# Install pre-commit
pip install pre-commit
# Install git hooks
pre-commit install
# Run on all files
pre-commit run --all-files
Code Quality Toolsโ
# Format code with black
black kakashi/ tests/
# Check code style with flake8
flake8 kakashi/ tests/
# Run type checking with mypy
mypy kakashi/
# Run security checks with bandit
bandit -r kakashi/
Continuous Integrationโ
Our CI pipeline runs on every PR and includes:
- Unit Tests: pytest with coverage reporting
- Performance Tests: Automated performance regression detection
- Code Quality: black, flake8, mypy, bandit
- Documentation: Build and validate docs
- Package Build: Test package installation
๐ Performance Testing Guidelinesโ
Benchmarking Standardsโ
When contributing performance improvements:
- Baseline Measurement: Always measure current performance first
- Statistical Significance: Run benchmarks multiple times (min 5 runs)
- Environment Consistency: Use same hardware/OS for comparisons
- Memory Profiling: Include memory usage in performance analysis
Performance Test Examplesโ
import pytest
import time
import statistics
from kakashi.core.logger import Logger
class TestLoggerPerformance:
def test_logging_throughput_benchmark(self):
"""Benchmark logging throughput with statistical analysis."""
logger = Logger("throughput_test")
num_logs = 100000
run_times = []
# Multiple benchmark runs
for run in range(5):
start_time = time.perf_counter()
for i in range(num_logs):
logger._log(20, f"benchmark message {i}")
end_time = time.perf_counter()
run_times.append(end_time - start_time)
# Calculate statistics
mean_time = statistics.mean(run_times)
std_dev = statistics.stdev(run_times)
throughput = num_logs / mean_time
# Performance assertions
assert throughput > 50000, f"Throughput {throughput:.0f} logs/sec below threshold"
assert std_dev / mean_time < 0.1, "Performance too variable"
print(f"Throughput: {throughput:.0f} logs/sec")
print(f"Mean time: {mean_time:.4f}s ยฑ {std_dev:.4f}s")
๐ Debugging Common Issuesโ
Common Development Problemsโ
1. Import Errorsโ
# If you get import errors, ensure you're in the right environment
source venv/bin/activate
pip install -e .
# Check PYTHONPATH
echo $PYTHONPATH
2. Test Failuresโ
# Run specific test with verbose output
pytest tests/test_logger.py::TestLogger::test_logger_creation -v -s
# Run with coverage
pytest --cov=kakashi tests/
# Debug specific test
pytest tests/test_logger.py::TestLogger::test_logger_creation --pdb
3. Performance Regressionsโ
# Run performance tests only
pytest -m performance
# Compare with previous results
python performance_tests/validate_performance.py --compare-baseline
Debugging Tipsโ
- Use logging: Add debug logs to understand execution flow
- Profile locally: Use cProfile and memory_profiler for performance issues
- Check dependencies: Ensure all dependencies are correctly installed
- Verify environment: Check Python version and virtual environment
๐ Documentation Standardsโ
Docstring Formatโ
Follow Google-style docstrings:
def process_log_record(record: LogRecord, context: LogContext) -> str:
"""Process a log record with context information.
Args:
record: The log record to process
context: Additional context information
Returns:
Formatted log message string
Raises:
ValueError: If record is invalid
TypeError: If context is wrong type
Example:
>>> record = LogRecord(level=20, message="test")
>>> context = LogContext(user_id="123")
>>> process_log_record(record, context)
'test [user_id=123]'
"""
if not record:
raise ValueError("Record cannot be None")
# Implementation here
return formatted_message
API Documentationโ
- Public APIs: Must have comprehensive docstrings
- Examples: Include usage examples for complex functions
- Type hints: All parameters and return values must be typed
- Error handling: Document all possible exceptions
๐ Advanced Development Topicsโ
Memory Managementโ
Understanding Kakashi's memory model:
class MemoryOptimizedLogger:
"""Example of memory optimization patterns used in Kakashi."""
def __init__(self):
# Object pooling for frequently allocated objects
self._buffer_pool = []
self._max_pool_size = 100
def _get_buffer(self):
"""Get buffer from pool or create new one."""
if self._buffer_pool:
return self._buffer_pool.pop()
return bytearray(1024)
def _return_buffer(self, buffer):
"""Return buffer to pool for reuse."""
if len(self._buffer_pool) < self._max_pool_size:
buffer.clear() # Reset buffer
self._buffer_pool.append(buffer)
Thread Safety Patternsโ
import threading
from contextlib import contextmanager
class ThreadSafeComponent:
"""Example of thread safety patterns in Kakashi."""
def __init__(self):
self._local = threading.local()
self._lock = threading.RLock()
@contextmanager
def _thread_context(self):
"""Manage thread-local context safely."""
if not hasattr(self._local, 'context'):
with self._lock:
if not hasattr(self._local, 'context'):
self._local.context = {}
yield self._local.context
๐ Security Considerationsโ
Input Validationโ
import re
from typing import Optional
def validate_log_message(message: str) -> Optional[str]:
"""Validate and sanitize log message input.
Args:
message: Raw log message
Returns:
Sanitized message or None if invalid
"""
if not isinstance(message, str):
return None
# Remove potential injection patterns
sanitized = re.sub(r'[<>"\']', '', message)
# Limit message length
if len(sanitized) > 10000:
return None
return sanitized
Secure Configurationโ
import os
from pathlib import Path
def load_secure_config(config_path: str) -> dict:
"""Load configuration with security checks."""
path = Path(config_path)
# Security checks
if not path.is_file():
raise FileNotFoundError(f"Config file not found: {config_path}")
# Prevent path traversal
if '..' in str(path.resolve()):
raise ValueError("Invalid config path")
# Check file permissions
if path.stat().st_mode & 0o777 != 0o600:
raise PermissionError("Config file has insecure permissions")
# Load and validate config
# Implementation here
return config
๐ Monitoring & Observabilityโ
Health Checksโ
class LoggerHealthCheck:
"""Monitor logger health and performance."""
def __init__(self, logger: Logger):
self.logger = logger
self.metrics = {}
def check_health(self) -> dict:
"""Perform comprehensive health check."""
health_status = {
'status': 'healthy',
'timestamp': time.time(),
'metrics': self._collect_metrics(),
'issues': self._identify_issues()
}
if health_status['issues']:
health_status['status'] = 'degraded'
return health_status
def _collect_metrics(self) -> dict:
"""Collect performance and health metrics."""
return {
'message_count': getattr(self.logger, '_message_count', 0),
'error_count': getattr(self.logger, '_error_count', 0),
'last_message_time': getattr(self.logger, '_last_message_time', 0)
}
๐ฏ Contribution Ideasโ
Good First Issuesโ
- Documentation: Improve docstrings and examples
- Test Coverage: Add tests for edge cases
- Error Handling: Improve error messages and handling
- Performance: Optimize specific code paths
Advanced Contributionsโ
- New Sinks: Implement additional output destinations
- Formatters: Create new log message formats
- Filters: Add sophisticated log filtering
- Metrics: Implement logging metrics and monitoring
Research Areasโ
- Async Performance: Investigate async/await optimizations
- Memory Profiling: Deep dive into memory usage patterns
- Concurrency Models: Explore alternative threading approaches
- Compression: Research log compression techniques
๐ Getting in Touchโ
Communication Channelsโ
- GitHub Issues: For bugs and feature requests
- GitHub Discussions: For questions and ideas
- Email: For security issues (see SECURITY.md)
- Discord: Community chat (if available)
Response Timesโ
- Bug Reports: Within 24 hours
- Feature Requests: Within 48 hours
- Security Issues: Within 12 hours
- General Questions: Within 72 hours
๐ Acknowledgmentsโ
Thank you for contributing to Kakashi! Your contributions help make it a better logging library for everyone. Whether you're fixing a typo, adding a feature, or reporting a bug, every contribution matters.
Remember: Quality over quantity. Take your time to understand the codebase and write good, maintainable code. We're here to help you succeed!
Last updated: 2025-08-27 Contributors: [IntegerAlex]