ABC Project (AI Benchmark Cluster)
Overview
ABC (AI Benchmark Cluster) is an LLM benchmarking platform that evaluates AI models against human educational standards. It tests models across multiple subjects and educational levels, from elementary school to PhD, using Ollama for model execution.
Key Features
- Educational Level Benchmarking: Compare LLM performance against:
  - 5th Grade Level
  - High School Level
  - Masters Level
  - PhD Level
- Subject Areas:
  - Mathematics
  - Computer Science
  - Problem Solving
  - General Reasoning
  - Grammar
  - Creative Writing
- Automated Documentation: Self-generating performance reports and analysis through GitLab CI/CD pipelines
- Pass/Fail Grading: Objective evaluation criteria for each educational level
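As a rough illustration of the pass/fail idea, the sketch below grades a score against a per-level pass mark. The threshold values, the `PASS_THRESHOLDS` name, and the `grade` function are all hypothetical; the README does not state ABC's actual criteria.

```python
from dataclasses import dataclass

# Hypothetical pass marks per educational level (fraction of questions correct).
# These numbers are illustrative assumptions, not ABC's real criteria.
PASS_THRESHOLDS = {
    "5th_grade": 0.60,
    "high_school": 0.70,
    "masters": 0.80,
    "phd": 0.90,
}


@dataclass(frozen=True)
class GradeResult:
    level: str
    score: float
    passed: bool


def grade(level: str, correct: int, total: int) -> GradeResult:
    """Return a pass/fail verdict for one model at one educational level."""
    if total <= 0:
        raise ValueError("total must be positive")
    if level not in PASS_THRESHOLDS:
        raise KeyError(f"unknown level: {level}")
    score = correct / total
    return GradeResult(level, score, score >= PASS_THRESHOLDS[level])
```

Under these assumed thresholds, `grade("high_school", 15, 20)` would pass, while the same 15 of 20 at the `"masters"` level would not.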
Directory Structure
abc/
├── docs/ # Documentation and benchmark results
│ ├── results/ # Auto-generated benchmark results
│ ├── analysis/ # Performance analysis reports
│ └── comparisons/ # Educational level comparisons
├── src/ # Source code
│ ├── analysis/ # Analysis and metrics
│ ├── benchmarking/ # Core benchmarking system
│ ├── costs/ # Resource usage tracking
│ ├── database/ # Results storage
│ ├── pipeline/ # CI/CD pipeline integration
│ ├── runner/ # Ollama model runners
│ └── testing/ # Test suites by subject
├── tests/ # Test framework
└── templates/ # Report templates
Requirements
Development Environment
- WSL (Windows Subsystem for Linux)
- Python 3.12 or higher with pyenv and uv
- Ollama
- Docker & Docker Compose
- GitLab Runner (for CI/CD)
- glab CLI tool
- kubectl and helm for Kubernetes deployments
Environment Validation
Run the environment check script to verify your setup:
./scripts/check_dev.sh
This script will validate the installation of all required tools and provide installation instructions for any missing components.
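The contents of check_dev.sh are not shown here, but the kind of check it describes can be sketched in Python. The executable names below are assumptions derived from the requirements list, and `find_missing` is a hypothetical helper, not part of the project.

```python
import shutil

# Executable names are assumptions based on the requirements list;
# the real check_dev.sh may look for different names or also check versions.
REQUIRED_TOOLS = [
    "python3", "pyenv", "uv", "ollama", "docker",
    "gitlab-runner", "glab", "kubectl", "helm",
]


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report(missing):
    """Format a one-line status message for the given missing tools."""
    if missing:
        return "Missing tools: " + ", ".join(missing)
    return "All required tools found."
```

Calling `print(report(find_missing(REQUIRED_TOOLS)))` gives a quick overview; unlike the real script, this sketch does not print installation instructions.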
Recommended Setup
The recommended way to run ABC is with Docker Compose, which ensures a consistent environment and dependencies across all platforms.
Installation
Using Docker Compose (Recommended)
- Clone the repository:
git clone https://gitlab.com/ai9804501/abc.git
cd abc
- Build and start services:
docker-compose up -d
Manual Installation (Alternative)
- Clone the repository:
git clone https://gitlab.com/ai9804501/abc.git
cd abc
- Install Ollama:
curl -fsSL https://ollama.ai/install.sh | sh
- Ensure Python 3.12 or higher is installed:
python3 --version # Should output Python 3.12.x or later
- Install uv:
pip install uv
- Create virtual environment and install dependencies:
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
Running Benchmarks
Using Docker Compose (Recommended)
- Run benchmarks:
docker-compose exec app python -m src.pipeline.cli run-benchmarks
Manual Method
- Start Ollama service:
ollama serve
- Pull required models:
ollama pull llama2
# Add other models as needed
- Run benchmarks:
python -m src.pipeline.cli run-benchmarks
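To see what a single benchmark question might look like at the Ollama level, here is a minimal sketch that talks to Ollama's HTTP API directly. It assumes `ollama serve` is listening on its default port 11434 and that `llama2` has been pulled. The `build_request` and `ask` helpers are hypothetical and do not reflect how `src.runner` is actually implemented.

```python
import json
import urllib.request

# Default local endpoint of a running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask("llama2", "What is 7 * 8?")` returns the model's answer as a string, which a grader could then compare against the expected result.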
Benchmark Reports
Reports are automatically generated in the GitLab CI pipeline and can be found in:
- Pipeline artifacts under docs/results/
- Project wiki (auto-updated)
- Generated site at pages/benchmarks/
Sample Report Structure
- Overall Performance Summary
- Educational Level Comparisons
- Subject-Specific Analysis
- Pass/Fail Statistics
- Resource Usage Metrics
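The pass/fail portion of such a report could be produced along the following lines. This is only a sketch: the record fields (`level`, `subject`, `passed`) and both function names are assumptions for illustration, not the project's actual schema or templates.

```python
from collections import defaultdict


def pass_rates(results, key):
    """Return {group: fraction passed} for records grouped by `key`."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for record in results:
        group = record[key]
        totals[group] += 1
        if record["passed"]:
            passes[group] += 1
    return {group: passes[group] / totals[group] for group in totals}


def render_summary(results):
    """Render pass rates by educational level and by subject as plain text."""
    lines = ["Pass/Fail Statistics"]
    for key in ("level", "subject"):
        for group, rate in sorted(pass_rates(results, key).items()):
            lines.append(f"- {key} {group}: {rate:.0%} passed")
    return "\n".join(lines)
```

Each record is a plain dictionary such as `{"level": "phd", "subject": "Mathematics", "passed": False}`, so the same helper can feed both the level comparison and the subject-specific sections.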
Contributing
- Create a new branch:
git checkout -b feature/your-feature-name
- Run tests:
pytest
- Submit merge request
DevOps Setup
CI/CD Pipeline
The project uses GitLab CI/CD with the following stages:
- Setup: Prepares the Python environment
- Test: Runs unit and integration tests
- Benchmark: Executes model benchmarks
- Analyze: Processes benchmark results
- Document: Generates documentation and updates wiki
- Build: Creates Docker images
- Deploy: Deploys to Kubernetes environments
- Cleanup: Manages environment resources
Kubernetes Deployment
The application was set up to be deployed to Kubernetes using Helm, but the Kubernetes deployment configuration has been removed from this project. Please use Docker Compose for deployment as described above. The remaining steps are kept for reference:
- Configure kubectl context:
kubectl config use-context your-cluster-context
- Deploy to staging:
# Kubernetes deployment configuration has been removed
# Please refer to Docker Compose for deployment
GitLab Configuration
Required GitLab CI/CD variables:
- KUBE_CONFIG: Base64-encoded kubeconfig file
- CI_REGISTRY_USER: GitLab registry username
- CI_REGISTRY_PASSWORD: GitLab registry password
- GITLAB_TOKEN: Token for wiki updates
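A pipeline job could fail fast when one of these variables is missing. The sketch below shows one way to do that. `missing_variables` is a hypothetical helper, not part of the project, and it only checks that each variable is set and non-empty, not that its value is valid.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARIABLES = (
    "KUBE_CONFIG",
    "CI_REGISTRY_USER",
    "CI_REGISTRY_PASSWORD",
    "GITLAB_TOKEN",
)


def missing_variables(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

A job script could call `missing_variables()` at startup and exit with an error message listing the returned names if the list is not empty.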
Pre-commit Hooks
Install pre-commit hooks to ensure code quality:
uv pip install pre-commit
pre-commit install
This will run linters and formatters before each commit.
License
MIT License - see LICENSE file for details