A multi-modal AI platform that analyzes text, images, audio, and video through a single FastAPI backend. It is built with a student-friendly approach using modern AI/ML technologies.
- Multi-Modal Processing: Text, Image, Audio, and Video analysis
- Vector Database Integration: Weaviate for semantic search and similarity
- Cloud Storage: Google Cloud Storage for file management
- RESTful API: FastAPI-based backend with comprehensive endpoints
- Real-time Processing: Async processing with background tasks
- Comprehensive Testing: Full test suite with pytest
- Student-Friendly: Designed for learning with detailed documentation
- Python 3.9+
- pip package manager
- Git
- Google Cloud Platform account (free tier)
- Weaviate Cloud account (free tier)
- Clone the repository
  git clone <repository-url>
  cd "AI Projects/multi Model AI"
- Create a virtual environment
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Google Cloud Storage Setup
  - Create a project at Google Cloud Console
  - Enable Cloud Storage API
  - Create a bucket named multimodal-student-storage
  - Download service account key as gcs-key.json
- Weaviate Cloud Setup
  - Sign up at Weaviate Cloud
  - Create a free cluster named multimodal-student
  - Note your cluster URL and API key
- Update environment variables
  # API Configuration
  API_HOST=localhost
  API_PORT=8000
  DEBUG=True
  # Database
  DATABASE_URL=sqlite:///./app.db
  # Cloud Storage
  GCS_BUCKET_NAME=multimodal-student-storage
  GCS_KEY_PATH=./gcs-key.json
  # Weaviate Vector Database
  WEAVIATE_URL=https://your-cluster.weaviate.network
  WEAVIATE_API_KEY=your-api-key
  # AI Model Configuration
  OPENAI_API_KEY=your_openai_api_key
  MODEL_CONFIG_PATH=config/models.json
- Initialize the database
  python backend/database.py
- Start the development server
  uvicorn backend.main:app --reload
- Access the API documentation
  - Swagger UI: http://localhost:8000/docs
  - ReDoc: http://localhost:8000/redoc
- Test the API
  # Health check
  curl http://localhost:8000/health
  # Process an image (replace with your API key)
  curl -X 'POST' \
    'http://localhost:8000/process-image' \
    -H 'X-API-Key: student-api-key-123' \
    -F 'file=@/path/to/your/test.jpg'
- Run tests
  pytest
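
The image upload from the "Test the API" step can also be issued from Python. The sketch below is a minimal illustration that uses only the standard library: it builds the same multipart request as the curl command (form field `file`, header `X-API-Key`, endpoint `/process-image`). The function name `build_image_request` and the `base_url` default are assumptions for this example, not part of the project.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_image_request(
    image_path: str,
    api_key: str,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build a multipart/form-data POST for the /process-image endpoint."""
    path = Path(image_path)
    # Guess the MIME type from the extension; fall back to a generic type.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    boundary = uuid.uuid4().hex

    # One form part named "file", mirroring `-F 'file=@...'` in the curl call.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    return urllib.request.Request(
        f"{base_url}/process-image",
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Usage (requires the development server to be running):
#   request = build_image_request("/path/to/your/test.jpg", "student-api-key-123")
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode())
```

Building the request separately from sending it keeps the multipart encoding easy to inspect without a running server.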
multi Model AI/
├── backend/
│   ├── main.py                  # FastAPI application entry point
│   ├── database.py              # SQLite database setup and operations
│   ├── vector_db.py             # Weaviate vector database integration
│   ├── storage.py               # Google Cloud Storage operations
│   ├── auth.py                  # Authentication and authorization
│   └── services/
│       ├── text_processing.py   # Text analysis and embeddings
│       ├── image_processing.py  # Image analysis and description
│       ├── audio_processing.py  # Audio transcription and analysis
│       └── video_processing.py  # Video analysis and processing
├── docs/
│   └── design.md                # System architecture and design decisions
├── tests/
│   ├── unit/                    # Unit tests
│   ├── integration/             # Integration tests
│   └── performance/             # Performance tests
├── config/
│   └── models.json              # AI model configurations
├── requirements.txt             # Python dependencies
├── README.md                    # This file
├── .env.example                 # Environment variables template
└── .gitignore                   # Git ignore rules
Create a .env file with the following variables:
# API Configuration
API_HOST=localhost
API_PORT=8000
DEBUG=True
# Database
DATABASE_URL=sqlite:///./app.db
# Cloud Storage
GCS_BUCKET_NAME=multimodal-student-storage
GCS_KEY_PATH=./gcs-key.json
# Weaviate Vector Database
WEAVIATE_URL=https://your-cluster.weaviate.network
WEAVIATE_API_KEY=your-api-key
# AI Model Configuration
OPENAI_API_KEY=your_openai_api_key
MODEL_CONFIG_PATH=config/models.json
# Security
SECRET_KEY=your-secret-key-here
API_KEY=student-api-key-123

Create config/models.json to configure AI models:
{
"text": {
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"sentiment_model": "distilbert-base-uncased-finetuned-sst-2-english",
"summarization_model": "sshleifer/distilbart-cnn-12-6"
},
"image": {
"description_model": "microsoft/git-base-coco",
"object_detection_model": "facebook/detr-resnet-50"
},
"audio": {
"transcription_model": "openai/whisper-base",
"sentiment_model": "facebook/wav2vec2-base"
}
}

- Design Documentation - System architecture and design decisions
- API Documentation - Interactive API documentation
- Testing Guide - Testing strategies and guidelines
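
To show how the `.env` file and `config/models.json` described earlier fit together, here is a minimal standard-library sketch that parses `KEY=VALUE` lines and then loads the JSON file named by `MODEL_CONFIG_PATH`. The helper names `parse_env_file` and `load_model_config` are hypothetical; the project itself may well rely on a library such as python-dotenv instead.

```python
import json
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as
        # sqlite:///./app.db or URLs with query strings stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_model_config(env: dict[str, str]) -> dict:
    """Load the model configuration file named by MODEL_CONFIG_PATH."""
    # Assumed fallback: the path shown in the README's .env example.
    config_path = env.get("MODEL_CONFIG_PATH", "config/models.json")
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage:
#   env = parse_env_file(".env")
#   models = load_model_config(env)
#   print(models["text"]["embedding_model"])
```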
# Run all tests
pytest
# Run with coverage
pytest --cov=backend
# Run specific test categories
pytest tests/unit/
pytest tests/integration/
pytest tests/performance/

- Unit Tests: Test individual components in isolation
- Integration Tests: Test component interactions
- Performance Tests: Test system performance under load
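
As an illustration of the unit-test category, the sketch below tests one small function in isolation, with no server, database, or network involved. The helper `is_valid_api_key` is hypothetical and only stands in for the kind of logic that might live in `backend/auth.py`; the test functions follow pytest's `test_*` naming convention so that `pytest` would collect them.

```python
import hmac


def is_valid_api_key(provided: str, expected: str) -> bool:
    """Hypothetical helper: compare an X-API-Key header value to the configured key."""
    if not provided or not expected:
        return False
    # compare_digest runs in constant time, which avoids leaking
    # how many leading characters of the key matched.
    return hmac.compare_digest(provided.encode(), expected.encode())


def test_accepts_matching_key():
    assert is_valid_api_key("student-api-key-123", "student-api-key-123")


def test_rejects_wrong_key():
    assert not is_valid_api_key("wrong-key", "student-api-key-123")


def test_rejects_missing_key():
    assert not is_valid_api_key("", "student-api-key-123")
```

Saved under `tests/unit/`, a file like this would run with `pytest tests/unit/`.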
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Write tests for new features
- Update documentation for API changes
- Use type hints in Python code
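
For contributors new to these guidelines, here is a short sketch of what a PEP 8 compliant, type-hinted function might look like. The function `detect_modality` and its extension table are invented for this example and do not describe the formats the project actually accepts.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table; the project's real accepted formats may differ.
MODALITY_BY_EXTENSION: dict[str, str] = {
    ".txt": "text",
    ".jpg": "image",
    ".png": "image",
    ".wav": "audio",
    ".mp3": "audio",
    ".mp4": "video",
}


def detect_modality(filename: str) -> Optional[str]:
    """Return the modality for a file name, or None if the extension is unknown."""
    return MODALITY_BY_EXTENSION.get(Path(filename).suffix.lower())
```

The `Optional[str]` return type makes the "unknown extension" case explicit to callers and to type checkers, and it works on Python 3.9, the minimum version listed in the prerequisites.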
This project is licensed under the MIT License - see the LICENSE file for details.
If you encounter any issues or have questions:
- Check the documentation
- Search existing issues
- Create a new issue with detailed information