📸 Image Storage System Guide
This guide covers the long-term image storage solution integrated into HomeBase. It provides reliable, organized storage for images pulled from web services with built-in corruption detection and ML-friendly tagging.
Features
- Automatic Image Fetching: Pull images from web services at a configurable interval (originally every 2-3 minutes; sub-minute intervals are supported)
- Corruption Detection: SHA256 checksums verify data integrity
- Tag-Based Organization: Tag images for machine learning model training
- Efficient Storage: File-based storage with SQLite metadata database
- RESTful API: Complete API for image management and queries
- Scalable: Designed to handle thousands of images
Architecture
┌─────────────────────────────────────────────────────┐
│                Image Fetcher Service                │
│   (Runs every 2-3 minutes, pull from web service)   │
└────────┬────────────────────────────────────────────┘
         │
         ├─────┬──────────────┬──────────────┐
         │     │              │              │
         ▼     ▼              ▼              ▼
     Download Hash       Store File   Insert Metadata
       Image (SHA256)   (data/images)    (SQLite)
         │                                   │
         └─────────────────┬─────────────────┘
                           │
                 ┌─────────▼──────────┐
                 │ REST API Endpoints │
                 │ - List Images      │
                 │ - Add Tags         │
                 │ - Search/Filter    │
                 │ - Verify Integrity │
                 └────────────────────┘
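The hash, store, and metadata steps in the diagram can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in lib/image-fetcher.js or lib/storage.js: the helper name `storeImage`, the filename pattern, and the returned row object (standing in for the SQLite insert) are all hypothetical, and a fixed buffer replaces the real download.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: hash a downloaded buffer, write it to disk,
// and return the metadata row that would be inserted into SQLite.
function storeImage(buffer, sourceUrl, imagesDir) {
  const fileHash = crypto.createHash('sha256').update(buffer).digest('hex');
  // Assumed naming scheme: timestamp plus a short prefix of the hash.
  const filename = `image_${Date.now()}_${fileHash.slice(0, 6)}.jpg`;
  const filePath = path.join(imagesDir, filename);

  fs.mkdirSync(imagesDir, { recursive: true });
  fs.writeFileSync(filePath, buffer);

  return {
    filename,
    source_url: sourceUrl,
    file_path: filePath,
    filesize: buffer.length,
    file_hash: fileHash,
    fetched_at: new Date().toISOString(),
    is_corrupted: false,
  };
}

// Usage with a stand-in buffer and a temporary directory instead of data/images.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'homebase-demo-'));
const row = storeImage(Buffer.from('abc'), 'https://example.com/image.jpg', demoDir);
console.log(row.filename, row.file_hash);
```

Hashing the buffer before writing means the stored checksum reflects what was downloaded, so a later disk-level change shows up as a mismatch.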
Quick Start
1. Install Dependencies
npm install
2. Initialize the System
node setup.js init
This will:
- Create the SQLite database
- Set up data directory structure
- Load configuration
- Test the system
3. Configure Image Sources
Edit image-sources.json:
{
  "sources": [
    {
      "name": "Webcam Feed",
      "url": "https://your-service.com/image.jpg",
      "tags": ["webcam", "monitoring"],
      "enabled": true
    }
  ],
  "fetchInterval": 0.033
}
Fetch Intervals (edit fetchInterval to customize):
- 0.0167 = 1 second
- 0.033 = 2 seconds (recommended for fast updates)
- 0.05 = 3 seconds
- 2.5 = 2.5 minutes (original default)
4. Start the Server
npm start
The system will automatically start fetching images at the configured interval.
Configuration
image-sources.json
{
  "sources": [
    {
      "name": "Example Source",
      "url": "https://example.com/image.jpg",
      "tags": ["tag1", "tag2"],
      "enabled": true
    }
  ],
  "fetchInterval": 0.033
}
Options:
- fetchInterval: Minutes between fetch cycles. Use decimals for sub-minute intervals:
  - 0.0167 = 1 second
  - 0.033 = 2 seconds
  - 0.05 = 3 seconds
  - 0.167 = 10 seconds
  - 1 = 1 minute
  - 2.5 = 2.5 minutes (original default)
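Because fetchInterval is expressed in minutes, a timer-based fetcher has to convert it to milliseconds before scheduling. The sketch below shows that conversion and a filter for enabled sources; `fetchIntervalToMs` and `enabledSources` are hypothetical names, and the validation rules are an assumption rather than the behavior of setup.js.

```javascript
// Hypothetical helper: convert the fetchInterval setting (minutes) to milliseconds.
function fetchIntervalToMs(minutes) {
  if (typeof minutes !== 'number' || !Number.isFinite(minutes) || minutes <= 0) {
    throw new RangeError('fetchInterval must be a positive number of minutes');
  }
  return Math.round(minutes * 60 * 1000);
}

// Hypothetical helper: keep only the sources that are switched on.
function enabledSources(config) {
  return (config.sources || []).filter((source) => source.enabled === true);
}

// Sample configuration in the shape of image-sources.json.
const config = {
  sources: [
    { name: 'Example Source', url: 'https://example.com/image.jpg', tags: ['tag1', 'tag2'], enabled: true },
    { name: 'Disabled Source', url: 'https://example.com/other.jpg', tags: [], enabled: false },
  ],
  fetchInterval: 0.033,
};

console.log(fetchIntervalToMs(config.fetchInterval)); // 1980
console.log(enabledSources(config).map((s) => s.name)); // [ 'Example Source' ]
```

The result would typically be passed to setInterval. Note that 0.033 minutes is 1980 ms, so the decimal values listed above are approximations of whole seconds.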
API Endpoints
List Images
GET /api/images?page=1&pageSize=50&sort=fetched_at&order=DESC
GET /api/images?tag=webcam # Filter by tag
GET /api/images?sourceUrl=https://... # Filter by source
Response:
{
  "success": true,
  "images": [
    {
      "id": 1,
      "filename": "image_1234567890_abc123.jpg",
      "source_url": "https://...",
      "filesize": 102400,
      "file_hash": "sha256hash",
      "fetched_at": "2023-01-01T12:00:00Z",
      "tags": ["webcam", "monitoring"]
    }
  ],
  "pagination": {
    "page": 1,
    "pageSize": 50,
    "total": 240,
    "pages": 5
  }
}
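A client can assemble the list URL from the documented query parameters and predict the page count from the pagination block. The following is a small sketch; `buildListUrl` and `pageCount` are hypothetical helpers, and the rounding rule (ceiling of total divided by pageSize) is inferred from the sample response rather than confirmed by the server code.

```javascript
// Hypothetical helper: build a list URL from the documented query parameters.
function buildListUrl(baseUrl, { page = 1, pageSize = 50, sort = 'fetched_at', order = 'DESC', tag } = {}) {
  const params = new URLSearchParams({
    page: String(page),
    pageSize: String(pageSize),
    sort,
    order,
  });
  if (tag) params.set('tag', tag); // optional tag filter
  return `${baseUrl}/api/images?${params.toString()}`;
}

// Assumed rule: the last, partially filled page still counts as a page.
function pageCount(total, pageSize) {
  return Math.ceil(total / pageSize);
}

console.log(buildListUrl('http://localhost:3001', { tag: 'webcam' }));
console.log(pageCount(240, 50)); // 5
```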
Get Image Details
GET /api/images/{id}
Download Image
GET /api/images/{id}/download
Fetch New Image
POST /api/images
Content-Type: application/json
{
  "source_url": "https://example.com/image.jpg",
  "tags": ["tag1", "tag2"]
}
Add Tags to Image
POST /api/images/{id}/tags
Content-Type: application/json
{
  "tags": ["newtag1", "newtag2"]
}
List All Tags
GET /api/tags
Response:
{
  "success": true,
  "tags": ["webcam", "monitoring", "test", "example"]
}
Storage Statistics
GET /api/stats
Response:
{
  "success": true,
  "stats": {
    "imageCount": 240,
    "totalSize": 24576000,
    "totalSizeGB": "0.02",
    "fileCount": 240
  }
}
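The sample numbers are consistent with 240 images of 102400 bytes each. A rough sketch of how such a summary could be derived from metadata rows is shown here; `summarizeStorage` is a hypothetical helper, the 1024-based gigabyte divisor is an assumption, and fileCount (presumably a count of files on disk) is left out because it needs a directory listing.

```javascript
// Hypothetical helper: derive summary statistics from image metadata rows.
function summarizeStorage(rows) {
  const totalSize = rows.reduce((sum, row) => sum + row.filesize, 0);
  return {
    imageCount: rows.length,
    totalSize,
    // Assumption: gigabytes are computed with a 1024^3 divisor, two decimals.
    totalSizeGB: (totalSize / 1024 ** 3).toFixed(2),
  };
}

// 240 rows of 102400 bytes each, matching the sample response.
const sampleRows = Array.from({ length: 240 }, () => ({ filesize: 102400 }));
console.log(summarizeStorage(sampleRows));
```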
Verify Image Integrity
POST /api/verify
This checks all images for corruption using their stored checksums.
Cleanup Old Images
POST /api/cleanup
Content-Type: application/json
{
  "daysOld": 30
}
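One plausible reading of daysOld is a cutoff measured against each image's fetched_at timestamp. The sketch below selects rows older than that cutoff; `selectExpired` is a hypothetical helper, and whether the real cleanup uses fetched_at or another timestamp is an assumption.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: return rows fetched more than `daysOld` days before `now`.
function selectExpired(rows, daysOld, now = new Date()) {
  const cutoff = now.getTime() - daysOld * MS_PER_DAY;
  return rows.filter((row) => new Date(row.fetched_at).getTime() < cutoff);
}

// Fixed reference time so the example is reproducible.
const referenceTime = new Date('2023-03-01T00:00:00Z');
const candidates = [
  { id: 1, fetched_at: '2023-01-01T12:00:00Z' }, // 58.5 days old
  { id: 2, fetched_at: '2023-02-20T12:00:00Z' }, // 8.5 days old
];

console.log(selectExpired(candidates, 30, referenceTime).map((r) => r.id)); // [ 1 ]
```

A real cleanup would also need to delete the file at file_path along with the database row so that neither side is left orphaned.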
Delete Image
DELETE /api/images/{id}
Corruption Detection
The system uses SHA256 checksums to detect corruption:
- Storage: When an image is saved, its SHA256 hash is calculated and stored
- Verification: The /api/verify endpoint re-hashes all files and compares them with the stored hashes
- Marking: Corrupted images are marked in the database and excluded from queries
- Recovery: Corrupted files can be re-fetched using the source URL
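The storage and verification steps above can be sketched as follows. This is an illustration only, assuming each metadata row carries file_path and file_hash as in the schema; `sha256OfFile` and `verifyImages` are hypothetical names, and treating a missing file as corrupted is an assumption about the desired behavior.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Compute the SHA256 hex digest of a file's contents.
function sha256OfFile(filePath) {
  return crypto.createHash('sha256').update(fs.readFileSync(filePath)).digest('hex');
}

// Hypothetical helper: re-hash every file and compare with the stored hash.
function verifyImages(rows) {
  return rows.map((row) => {
    let corrupted;
    try {
      corrupted = sha256OfFile(row.file_path) !== row.file_hash;
    } catch (err) {
      corrupted = true; // assumption: a missing or unreadable file counts as corrupted
    }
    return { id: row.id, is_corrupted: corrupted };
  });
}

// Demo: two files stored with the same content, one modified afterwards.
const verifyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'homebase-verify-'));
const goodPath = path.join(verifyDir, 'good.jpg');
const badPath = path.join(verifyDir, 'bad.jpg');
fs.writeFileSync(goodPath, 'abc');
fs.writeFileSync(badPath, 'abc');
const storedHash = sha256OfFile(goodPath);
fs.appendFileSync(badPath, 'x'); // simulate corruption after the hash was recorded

const report = verifyImages([
  { id: 1, file_path: goodPath, file_hash: storedHash },
  { id: 2, file_path: badPath, file_hash: storedHash },
  { id: 3, file_path: path.join(verifyDir, 'missing.jpg'), file_hash: storedHash },
]);
console.log(report);
```

For large files, a streaming hash (feeding fs.createReadStream chunks into the hash object) would avoid loading each file fully into memory.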
Manual Verification
curl -X POST http://localhost:3001/api/verify
Tagging for ML Training
Tags are essential for organizing training datasets:
# Add images with training tags
POST /api/images
{
  "source_url": "https://...",
  "tags": ["dataset_v1", "labeled", "weather-sunny"]
}
# Query all images with specific tag
GET /api/images?tag=weather-sunny
# Get tag statistics
GET /api/tags
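The documented /api/tags response lists tag names only, so per-tag counts for a training set would have to be computed on the client. A minimal sketch, assuming image objects shaped like the List Images response; `countByTag` is a hypothetical helper.

```javascript
// Hypothetical helper: count how many images carry each tag.
function countByTag(images) {
  const counts = new Map();
  for (const image of images) {
    for (const tag of image.tags || []) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return Object.fromEntries(counts);
}

// Sample records in the shape returned by GET /api/images.
const taggedImages = [
  { id: 1, tags: ['dataset_v1', 'labeled', 'weather-sunny'] },
  { id: 2, tags: ['dataset_v1', 'weather-cloudy'] },
  { id: 3, tags: ['dataset_v1', 'labeled', 'weather-sunny'] },
];

console.log(countByTag(taggedImages));
```

Counts like these make class imbalance visible before a dataset is exported.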
Database Schema
Images Table
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Primary key |
| filename | TEXT | Unique filename |
| source_url | TEXT | Original image URL |
| file_path | TEXT | Local file path |
| filesize | INTEGER | File size in bytes |
| file_hash | TEXT | SHA256 hash |
| mime_type | TEXT | Content type |
| fetched_at | DATETIME | When image was fetched |
| is_corrupted | BOOLEAN | Corruption flag |
Tags Table
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Primary key |
| image_id | INTEGER | Foreign key to images |
| tag | TEXT | Tag text |
| created_at | DATETIME | When tag was added |
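For orientation, the two tables could be created with DDL along these lines. The statements are an assumption reconstructed from the column lists above; the actual definitions in lib/database.js may use different constraints, defaults, or indexes.

```javascript
// Assumed DDL reconstructed from the documented columns; not taken from lib/database.js.
const SCHEMA_STATEMENTS = [
  `CREATE TABLE IF NOT EXISTS images (
    id INTEGER PRIMARY KEY,
    filename TEXT UNIQUE NOT NULL,
    source_url TEXT,
    file_path TEXT,
    filesize INTEGER,
    file_hash TEXT,
    mime_type TEXT,
    fetched_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    is_corrupted BOOLEAN DEFAULT 0
  )`,
  `CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY,
    image_id INTEGER NOT NULL,
    tag TEXT NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (image_id) REFERENCES images(id)
  )`,
];

// Print the statements; a real setup would pass each one to the SQLite driver.
for (const statement of SCHEMA_STATEMENTS) {
  console.log(`${statement};`);
}
```

Storing tags in their own table, one row per image and tag, is what allows filtering with a query parameter such as tag=webcam.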
File Structure
homebase/
├── server.js              # Main server
├── package.json           # Dependencies
├── setup.js               # Setup script
├── image-sources.json     # Configuration
├── lib/
│   ├── database.js        # SQLite operations
│   ├── storage.js         # File storage operations
│   └── image-fetcher.js   # Image fetching service
├── routes/
│   └── images.js          # API routes
└── data/                  # Created at runtime
    ├── homebase.db        # SQLite database
    └── images/            # Stored image files
Maintenance
Manual Setup
# Initialize system
node setup.js init
# Test fetch
node setup.js test https://example.com/image.jpg tag1,tag2
# View configuration
node setup.js config
Regular Tasks
# Daily: Verify integrity
curl -X POST http://localhost:3001/api/verify
# Weekly: Cleanup old images (30+ days)
curl -X POST http://localhost:3001/api/cleanup -H "Content-Type: application/json" -d '{"daysOld": 30}'
# Monitor storage
curl http://localhost:3001/api/stats
Performance Tips
- Pagination: Use pageSize=50 or less for large datasets
- Tagging: Use consistent tag names for easier filtering
- Cleanup: Run cleanup weekly to manage storage
- Verification: Run verification monthly to detect issues early
Troubleshooting
No images fetching?
- Check image-sources.json and ensure enabled: true is set
- Verify URL is accessible
- Check server logs for errors
- Test fetch: node setup.js test https://url.jpg
High storage usage?
# Check statistics
curl http://localhost:3001/api/stats
# Cleanup images older than 7 days
curl -X POST http://localhost:3001/api/cleanup \
  -H "Content-Type: application/json" \
  -d '{"daysOld": 7}'
Corrupted files detected?
# Verify all images
curl -X POST http://localhost:3001/api/verify
# Get list of corrupted images
curl 'http://localhost:3001/api/images?corrupted=true'
# Re-fetch if source still available
curl -X POST http://localhost:3001/api/images -H "Content-Type: application/json" -d '{"source_url": "https://..."}'
Advanced Usage
Export Dataset for ML Training
# Get all images with specific tag
curl 'http://localhost:3001/api/images?tag=weather-sunny&pageSize=999' \
  -o dataset.json
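For datasets larger than a single page, an export can walk the pages using the pagination block instead of relying on one large pageSize. The sketch below takes the page-fetching function as a parameter so it can be shown without a running server; `collectAllImages` and `makeFakeFetchPage` are hypothetical, and a real `fetchPage` would issue the GET /api/images request and parse the JSON body.

```javascript
// Hypothetical helper: gather every image by following the pagination metadata.
async function collectAllImages(fetchPage, pageSize = 50) {
  const all = [];
  let page = 1;
  let pages = 1;
  do {
    const body = await fetchPage(page, pageSize);
    all.push(...body.images);
    pages = body.pagination.pages; // trust the server's page count
    page += 1;
  } while (page <= pages);
  return all;
}

// Stand-in for the HTTP call: serves `total` fake images in documented response shape.
function makeFakeFetchPage(total) {
  return async (page, pageSize) => {
    const start = (page - 1) * pageSize;
    const count = Math.max(0, Math.min(pageSize, total - start));
    const images = Array.from({ length: count }, (_, i) => ({ id: start + i + 1 }));
    return {
      success: true,
      images,
      pagination: { page, pageSize, total, pages: Math.ceil(total / pageSize) },
    };
  };
}

collectAllImages(makeFakeFetchPage(240), 50).then((images) => {
  console.log(images.length); // 240
});
```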
Monitor Fetch Status
curl http://localhost:3001/api/fetcher/status
Batch Operations
# Add tags to multiple images (via API loop)
for image_id in 1 2 3 4 5; do
  curl -X POST "http://localhost:3001/api/images/$image_id/tags" \
    -H "Content-Type: application/json" \
    -d '{"tags": ["batch-import"]}'
done
Security Considerations
- Database Access: SQLite database is file-based; protect with filesystem permissions
- Image Storage: Protect data/images directory from unauthorized access
- API Security: Consider adding authentication for production use
- File Validation: System validates MIME types and file sizes
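The guide does not describe how file validation is implemented. One common approach, shown here purely as an assumption, is to check the size against a limit and compare the first bytes of the download with known image signatures instead of trusting the Content-Type header. The 10 MiB limit, the list of accepted types, and the helper names are all hypothetical.

```javascript
const MAX_BYTES = 10 * 1024 * 1024; // assumed upper limit, not from the source

// Well-known leading byte signatures for common image formats.
const SIGNATURES = [
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },
];

// Return the detected MIME type, or null if no signature matches.
function sniffImageType(buffer) {
  const match = SIGNATURES.find(({ bytes }) => bytes.every((b, i) => buffer[i] === b));
  return match ? match.mime : null;
}

// Hypothetical validator combining the size and type checks.
function validateImage(buffer) {
  if (buffer.length === 0 || buffer.length > MAX_BYTES) {
    return { ok: false, reason: 'size' };
  }
  const mime = sniffImageType(buffer);
  return mime ? { ok: true, mime } : { ok: false, reason: 'type' };
}

console.log(validateImage(Buffer.from([0xff, 0xd8, 0xff, 0xe0])));
console.log(validateImage(Buffer.from('<html>not an image</html>')));
```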
Performance Metrics
- Fetch Time: ~1-5 seconds per image (network dependent)
- Database Queries: <100ms for typical queries
- Verification: ~50ms per image
- Storage: ~1KB overhead per image in database
For more information, check the API documentation or run node setup.js help.