Claude Code Batch Processing Complete Guide: Large-scale Processing and CI/CD Automation¶
What You Can Achieve¶
- High-Speed Processing: Large-scale code analysis and processing without interaction
- CI/CD Integration: Pipeline integration for complete automation
- Batch Operations: Simultaneous processing and conversion of multiple files
- Result Analysis: Structured output for easy post-processing
📖 Basic Concepts of Batch Processing¶
Claude Code's batch processing (headless mode, invoked with claude -p) lets you run prompts directly from the command line without the interactive UI. By integrating it into CI/CD pipelines and automation scripts, you can execute large-scale processing tasks such as code analysis, refactoring, and report generation efficiently.
Basic Usage¶
# Basic batch execution
claude -p "Find all files containing the variable 'userId'"
# Data processing using pipes
cat error.log | claude -p "Summarize the key errors in this log file"
# Save output to file
claude -p "Review this code for security vulnerabilities" < app.py > security_report.txt
🚀 Usage Patterns¶
1. Code Analysis and Quality Checking¶
#!/bin/bash
# code_quality_check.sh
echo "🔍 Starting code quality analysis..."
# Security audit
find src -name "*.py" -exec sh -c 'claude -p "Analyze this file for security vulnerabilities and provide a severity rating" < "$1"' _ {} \; > security_audit.txt
# Code complexity analysis
shopt -s globstar  # enable ** to match files recursively
for file in src/**/*.py; do
    echo "Analyzing: $file"
    claude -p "Analyze the complexity of this code and suggest improvements" < "$file" >> complexity_report.txt
done
# Dependency analysis
claude -p "Analyze package.json for outdated dependencies and security issues" < package.json > dependency_report.txt
echo "✅ Code quality analysis complete"
2. Structured Output¶
# JSON format output
claude -p "Analyze this API and return the result in JSON format" --output-format json < api.py
# Streaming output (print mode requires --verbose together with stream-json)
claude -p "Process this large dataset incrementally" --output-format stream-json --verbose < large_dataset.csv
3. Large-scale Refactoring¶
#!/bin/bash
# mass_refactoring.sh
REFACTOR_RULE="Convert class-based components to functional components with hooks"
find src/components -name "*.tsx" | while read -r file; do
    echo "Processing: $file"
    # Create backup
    cp "$file" "$file.backup"
    # Execute refactoring
    claude -p "$REFACTOR_RULE" < "$file" > "$file.new"
    # Verify results
    if [[ -s "$file.new" ]]; then
        mv "$file.new" "$file"
        echo "✅ Refactored: $file"
    else
        echo "❌ Failed: $file"
        rm -f "$file.new"
    fi
done
🔧 CI/CD Integration¶
1. GitHub Actions Implementation¶
name: Code Analysis Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    types: [opened, synchronize]

jobs:
  claude-analysis:
    runs-on: ubuntu-latest
    env:
      # Claude Code reads the API key from this variable in non-interactive runs
      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Install Claude Code
        run: |
          curl -fsSL https://claude.ai/install.sh | bash
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Security Analysis
        run: |
          echo "🔒 Running security analysis..."
          find . -name "*.py" -exec sh -c 'claude -p "Check for security vulnerabilities" < "$1"' _ {} \; > security_report.txt
      - name: Code Quality Check
        run: |
          echo "📊 Running code quality analysis..."
          claude -p "Analyze overall code quality and provide recommendations" --output-format json > quality_report.json
      - name: Generate Summary
        run: |
          echo "📝 Generating summary report..."
          cat security_report.txt quality_report.json | claude -p "Create a summary report in Markdown from these analysis results" > analysis_summary.md
      - name: Upload Reports
        uses: actions/upload-artifact@v4
        with:
          name: analysis-reports
          path: |
            security_report.txt
            quality_report.json
            analysis_summary.md
2. GitLab CI Implementation¶
# .gitlab-ci.yml
stages:
  - analysis
  - report

code_analysis:
  stage: analysis
  image: ubuntu:latest
  before_script:
    - apt-get update && apt-get install -y curl
    - curl -fsSL https://claude.ai/install.sh | bash
    - export PATH="$HOME/.local/bin:$PATH"
    # ANTHROPIC_API_KEY should be defined as a masked CI/CD variable in the project settings
  script:
    - echo "Running Claude Code analysis..."
    - claude -p "Analyze this codebase for maintainability issues" --output-format json > maintainability.json
    - claude -p "Check for performance bottlenecks" --output-format json > performance.json
    - claude -p "Review test coverage and suggest improvements" --output-format json > testing.json
  artifacts:
    paths:
      - "*.json"
    expire_in: 1 week

generate_report:
  stage: report
  dependencies:
    - code_analysis
  script:
    - cat maintainability.json performance.json testing.json | claude -p "Create a comprehensive HTML report from these analysis files" > final_report.html
  artifacts:
    paths:
      - final_report.html
📊 Advanced Usage Examples¶
1. Internationalization Automation¶
#!/bin/bash
# i18n_automation.sh
# Detect files requiring translation
find src \( -name "*.ts" -o -name "*.tsx" \) | xargs grep -l "t(" > translatable_files.txt
# Extract translation keys
while read -r file; do
    echo "Extracting i18n keys from: $file"
    claude -p "Extract all i18n translation keys from this file and output as JSON" < "$file" >> i18n_keys.json
done < translatable_files.txt
# Note: this appends one JSON object per file; merge them (e.g. with jq -s 'add') if a single document is needed
# Generate translation files
mkdir -p locales
for lang in en ja ko zh; do
    echo "Generating translations for: $lang"
    claude -p "Translate these i18n keys to $lang language" < i18n_keys.json > "locales/$lang.json"
done
2. Automatic Documentation Generation¶
#!/bin/bash
# doc_generation.sh
# Generate API documentation
mkdir -p docs/api
find src/api -name "*.py" | while read -r file; do
    echo "Generating API docs for: $file"
    claude -p "Generate comprehensive API documentation in OpenAPI format" < "$file" > "docs/api/$(basename "$file" .py).yaml"
done
# Automatically update README
claude -p "Update README.md based on recent code changes and new features" \
--input-files $(find src -name "*.py" | head -10 | tr '\n' ',') \
--output-format markdown > README_updated.md
# Generate changelog
git log --oneline -10 | claude -p "Generate a user-friendly changelog from these git commits" > CHANGELOG.md
3. Automatic Test Case Generation¶
#!/bin/bash
# test_generation.sh
# Generate unit tests
mkdir -p tests
find src -name "*.py" -not -path "*/tests/*" | while read -r file; do
    test_file="tests/test_$(basename "$file")"
    echo "Generating tests for: $file"
    claude -p "Generate comprehensive unit tests for this Python file using pytest" < "$file" > "$test_file"
    # Validate generated tests
    if python -m py_compile "$test_file"; then
        echo "✅ Valid test file generated: $test_file"
    else
        echo "❌ Invalid test file: $test_file"
        rm "$test_file"
    fi
done
# Generate integration tests
claude -p "Generate integration tests for the entire application" \
--input-files $(find src -name "*.py" | grep -E "(main|app|api)" | tr '\n' ',') \
--output-format python > tests/integration/test_full_app.py
⚡ Performance Optimization¶
1. Parallel Processing¶
#!/bin/bash
# parallel_processing.sh
# Implement parallel processing
process_file() {
    local file="$1"
    local output_dir="$2"
    echo "Processing: $file"
    claude -p "Optimize this code for performance" < "$file" > "$output_dir/$(basename "$file")"
}
# Execute in parallel (requires GNU parallel)
export -f process_file
mkdir -p optimized
find src -name "*.py" | parallel -j 4 process_file {} optimized/
echo "✅ Parallel processing completed"
2. Batch Size Optimization¶
#!/bin/bash
# batch_optimization.sh
# Split files into batches
find src -name "*.py" | split -l 10 - batch_
# Process each batch
mkdir -p results
for batch in batch_*; do
    echo "Processing batch: $batch"
    # Combine files in batch
    cat $(cat "$batch") > combined_input.txt
    # Batch processing
    claude -p "Analyze these Python files and provide optimization suggestions" < combined_input.txt > "results/$(basename "$batch").txt"
    # Clean up temporary files
    rm combined_input.txt
done
# Clean up batch files
rm batch_*
🔍 Error Handling¶
1. Robust Error Processing¶
#!/bin/bash
# robust_processing.sh
set -euo pipefail  # Exit on errors, unset variables, and failed pipelines
# Log configuration
LOG_FILE="claude_batch.log"
exec 1> >(tee -a "$LOG_FILE")
exec 2> >(tee -a "$LOG_FILE" >&2)
# Error handling function
handle_error() {
    local line_number="$1"
    echo "❌ Error occurred at line: $line_number"
    echo "📝 Check log file: $LOG_FILE"
    exit 1
}
trap 'handle_error $LINENO' ERR
# Process execution
process_files() {
    local max_retries=3
    local retry_count=0
    while [[ $retry_count -lt $max_retries ]]; do
        # Bound each attempt with coreutils timeout (300 seconds)
        if timeout 300 claude -p "Process these files" < input.txt > output.txt; then
            echo "✅ Processing completed successfully"
            return 0
        else
            retry_count=$((retry_count + 1))  # ((retry_count++)) would return non-zero under set -e
            echo "⚠️ Attempt $retry_count failed, retrying..."
            sleep 5
        fi
    done
    echo "❌ Processing failed after $max_retries attempts"
    return 1
}
# Main processing
main() {
    echo "🚀 Starting batch processing..."
    # Pre-processing
    if [[ ! -f "input.txt" ]]; then
        echo "❌ Input file not found"
        exit 1
    fi
    # Execute processing
    process_files
    # Post-processing
    if [[ -f "output.txt" && -s "output.txt" ]]; then
        echo "✅ Batch processing completed successfully"
    else
        echo "❌ Output file is empty or missing"
        exit 1
    fi
}
main "$@"
📈 Performance Measurement and Monitoring¶
1. Performance Monitoring¶
#!/bin/bash
# performance_monitoring.sh
# Processing start time
start_time=$(date +%s)
# Resource usage monitoring
monitor_resources() {
    while true; do
        echo "$(date): CPU: $(top -bn1 | awk '/Cpu\(s\)/ {print $2}')%, Memory: $(free -h | awk '/Mem/ {print $3}')" >> resource_usage.log
        sleep 10
    done
}
# Start monitoring in background
monitor_resources &
monitor_pid=$!
# Main processing
cat $(find src -name "*.py") | claude -p "Large scale code analysis" > analysis_results.txt
# Stop monitoring
kill $monitor_pid
# Calculate processing time
end_time=$(date +%s)
duration=$((end_time - start_time))
echo "📊 Processing completed in ${duration} seconds"
echo "📈 Performance metrics saved to resource_usage.log"
2. Result Quality Assessment¶
#!/bin/bash
# quality_assessment.sh
# Evaluate output quality
evaluate_output() {
    local output_file="$1"
    # File size check
    if [[ ! -s "$output_file" ]]; then
        echo "❌ Output file is empty"
        return 1
    fi
    # Content verification
    if grep -q "error\|failed\|exception" "$output_file"; then
        echo "⚠️ Output contains error indicators"
        return 1
    fi
    # Structured data verification
    if [[ "$output_file" == *.json ]]; then
        if ! jq empty "$output_file" 2>/dev/null; then
            echo "❌ Invalid JSON format"
            return 1
        fi
    fi
    echo "✅ Output quality check passed"
    return 0
}
# Evaluate all output files
for output in results/*.txt results/*.json; do
    echo "Evaluating: $output"
    evaluate_output "$output"
done
Best Practices¶
1. Performance Considerations¶
- Use parallel processing for independent operations
- Optimize batch sizes based on available resources
- Implement timeouts to prevent hanging processes
- Monitor resource usage during execution
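A minimal sketch combining these points, assuming GNU parallel and coreutils timeout are available (the 120-second limit, prompt, and directory names are illustrative):
#!/bin/bash
# Size the worker pool from the CPU count and bound each call with a timeout
jobs=$(nproc)
analyze() {
    timeout 120 claude -p "Summarize potential issues in this file" < "$1" > "reports/$(basename "$1").txt"
}
export -f analyze
mkdir -p reports
find src -name "*.py" | parallel -j "$jobs" analyze {}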
2. Error Recovery¶
- Implement retry mechanisms for transient failures
- Create comprehensive logging for debugging
- Validate outputs before proceeding to next steps
- Backup critical data before processing
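A minimal sketch of these recovery steps, assuming plain-text output and a simple non-empty check as the validation criterion (prompt, retry count, and log name are illustrative):
# Back up the input, retry transient failures, and validate output before overwriting
process_with_recovery() {
    local file="$1" attempt
    cp "$file" "$file.bak"                       # backup before processing
    for attempt in 1 2 3; do
        if claude -p "Refactor this code for readability" < "$file" > "$file.new" \
            && [[ -s "$file.new" ]]; then        # validate before replacing the original
            mv "$file.new" "$file"
            return 0
        fi
        echo "$(date): attempt $attempt failed for $file" >> recovery.log
        sleep 5
    done
    rm -f "$file.new"
    return 1
}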
3. Security¶
- Sanitize inputs before processing
- Validate file permissions and access rights
- Use secure temporary directories
- Audit command execution in production environments
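A minimal sketch of these precautions (the 1 MB size cap, directory names, and checks are illustrative assumptions, not Claude Code requirements):
#!/bin/bash
# Work in a private temporary directory and skip files we should not be reading
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p audit
for file in src/*.py; do
    [[ -r "$file" && ! -L "$file" ]] || continue      # skip unreadable files and symlinks
    if [[ $(stat -c %s "$file") -gt 1000000 ]]; then  # illustrative 1 MB cap
        echo "Skipping oversized file: $file" >&2
        continue
    fi
    tr -d '\000' < "$file" > "$workdir/input.txt"     # strip NUL bytes before processing
    claude -p "Review this code for security vulnerabilities" < "$workdir/input.txt" > "audit/$(basename "$file").txt"
done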