Skip to content

Claude Code Batch API Complete Guide: Large-scale Processing and CI/CD Automation Implementation

Batch Processing

What You Can Achieve

  • High-Speed Processing

    Large-scale code analysis and processing without interaction

  • CI/CD Integration

    Pipeline integration for complete automation

  • Batch Operations

    Simultaneous processing and conversion of multiple files

  • Result Analysis

    Structured output for easy post-processing

📖 Basic Concepts of Batch Processing

Claude Code's batch processing (headless mode) allows you to execute AI functionality directly from the command line without using the interactive UI. By integrating it into CI/CD pipelines and automation scripts, you can efficiently execute large-scale processing tasks.

Basic Usage

# Basic batch execution
claude -p "Find all files containing the variable 'userId'"

# Data processing using pipes
cat error.log | claude -p "Summarize the key errors in this log file"

# Save output to file
claude -p "Review this code for security vulnerabilities" < app.py > security_report.txt

🚀 Usage Patterns

1. Code Analysis and Quality Checking

#!/bin/bash
# code_quality_check.sh

echo "🔍 Starting code quality analysis..."

# Security audit
find src -name "*.py" -exec claude -p "Analyze this file for security vulnerabilities and provide a severity rating" {} \; > security_audit.txt

# Code complexity analysis
for file in src/**/*.py; do
    echo "Analyzing: $file"
    claude -p "Analyze the complexity of this code and suggest improvements" < "$file" >> complexity_report.txt
done

# Dependency analysis
claude -p "Analyze package.json for outdated dependencies and security issues" < package.json > dependency_report.txt

echo "✅ Code quality analysis complete"

2. Structured Output

# JSON format output
claude -p "Analyze this API and return the result in JSON format" --output-format json < api.py

# Streaming output
claude -p "Process this large dataset incrementally" --output-format stream-json < large_dataset.csv

3. Large-scale Refactoring

#!/bin/bash
# mass_refactoring.sh

REFACTOR_RULE="Convert class-based components to functional components with hooks"

find src/components -name "*.tsx" | while read file; do
    echo "Processing: $file"

    # Create backup
    cp "$file" "$file.backup"

    # Execute refactoring
    claude -p "$REFACTOR_RULE" < "$file" > "$file.new"

    # Verify results
    if [[ -s "$file.new" ]]; then
        mv "$file.new" "$file"
        echo "✅ Refactored: $file"
    else
        echo "❌ Failed: $file"
        rm "$file.new"
    fi
done

🔧 CI/CD Integration

1. GitHub Actions Implementation

name: Code Analysis Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    types: [opened, synchronize]

jobs:
  claude-analysis:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install Claude Code
        run: |
          curl -sSL https://claude.ai/install.sh | sh
          echo "$HOME/.claude/bin" >> $GITHUB_PATH

      - name: Security Analysis
        run: |
          echo "🔒 Running security analysis..."
          find . -name "*.py" -exec claude -p "Check for security vulnerabilities" {} \; > security_report.txt

      - name: Code Quality Check
        run: |
          echo "📊 Running code quality analysis..."
          claude -p "Analyze overall code quality and provide recommendations" --output-format json > quality_report.json

      - name: Generate Summary
        run: |
          echo "📝 Generating summary report..."
          claude -p "Create a summary report from these analysis results" \
            --input-files security_report.txt,quality_report.json \
            --output-format markdown > analysis_summary.md

      - name: Upload Reports
        uses: actions/upload-artifact@v4
        with:
          name: analysis-reports
          path: |
            security_report.txt
            quality_report.json
            analysis_summary.md

2. GitLab CI Implementation

# .gitlab-ci.yml
stages:
  - analysis
  - report

code_analysis:
  stage: analysis
  image: ubuntu:latest
  before_script:
    - apt-get update && apt-get install -y curl
    - curl -sSL https://claude.ai/install.sh | sh
    - export PATH="$HOME/.claude/bin:$PATH"
  script:
    - echo "Running Claude Code analysis..."
    - claude -p "Analyze this codebase for maintainability issues" --output-format json > maintainability.json
    - claude -p "Check for performance bottlenecks" --output-format json > performance.json
    - claude -p "Review test coverage and suggest improvements" --output-format json > testing.json
  artifacts:
    reports:
      junit: "*.json"
    paths:
      - "*.json"
    expire_in: 1 week

generate_report:
  stage: report
  dependencies:
    - code_analysis
  script:
    - claude -p "Create a comprehensive report from these analysis files" \
        --input-files maintainability.json,performance.json,testing.json \
        --output-format html > final_report.html
  artifacts:
    paths:
      - final_report.html

📊 Advanced Usage Examples

1. Internationalization Automation

#!/bin/bash
# i18n_automation.sh

# Detect files requiring translation
find src -name "*.ts" -o -name "*.tsx" | xargs grep -l "t(" > translatable_files.txt

# Extract translation keys
while read file; do
    echo "Extracting i18n keys from: $file"
    claude -p "Extract all i18n translation keys from this file and output as JSON" < "$file" >> i18n_keys.json
done < translatable_files.txt

# Generate translation files
for lang in en ja ko zh; do
    echo "Generating translations for: $lang"
    claude -p "Translate these i18n keys to $lang language" < i18n_keys.json > "locales/$lang.json"
done

2. Automatic Documentation Generation

#!/bin/bash
# doc_generation.sh

# Generate API documentation
find src/api -name "*.py" | while read file; do
    echo "Generating API docs for: $file"
    claude -p "Generate comprehensive API documentation in OpenAPI format" < "$file" > "docs/api/$(basename "$file" .py).yaml"
done

# Automatically update README
claude -p "Update README.md based on recent code changes and new features" \
    --input-files $(find src -name "*.py" | head -10 | tr '\n' ',') \
    --output-format markdown > README_updated.md

# Generate changelog
git log --oneline -10 | claude -p "Generate a user-friendly changelog from these git commits" > CHANGELOG.md

3. Automatic Test Case Generation

#!/bin/bash
# test_generation.sh

# Generate unit tests
find src -name "*.py" -not -path "*/tests/*" | while read file; do
    test_file="tests/test_$(basename "$file")"
    echo "Generating tests for: $file"

    claude -p "Generate comprehensive unit tests for this Python file using pytest" < "$file" > "$test_file"

    # Validate generated tests
    if python -m py_compile "$test_file"; then
        echo "✅ Valid test file generated: $test_file"
    else
        echo "❌ Invalid test file: $test_file"
        rm "$test_file"
    fi
done

# Generate integration tests
claude -p "Generate integration tests for the entire application" \
    --input-files $(find src -name "*.py" | grep -E "(main|app|api)" | tr '\n' ',') \
    --output-format python > tests/integration/test_full_app.py

⚡ Performance Optimization

1. Parallel Processing

#!/bin/bash
# parallel_processing.sh

# Implement parallel processing
process_file() {
    local file="$1"
    local output_dir="$2"

    echo "Processing: $file"
    claude -p "Optimize this code for performance" < "$file" > "$output_dir/$(basename "$file")"
}

# Execute in parallel
export -f process_file
find src -name "*.py" | parallel -j 4 process_file {} optimized/

echo "✅ Parallel processing completed"

2. Batch Size Optimization

#!/bin/bash
# batch_optimization.sh

# Split files into batches
find src -name "*.py" | split -l 10 - batch_

# Process each batch
for batch in batch_*; do
    echo "Processing batch: $batch"

    # Combine files in batch
    cat $(cat "$batch") > combined_input.txt

    # Batch processing
    claude -p "Analyze these Python files and provide optimization suggestions" < combined_input.txt > "results/$(basename "$batch").txt"

    # Clean up temporary files
    rm combined_input.txt
done

# Clean up batch files
rm batch_*

🔍 Error Handling

1. Robust Error Processing

#!/bin/bash
# robust_processing.sh

set -euo pipefail  # Stop on error

# Log configuration
LOG_FILE="claude_batch.log"
exec 1> >(tee -a "$LOG_FILE")
exec 2> >(tee -a "$LOG_FILE" >&2)

# Error handling function
handle_error() {
    local line_number="$1"
    echo "❌ Error occurred at line: $line_number"
    echo "📝 Check log file: $LOG_FILE"
    exit 1
}

trap 'handle_error $LINENO' ERR

# Process execution
process_files() {
    local max_retries=3
    local retry_count=0

    while [[ $retry_count -lt $max_retries ]]; do
        if claude -p "Process these files" --timeout 300 < input.txt > output.txt; then
            echo "✅ Processing completed successfully"
            return 0
        else
            ((retry_count++))
            echo "⚠️ Attempt $retry_count failed, retrying..."
            sleep 5
        fi
    done

    echo "❌ Processing failed after $max_retries attempts"
    return 1
}

# Main processing
main() {
    echo "🚀 Starting batch processing..."

    # Pre-processing
    if [[ ! -f "input.txt" ]]; then
        echo "❌ Input file not found"
        exit 1
    fi

    # Execute processing
    process_files

    # Post-processing
    if [[ -f "output.txt" && -s "output.txt" ]]; then
        echo "✅ Batch processing completed successfully"
    else
        echo "❌ Output file is empty or missing"
        exit 1
    fi
}

main "$@"

📈 Performance Measurement and Monitoring

1. Performance Monitoring

#!/bin/bash
# performance_monitoring.sh

# Processing start time
start_time=$(date +%s)

# Resource usage monitoring
monitor_resources() {
    while true; do
        echo "$(date): CPU: $(top -l 1 | grep "CPU usage" | cut -d'%' -f1 | cut -d' ' -f3), Memory: $(free -h | grep Mem | awk '{print $3}')" >> resource_usage.log
        sleep 10
    done
}

# Start monitoring in background
monitor_resources &
monitor_pid=$!

# Main processing
claude -p "Large scale code analysis" --input-files $(find src -name "*.py" | tr '\n' ',') > analysis_results.txt

# Stop monitoring
kill $monitor_pid

# Calculate processing time
end_time=$(date +%s)
duration=$((end_time - start_time))

echo "📊 Processing completed in ${duration} seconds"
echo "📈 Performance metrics saved to resource_usage.log"

2. Result Quality Assessment

#!/bin/bash
# quality_assessment.sh

# Evaluate output quality
evaluate_output() {
    local output_file="$1"

    # File size check
    if [[ ! -s "$output_file" ]]; then
        echo "❌ Output file is empty"
        return 1
    fi

    # Content verification
    if grep -q "error\|failed\|exception" "$output_file"; then
        echo "⚠️ Output contains error indicators"
        return 1
    fi

    # Structured data verification
    if [[ "$output_file" == *.json ]]; then
        if ! jq empty "$output_file" 2>/dev/null; then
            echo "❌ Invalid JSON format"
            return 1
        fi
    fi

    echo "✅ Output quality check passed"
    return 0
}

# Evaluate all output files
for output in results/*.txt results/*.json; do
    echo "Evaluating: $output"
    evaluate_output "$output"
done

Best Practices

1. Performance Considerations

  • Use parallel processing for independent operations
  • Optimize batch sizes based on available resources
  • Implement timeouts to prevent hanging processes
  • Monitor resource usage during execution

2. Error Recovery

  • Implement retry mechanisms for transient failures
  • Create comprehensive logging for debugging
  • Validate outputs before proceeding to next steps
  • Backup critical data before processing

3. Security

  • Sanitize inputs before processing
  • Validate file permissions and access rights
  • Use secure temporary directories
  • Audit command execution in production environments