---
title: 'Output Compression'
description: 'Reduce token consumption by compressing MCP tool outputs'
---

# Output Compression
MCPHub provides an AI-powered compression mechanism to reduce token consumption from MCP tool outputs. This feature is particularly useful when dealing with large outputs that can significantly impact system efficiency and scalability.

## Overview

The compression feature uses a lightweight AI model (by default, `gpt-4o-mini`) to intelligently compress MCP tool outputs while preserving all essential information. This can help:

- **Reduce token overhead** by compressing verbose tool information
- **Lower operational costs** associated with token consumption
- **Improve performance** for downstream processing
- **Use resources more efficiently** in resource-constrained environments

## Configuration

Add the compression configuration to your `systemConfig` section in `mcp_settings.json`:

```json
{
  "systemConfig": {
    "compression": {
      "enabled": true,
      "model": "gpt-4o-mini",
      "maxInputTokens": 100000,
      "targetReductionRatio": 0.5
    }
  }
}
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enabled` | boolean | `false` | Enable or disable output compression |
| `model` | string | `"gpt-4o-mini"` | AI model to use for compression |
| `maxInputTokens` | number | `100000` | Maximum input tokens for compression |
| `targetReductionRatio` | number | `0.5` | Target size reduction ratio (0.0–1.0) |

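As a quick sanity check on the options above, the configuration can be modeled as a small TypeScript type. This is an illustrative sketch, not MCPHub's actual internal type; the field names come from `mcp_settings.json` and the defaults from the table.

```typescript
// Hypothetical shape of the compression settings documented above.
// Field names match mcp_settings.json; defaults match the options table.
interface CompressionConfig {
  enabled: boolean;              // default: false
  model: string;                 // default: "gpt-4o-mini"
  maxInputTokens: number;        // default: 100000
  targetReductionRatio: number;  // 0.0–1.0, default: 0.5
}

const defaults: CompressionConfig = {
  enabled: false,
  model: "gpt-4o-mini",
  maxInputTokens: 100000,
  targetReductionRatio: 0.5,
};
```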
## Requirements

Output compression requires:

1. An OpenAI API key configured in the smart routing settings
2. The compression feature explicitly enabled in `systemConfig`

### Setting up OpenAI API Key

Configure your OpenAI API key using an environment variable or the system configuration:

**Environment Variable:**

```bash
export OPENAI_API_KEY=your-api-key
```

**Or in systemConfig:**

```json
{
  "systemConfig": {
    "smartRouting": {
      "openaiApiKey": "your-api-key",
      "openaiApiBaseUrl": "https://api.openai.com/v1"
    }
  }
}
```

## How It Works

1. **Content Size Check**: When a tool call completes, the compression service checks whether the output is large enough to benefit from compression (the threshold is 10% of `maxInputTokens` or 1000 tokens, whichever is smaller).

2. **AI Compression**: If the content exceeds the threshold, it's sent to the configured AI model with instructions to compress while preserving essential information.

3. **Size Validation**: The compressed result is compared with the original; if compression didn't reduce the size, the original content is used.

4. **Error Handling**: If compression fails for any reason, the original content is returned unchanged.

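The four steps above can be sketched in TypeScript. This is a simplified illustration of the flow, not MCPHub's actual implementation; `estimateTokens` and `compress` stand in for the token counter and the AI model call.

```typescript
// Threshold from step 1: 10% of maxInputTokens or 1000 tokens,
// whichever is smaller.
function compressionThreshold(maxInputTokens: number): number {
  return Math.min(maxInputTokens * 0.1, 1000);
}

// Illustrative pipeline: size check → AI compression → size
// validation → error fallback. Names here are hypothetical.
async function maybeCompress(
  content: string,
  estimateTokens: (s: string) => number,
  compress: (s: string) => Promise<string>,
  maxInputTokens = 100000,
): Promise<string> {
  // 1. Content size check: small outputs are returned untouched.
  if (estimateTokens(content) < compressionThreshold(maxInputTokens)) {
    return content;
  }
  try {
    // 2. AI compression via the configured model.
    const compressed = await compress(content);
    // 3. Size validation: keep the original if compression didn't help.
    return compressed.length < content.length ? compressed : content;
  } catch {
    // 4. Error handling: any failure falls back to the original.
    return content;
  }
}
```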
## Fallback Mechanism

The compression feature degrades gracefully in several scenarios:

- **Compression disabled**: Original content is returned
- **No API key**: Original content is returned with a warning
- **Small content**: Content below the threshold is not compressed
- **API errors**: Original content is returned on any API failure
- **Error responses**: Tool error responses are never compressed
- **Non-text content**: Images and other media types are preserved as-is

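The non-size-related rules above amount to a guard that runs before any compression is attempted. The sketch below is illustrative only, with simplified types; it is not MCPHub's actual API.

```typescript
// Simplified tool-result types for illustration.
type ToolContent =
  | { type: "text"; text: string }
  | { type: "image"; data: string };
type ToolResult = { isError?: boolean; content: ToolContent[] };

// A result is a compression candidate only if the feature is enabled,
// an API key is present, it is not an error response, and it contains
// at least one text item (images and media pass through as-is).
function isCompressible(
  result: ToolResult,
  enabled: boolean,
  apiKey?: string,
): boolean {
  if (!enabled || !apiKey) return false;
  if (result.isError) return false;
  return result.content.some((c) => c.type === "text");
}
```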
## Best Practices

1. **Start with defaults**: The default configuration provides a good balance between compression and quality.

2. **Monitor results**: Review compressed outputs to ensure important information isn't lost.

3. **Adjust the ratio**: If you have consistently large outputs, consider lowering `targetReductionRatio` for more aggressive compression.

4. **Use efficient models**: The default `gpt-4o-mini` provides a good balance of cost and quality; switch to `gpt-4o` if you need higher-quality compression.

## Limitations

- Compression adds latency due to the AI API call
- API costs apply for each compression operation
- Very short outputs (below the threshold) won't be compressed
- Binary/non-text content is not compressed