| Constraint | Naive Behavior | Failure Threshold |
| :--- | :--- | :--- |
| Memory | Stores entire ZIP in RAM | Typically 128 MB - 2 GB |
| Execution Timeout | Blocks until complete | 30-300 seconds (web servers) |
| Disk Space | Uses temp files | /tmp fills up |
| Central Directory | Must be written after all file data | Requires seekable storage |
The central directory is the key: a ZIP file’s table of contents is at the end of the file. Most libraries cannot stream the archive without first knowing every file's size and CRC, because each entry's local header records both ahead of its data.

4.1 Level 1: Streamed Passthrough (No Compression – "Store" Method)

Best for: Already compressed files (JPEG, MP4, PDFs).
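As a rough illustration of the streamed store approach, the sketch below uses Python's standard `zipfile` module to write a stored (uncompressed) archive to a sink that cannot seek. `ChunkSink` and `stream_stored_zip` are hypothetical names introduced here, and the sink only stands in for an HTTP response body. When the target is not seekable, `zipfile` sets the data-descriptor flag on each entry and still writes the central directory last.

```python
import io
import zipfile


class ChunkSink:
    """Write-only, non-seekable sink standing in for an HTTP response body."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass


def stream_stored_zip(sink, members):
    """Write (name, bytes) pairs to sink using the store method (no compression)."""
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, payload in members:
            zf.writestr(name, payload)


sink = ChunkSink()
stream_stored_zip(sink, [("a.txt", b"hello"), ("b.bin", b"\x00" * 1024)])
blob = b"".join(sink.chunks)

# The archive starts with a local file header and ends with the
# end-of-central-directory record (22 bytes when there is no comment).
print(blob[:4], blob[-22:-18])
```

In this toy version each payload is already in memory; for real files, `zf.write(path, arcname)` or `zf.open(name, "w")` with chunked writes would keep the per-file buffer small.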
```python
import os
import zipfile

import boto3
from celery import shared_task

s3 = boto3.client("s3")  # assumes boto3 credentials are configured


@shared_task(bind=True)
def generate_large_zip(self, file_paths, job_id):
    temp_zip = f"/tmp/{job_id}.zip"
    with zipfile.ZipFile(temp_zip, 'w', zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for path in file_paths:
            zf.write(path, os.path.basename(path))
    # Upload to S3
    s3.upload_file(temp_zip, "my-bucket", f"zips/{job_id}.zip")
    return f"https://my-bucket.s3.amazonaws.com/zips/{job_id}.zip"
```

| Approach | Max ZIP size (practical) | Memory usage | HTTP timeout risk | Client experience |
| :--- | :--- | :--- | :--- | :--- |
| Naive (buffer) | < 200 MB | O(Size) | High | Immediate fail |
| Streamed store | Unlimited* | < 20 MB | Medium (long download) | Progress bar works |
| Chunked deflate | Unlimited* | < 100 MB | Medium | Same as above |
| Async job | Unlimited (TB) | < 500 MB (worker) | None | Polling required |
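The "Polling required" entry for the async job can be sketched from the client side as follows. This is only an illustration: `wait_for_zip`, the `fetch_status` callable, and the `state`/`url` keys are assumptions made for the example, not part of any particular job API.

```python
import time


def wait_for_zip(fetch_status, poll_interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll a job-status callable until the ZIP is ready, failed, or timed out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()  # hypothetical: returns {"state": ..., "url": ...}
        if status["state"] == "SUCCESS":
            return status["url"]
        if status["state"] == "FAILURE":
            raise RuntimeError("ZIP generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("ZIP generation did not finish in time")
        sleep(poll_interval)


# Demo with canned responses instead of a real status endpoint.
_responses = iter([
    {"state": "PENDING"},
    {"state": "PENDING"},
    {"state": "SUCCESS", "url": "https://my-bucket.s3.amazonaws.com/zips/demo.zip"},
])
demo_url = wait_for_zip(lambda: next(_responses), sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.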
```javascript
res.attachment('download.zip');
archive.pipe(res); // Direct HTTP response stream
```
(only per-file read buffer).

Limitation: Output size ≈ sum of input sizes. Still fails if Content-Length cannot be precomputed.

4.2 Level 2: Chunked Deflate with CRC Precomputation

Best for: Text files, logs, or data that needs compression but cannot fit in memory.
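The core of the chunked deflate idea can be sketched with Python's `zlib` module: feed the input through a raw-deflate compressor one chunk at a time and track the CRC-32 and sizes that a ZIP entry needs. `deflate_chunks` and the `stats` dictionary are hypothetical names for this example. The sketch covers only the compression step, not the ZIP headers, and it computes the CRC in the same pass rather than in a separate precomputation pass.

```python
import zlib


def deflate_chunks(chunks, stats, level=6):
    """Yield raw-deflate output for an iterable of byte chunks.

    stats is updated in place with the CRC-32, uncompressed size, and
    compressed size that a ZIP entry would record.
    """
    # wbits=-15 selects raw deflate (no zlib header), the form ZIP stores.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    for chunk in chunks:
        stats["crc"] = zlib.crc32(chunk, stats["crc"])
        stats["size"] += len(chunk)
        out = compressor.compress(chunk)
        if out:
            stats["compressed_size"] += len(out)
            yield out
    tail = compressor.flush()
    stats["compressed_size"] += len(tail)
    yield tail


source = [b"log line\n" * 1000 for _ in range(50)]
stats = {"crc": 0, "size": 0, "compressed_size": 0}
compressed = b"".join(deflate_chunks(source, stats))
```

Only one input chunk and the compressor's internal window are held in memory at a time, regardless of the total input size.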
Use ZIP’s "store" method (deflation level 0). The CRC and size are known per file before writing.
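One way to obtain the CRC and size ahead of writing is a single streaming pass over each file. The sketch below is a minimal illustration; `crc32_and_size` is a name introduced for this example, and the 64 KiB chunk size is an arbitrary assumption.

```python
import os
import tempfile
import zlib


def crc32_and_size(path, chunk_size=64 * 1024):
    """Return (crc32, size) for a file without loading it into memory."""
    crc = 0
    size = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks so far
            size += len(chunk)
    return crc, size


# Demo on a throwaway temp file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc" * 100_000)
demo_crc, demo_size = crc32_and_size(tmp.name)
os.remove(tmp.name)
```

With the store method the compressed size equals the uncompressed size, so these two values are all a local header needs, at the cost of reading each file twice.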
```javascript
const { createWriteStream } = require('fs');
const archiver = require('archiver'); // Supports streaming

const archive = archiver('zip', {
  zlib: { level: 0 }, // Store, not compress
  forceLocalTime: true
});
```
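```python
import os
import zipfile

from django.http import StreamingHttpResponse
from zipstream import ZipStream  # assumes the zipstream-ng package


def download_archive(request):
    # huge_file_list is assumed to be defined elsewhere, as in the original snippet
    zip_file = ZipStream(compress_type=zipfile.ZIP_DEFLATED)
    for file_path in huge_file_list:
        zip_file.add_path(file_path, arcname=os.path.basename(file_path))

    # Stream to HTTP response (HttpResponse would buffer the whole archive)
    response = StreamingHttpResponse(zip_file, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename="archive.zip"'
    return response
```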
Total Size Of Requested Files Is Too Large For Zip-on-the-fly