Building a Resumable Upload Flow with SharpUploader
Resumable uploads improve user experience by allowing large or interrupted file transfers to continue from where they left off. SharpUploader is a fictional (or third-party) uploader library focused on reliability and performance. This article shows a complete, practical approach to implementing a resumable upload flow with SharpUploader in a web application, using a front-end browser client and a simple back-end API. Examples use JavaScript/TypeScript and Node.js, but the patterns translate to other stacks.
Why resumable uploads matter
- Reliability: Network interruptions or client crashes won’t force users to restart large uploads.
- Bandwidth efficiency: Only missing chunks are retried, saving time and data.
- User experience: Progress persists and uploads complete even after transient failures.
Overview of the approach
- Split files into fixed-size chunks (e.g., 5–10 MB).
- For each chunk, compute a checksum (e.g., SHA-256) to detect corruption and avoid duplicate uploads.
- Maintain an upload session on the server that tracks received chunk indices.
- Use SharpUploader to handle chunked transmission, pause/resume, retries with backoff, and parallel chunk uploads.
- On resume, query the server for already-received chunks and upload only the missing ones.
- After all chunks are uploaded, request the server to assemble them into the final file.
Client-side: chunking and upload state
Chunking logic (browser)
- Choose chunk size: 5 MB is a good default; use 1–10 MB depending on latency and memory.
- Derive chunk count: Math.ceil(file.size / chunkSize).
- For each chunk: file.slice(start, end) to create a Blob.
Example: chunk generator (TypeScript)
```ts
function* generateChunks(file: File, chunkSize = 5 * 1024 * 1024) {
  let offset = 0;
  let index = 0;
  while (offset < file.size) {
    const end = Math.min(offset + chunkSize, file.size);
    yield { index, blob: file.slice(offset, end), start: offset, end };
    offset = end;
    index++;
  }
}
```
Checksums
- Compute SHA-256 per chunk to validate integrity and identify duplicates.
- Use Web Crypto API in the browser:
```ts
async function sha256(blob: Blob): Promise<string> {
  const arrayBuffer = await blob.arrayBuffer();
  const hashBuffer = await crypto.subtle.digest('SHA-256', arrayBuffer);
  return Array.from(new Uint8Array(hashBuffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}
```
Server-side: session tracking and chunk storage
API endpoints
- POST /uploads/init — create an upload session; returns uploadId, chunkSize, expectedChunks.
- GET /uploads/:uploadId/status — returns list/bitmap of received chunk indices.
- PUT /uploads/:uploadId/chunks/:index — upload a chunk (body = chunk bytes + headers: checksum).
- POST /uploads/:uploadId/complete — assemble chunks, verify overall checksum, finalize.
Session data model (example)
- uploadId: string
- fileName, fileSize, chunkSize, totalChunks
- received: bitset or set of indices
- createdAt, expiresAt
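One way to realize this model is a small in-memory store. The field names below mirror the list above, but the class itself is illustrative (a real deployment would persist sessions in Redis or a database and generate proper UUIDs):

```ts
interface UploadSession {
  uploadId: string;
  fileName: string;
  fileSize: number;
  chunkSize: number;
  totalChunks: number;
  received: Set<number>; // indices of validated chunks
  createdAt: number;
  expiresAt: number;
}

class SessionStore {
  private sessions = new Map<string, UploadSession>();

  init(fileName: string, fileSize: number, chunkSize: number, ttlMs = 24 * 3600 * 1000): UploadSession {
    const session: UploadSession = {
      uploadId: Math.random().toString(36).slice(2), // use a real UUID in production
      fileName,
      fileSize,
      chunkSize,
      totalChunks: Math.ceil(fileSize / chunkSize),
      received: new Set(),
      createdAt: Date.now(),
      expiresAt: Date.now() + ttlMs,
    };
    this.sessions.set(session.uploadId, session);
    return session;
  }

  markReceived(uploadId: string, index: number): void {
    this.sessions.get(uploadId)?.received.add(index);
  }

  isComplete(uploadId: string): boolean {
    const s = this.sessions.get(uploadId);
    return !!s && s.received.size === s.totalChunks;
  }
}
```

The status endpoint can serialize `received` directly, and the complete endpoint can gate assembly on `isComplete`.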
Storing chunks
- Store chunks in temporary object storage (e.g., S3 multipart parts, or filesystem temp folder) keyed by uploadId + index.
- Validate checksum on each received chunk; mark chunk as received only after validation.
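On Node.js, validating a received chunk before marking it might look like this (uses `node:crypto`; the function name is illustrative):

```ts
import { createHash } from "node:crypto";

// Returns true only if the chunk's bytes hash to the checksum the client sent.
// Mark the chunk as received (and persist it) only when this passes.
function validateChunk(bytes: Uint8Array, claimedSha256Hex: string): boolean {
  const actual = createHash("sha256").update(bytes).digest("hex");
  return actual === claimedSha256Hex.toLowerCase();
}
```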
Using SharpUploader: client integration
This section assumes SharpUploader exposes a high-level API for resumable chunked uploads, with hooks for chunk creation, checksums, and status checks.
Initialization flow
- Client calls POST /uploads/init with file metadata; server returns uploadId and chunkSize.
- SharpUploader creates chunk queue using either server chunkSize or a default.
Basic pseudo-usage
```ts
const uploader = new SharpUploader(file, {
  chunkSize: serverChunkSize,
  parallel: 3,
  computeChecksum: async (chunk) => await sha256(chunk.blob),
  onProgress: (progress) => { /* update UI */ },
  onError: (err) => { /* show retry UI */ },
});
await uploader.init(uploadId); // optionally inform uploader of server session
```
Resume logic
- On start/resume, call GET /uploads/:uploadId/status to get received chunk indices.
- Feed missing indices to SharpUploader so it only enqueues those chunks:
```ts
const status = await fetch(`/uploads/${uploadId}/status`).then(r => r.json());
const missing = allIndices.filter(i => !status.received.includes(i));
uploader.enqueueChunks(missing);
uploader.start();
```
Automatic retries and backoff
- Configure SharpUploader to retry chunk uploads with exponential backoff (e.g., max 5 attempts).
- For idempotency, include uploadId and chunk index in the PUT endpoint and check checksum server-side to ignore duplicate uploads.
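If SharpUploader's built-in retries were unavailable, a generic backoff wrapper along these lines could sit around each chunk's PUT (delays are shortened in the test; the function is a sketch, not SharpUploader's actual API):

```ts
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff: 500 ms, 1 s, 2 s, 4 s, ... plus a little jitter.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Because the PUT endpoint is keyed by uploadId and chunk index, a retried upload that already succeeded is harmless: the server sees the same checksum and can simply acknowledge it.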
Server: assembling final file
- On POST /uploads/:uploadId/complete:
- Verify all chunks received.
- Option A: Stream-append chunks into final file (filesystem) — efficient memory usage.
- Option B: Use object storage multipart-complete APIs to instruct S3 to assemble parts.
- Compute final file checksum and compare with client-provided overall checksum (optional).
- Move final file to permanent storage and delete temporary chunks.
- Mark session complete and return final file URL or metadata.
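For Option A on a filesystem, the assembly step can append chunk files in index order. The sketch below is simplified and synchronous (paths and the `.part` naming are illustrative); for very large files, piping `createReadStream` into a single `createWriteStream` keeps memory usage to one chunk at a time:

```ts
import * as fs from "node:fs";
import * as path from "node:path";

// Append chunk files 0..totalChunks-1 into finalPath, then remove the temp chunks.
function assembleChunks(chunkDir: string, totalChunks: number, finalPath: string): void {
  fs.writeFileSync(finalPath, ""); // start with an empty file
  for (let index = 0; index < totalChunks; index++) {
    const chunkPath = path.join(chunkDir, `${index}.part`);
    fs.appendFileSync(finalPath, fs.readFileSync(chunkPath));
  }
  for (let index = 0; index < totalChunks; index++) {
    fs.unlinkSync(path.join(chunkDir, `${index}.part`));
  }
}
```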
Handling edge cases
- Partial session cleanup: expire sessions after a configurable TTL (e.g., 24–72 hours); use background cleanup job.
- Concurrent clients: only allow one active assembly operation; multiple clients can upload chunks but the server must enforce locks on assembly.
- Chunk corruption: reject mismatched checksum, allow client to re-upload chunk.
- Authentication & authorization: tie upload sessions to user accounts or use signed upload tokens to prevent unauthorized access.
- Large number of chunks: store received bitmap efficiently (bitset or compressed list) and paginate status responses.
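For sessions with many chunks, a compact bitset keeps the received-state small: one bit per chunk, so a million chunks fit in about 122 KB. A minimal sketch:

```ts
class ChunkBitset {
  private bits: Uint8Array;

  constructor(public readonly totalChunks: number) {
    this.bits = new Uint8Array(Math.ceil(totalChunks / 8));
  }

  set(index: number): void {
    this.bits[index >> 3] |= 1 << (index & 7);
  }

  has(index: number): boolean {
    return (this.bits[index >> 3] & (1 << (index & 7))) !== 0;
  }

  missing(): number[] {
    const out: number[] = [];
    for (let i = 0; i < this.totalChunks; i++) {
      if (!this.has(i)) out.push(i);
    }
    return out;
  }
}
```

The status endpoint can return the raw bytes (base64-encoded) instead of an index list, which also sidesteps pagination for most realistic chunk counts.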
UI/UX considerations
- Show per-chunk and overall progress.
- Allow pause/resume buttons; persist uploadId and progress in localStorage to survive browser restarts.
- Provide clear retry/error messages with estimated retry times.
- Optionally support background uploads using Service Worker + Background Sync for mobile reliability.
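Persisting progress across restarts only requires the uploadId and a hint of which chunks were sent. The sketch below takes any localStorage-like object so it also runs outside the browser; the key scheme is illustrative, and the server's status endpoint remains the source of truth on resume:

```ts
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface SavedUpload {
  uploadId: string;
  fileName: string;
  receivedIndices: number[];
}

function saveUploadState(storage: KeyValueStorage, state: SavedUpload): void {
  storage.setItem(`upload:${state.fileName}`, JSON.stringify(state));
}

function loadUploadState(storage: KeyValueStorage, fileName: string): SavedUpload | null {
  const raw = storage.getItem(`upload:${fileName}`);
  return raw ? (JSON.parse(raw) as SavedUpload) : null;
}
```

In the browser, pass `window.localStorage`; after a restart, the restored uploadId is used to query /uploads/:uploadId/status before re-enqueueing chunks.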
Example end-to-end sequence (summary)
- Client POST /uploads/init -> server returns uploadId, chunkSize, totalChunks.
- Client computes per-chunk checksums and GET /uploads/:uploadId/status to fetch received chunks.
- Use SharpUploader to upload missing chunks in parallel, with retries and checksums.
- After upload finishes, POST /uploads/:uploadId/complete to assemble file.
- Server verifies, assembles, stores final file, returns URL.
Performance tips
- Tune parallel uploads: 3–6 parallel chunk uploads balances throughput and network contention.
- Adjust chunk size: larger reduces overhead but increases retry cost; 5–10 MB is typical.
- Use HTTP/2 where available to reduce connection overhead.
- Offload checksum verification to a streaming hash process if possible (avoid loading full chunk into memory twice).
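On Node.js, `createHash` accepts incremental updates, so the server can fold each network buffer into the digest as it arrives instead of buffering the whole chunk first:

```ts
import { createHash } from "node:crypto";

// Feed buffers into the hash as they arrive; the hex digest is available once
// the stream ends, without ever holding the full payload in memory.
function streamingSha256(chunks: Iterable<Uint8Array>): string {
  const hash = createHash("sha256");
  for (const chunk of chunks) hash.update(chunk);
  return hash.digest("hex");
}
```

The same idea applies in the browser via incremental hashing libraries, since `crypto.subtle.digest` itself requires the full buffer.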
Security recommendations
- Require authenticated requests or one-time signed upload tokens.
- Validate file metadata server-side (size, type limits).
- Scan final files for malware via antivirus or sandboxing if files are user-uploaded.
- Rate-limit initiation endpoints to prevent resource abuse.
Conclusion
A robust resumable upload flow with SharpUploader combines client-side chunking and checksum verification, server-side session tracking and chunk validation, and clear resume logic that queries the server for received chunks. With proper session lifecycle management, retries, and user-friendly UI, resumable uploads become reliable and efficient for large files and unreliable networks.