Building a Resumable Upload Flow with SharpUploader

Resumable uploads improve user experience by allowing large or interrupted file transfers to continue from where they left off. SharpUploader is a fictional uploader library (standing in for any third-party chunked-upload library) focused on reliability and performance. This article shows a complete, practical approach to implementing a resumable upload flow with SharpUploader in a web application, using a front-end browser client and a simple back-end API. Examples use JavaScript/TypeScript and Node.js, but the patterns translate to other stacks.

Why resumable uploads matter

  • Reliability: Network interruptions or client crashes won’t force users to restart large uploads.
  • Bandwidth efficiency: Only missing chunks are retried, saving time and data.
  • User experience: Progress persists and uploads complete even after transient failures.

Overview of the approach

  1. Split files into fixed-size chunks (e.g., 5–10 MB).
  2. For each chunk, compute a checksum (e.g., SHA-256) to detect corruption and avoid duplicate uploads.
  3. Maintain an upload session on the server that tracks received chunk indices.
  4. Use SharpUploader to handle chunked transmission, pause/resume, retries with backoff, and parallel chunk uploads.
  5. On resume, query the server for already-received chunks and upload only the missing ones.
  6. After all chunks are uploaded, request the server to assemble them into the final file.

Client-side: chunking and upload state

Chunking logic (browser)

  • Choose chunk size: 5 MB is a good default; use 1–10 MB depending on latency and memory.
  • Derive chunk count: Math.ceil(file.size / chunkSize).
  • For each chunk: file.slice(start, end) to create a Blob.

Example: chunk generator (TypeScript)

```ts
function* generateChunks(file: File, chunkSize = 5 * 1024 * 1024) {
  let offset = 0;
  let index = 0;
  while (offset < file.size) {
    const end = Math.min(offset + chunkSize, file.size);
    yield { index, blob: file.slice(offset, end), start: offset, end };
    offset = end;
    index++;
  }
}
```

Checksums

  • Compute SHA-256 per chunk to validate integrity and identify duplicates.
  • Use Web Crypto API in the browser:

```ts
async function sha256(blob: Blob) {
  const arrayBuffer = await blob.arrayBuffer();
  const hashBuffer = await crypto.subtle.digest('SHA-256', arrayBuffer);
  return Array.from(new Uint8Array(hashBuffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}
```

Server-side: session tracking and chunk storage

API endpoints

  • POST /uploads/init — create an upload session; returns uploadId, chunkSize, expectedChunks.
  • GET /uploads/:uploadId/status — returns list/bitmap of received chunk indices.
  • PUT /uploads/:uploadId/chunks/:index — upload a chunk (body = chunk bytes + headers: checksum).
  • POST /uploads/:uploadId/complete — assemble chunks, verify overall checksum, finalize.

Session data model (example)

  • uploadId: string
  • fileName, fileSize, chunkSize, totalChunks
  • received: bitset or set of indices
  • createdAt, expiresAt
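
A minimal in-memory version of this model might look like the following sketch. The store and helper names (`createSession`, `markReceived`, `missingChunks`) are illustrative, not part of SharpUploader; a real service would back the map with Redis or a database so sessions survive restarts.

```ts
// Hypothetical in-memory session store for the data model above.
interface UploadSession {
  uploadId: string;
  fileName: string;
  fileSize: number;
  chunkSize: number;
  totalChunks: number;
  received: Set<number>; // indices of validated chunks
  createdAt: number;
  expiresAt: number;
}

const sessions = new Map<string, UploadSession>();

function createSession(uploadId: string, fileName: string, fileSize: number,
                       chunkSize: number, ttlMs = 24 * 60 * 60 * 1000): UploadSession {
  const session: UploadSession = {
    uploadId,
    fileName,
    fileSize,
    chunkSize,
    totalChunks: Math.ceil(fileSize / chunkSize),
    received: new Set(),
    createdAt: Date.now(),
    expiresAt: Date.now() + ttlMs,
  };
  sessions.set(uploadId, session);
  return session;
}

function markReceived(session: UploadSession, index: number): void {
  session.received.add(index);
}

// What GET /uploads/:uploadId/status would derive its answer from.
function missingChunks(session: UploadSession): number[] {
  const missing: number[] = [];
  for (let i = 0; i < session.totalChunks; i++) {
    if (!session.received.has(i)) missing.push(i);
  }
  return missing;
}
```

The status endpoint can return either the received set or the missing list; the client only needs one of them to rebuild its queue.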

Storing chunks

  • Store chunks in temporary object storage (e.g., S3 multipart parts, or filesystem temp folder) keyed by uploadId + index.
  • Validate checksum on each received chunk; mark chunk as received only after validation.
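
Server-side, the per-chunk validation step can be a one-liner over Node's `crypto` module. The helper name is hypothetical; the point is that a chunk is only marked received after the recomputed digest matches the client's header value.

```ts
import { createHash } from "node:crypto";

// Accept a chunk only if its SHA-256 matches the checksum the client
// sent (e.g. in an X-Chunk-Checksum header).
function validateChunk(chunkBytes: Buffer, claimedSha256Hex: string): boolean {
  const actual = createHash("sha256").update(chunkBytes).digest("hex");
  return actual === claimedSha256Hex.toLowerCase();
}
```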

Using SharpUploader: client integration

The examples below assume SharpUploader exposes a high-level API for resumable chunked uploads, with hooks for chunk creation, checksum computation, and status checks.

Initialization flow

  1. Client calls POST /uploads/init with file metadata; server returns uploadId and chunkSize.
  2. SharpUploader creates chunk queue using either server chunkSize or a default.

Basic pseudo-usage

```ts
const uploader = new SharpUploader(file, {
  chunkSize: serverChunkSize,
  parallel: 3,
  computeChecksum: async (chunk) => await sha256(chunk.blob),
  onProgress: (progress) => { /* update UI */ },
  onError: (err) => { /* show retry UI */ },
});

await uploader.init(uploadId); // optionally inform uploader of server session
```

Resume logic

  • On start/resume, call GET /uploads/:uploadId/status to get received chunk indices.
  • Feed missing indices to SharpUploader so it only enqueues those chunks:

```ts
const status = await fetch(`/uploads/${uploadId}/status`).then(r => r.json());
const missing = allIndices.filter(i => !status.received.includes(i));
uploader.enqueueChunks(missing);
uploader.start();
```

Automatic retries and backoff

  • Configure SharpUploader to retry chunk uploads with exponential backoff (e.g., max 5 attempts).
  • For idempotency, include uploadId and chunk index in the PUT endpoint and check checksum server-side to ignore duplicate uploads.
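
If your uploader library does not provide retries out of the box, a generic exponential-backoff wrapper is easy to write. This is a sketch of the pattern, not a SharpUploader API: pass the chunk `PUT` as `fn`, and consider adding random jitter to the delay to avoid retry storms.

```ts
// Retry fn up to maxAttempts times; the delay doubles after each
// failure: base, 2*base, 4*base, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break; // no sleep after the final attempt
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Because the `PUT` endpoint is keyed by uploadId and chunk index, a retried request that actually succeeded the first time is harmlessly deduplicated server-side.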

Server: assembling final file

  • On POST /uploads/:uploadId/complete:
    • Verify all chunks received.
    • Option A: Stream-append chunks into final file (filesystem) — efficient memory usage.
    • Option B: Use object storage multipart-complete APIs to instruct S3 to assemble parts.
    • Compute final file checksum and compare with client-provided overall checksum (optional).
    • Move final file to permanent storage and delete temporary chunks.
    • Mark session complete and return final file URL or metadata.

Handling edge cases

  • Partial session cleanup: expire sessions after a configurable TTL (e.g., 24–72 hours); use background cleanup job.
  • Concurrent clients: only allow one active assembly operation; multiple clients can upload chunks but the server must enforce locks on assembly.
  • Chunk corruption: reject mismatched checksum, allow client to re-upload chunk.
  • Authentication & authorization: tie upload sessions to user accounts or use signed upload tokens to prevent unauthorized access.
  • Large number of chunks: store received bitmap efficiently (bitset or compressed list) and paginate status responses.
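
For the last point, a plain bitset keeps the received map compact: one bit per chunk, so even a million-chunk upload needs only about 128 KB. A minimal illustrative implementation:

```ts
// One bit per chunk index; far smaller than an array or Set of indices.
class ChunkBitset {
  private bits: Uint8Array;

  constructor(public readonly size: number) {
    this.bits = new Uint8Array(Math.ceil(size / 8));
  }

  set(index: number): void {
    this.bits[index >> 3] |= 1 << (index & 7);
  }

  has(index: number): boolean {
    return (this.bits[index >> 3] & (1 << (index & 7))) !== 0;
  }

  count(): number {
    let n = 0;
    for (let i = 0; i < this.size; i++) if (this.has(i)) n++;
    return n;
  }
}
```

The raw `Uint8Array` also serializes cheaply (e.g. base64) if the status endpoint returns a bitmap instead of an index list.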

UI/UX considerations

  • Show per-chunk and overall progress.
  • Allow pause/resume buttons; persist uploadId and progress in localStorage to survive browser restarts.
  • Provide clear retry/error messages with estimated retry times.
  • Optionally support background uploads using Service Worker + Background Sync for mobile reliability.
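
Persisting the uploadId across browser restarts might look like the following sketch. The fingerprint scheme (name + size + lastModified) is a heuristic assumption, not a content hash, and the code is written against a minimal Storage-like interface so it also runs outside the browser; in a page you would pass `localStorage` directly.

```ts
// Minimal Storage-like interface (localStorage satisfies it).
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface PersistedUpload {
  uploadId: string;
  receivedIndices: number[];
}

// Cheap file fingerprint so the same file can be matched after a
// restart. Heuristic only: two files could share name/size/mtime.
function fingerprint(name: string, size: number, lastModified: number): string {
  return `upload:${name}:${size}:${lastModified}`;
}

function saveProgress(store: KVStore, key: string, state: PersistedUpload): void {
  store.setItem(key, JSON.stringify(state));
}

function loadProgress(store: KVStore, key: string): PersistedUpload | null {
  const raw = store.getItem(key);
  return raw ? (JSON.parse(raw) as PersistedUpload) : null;
}
```

On resume, the persisted uploadId feeds the status call described earlier; the server's received list remains the source of truth, so stale local indices are harmless.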

Example end-to-end sequence (summary)

  1. Client POST /uploads/init -> server returns uploadId, chunkSize, totalChunks.
  2. Client computes per-chunk checksums and GET /uploads/:uploadId/status to fetch received chunks.
  3. Use SharpUploader to upload missing chunks in parallel, with retries and checksums.
  4. After upload finishes, POST /uploads/:uploadId/complete to assemble file.
  5. Server verifies, assembles, stores final file, returns URL.

Performance tips

  • Tune parallel uploads: 3–6 parallel chunk uploads balance throughput and network contention.
  • Adjust chunk size: larger reduces overhead but increases retry cost; 5–10 MB is typical.
  • Use HTTP/2 where available to reduce connection overhead.
  • Offload checksum verification to a streaming hash process if possible (avoid loading full chunk into memory twice).
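
The last tip can be illustrated with Node's incremental hash API: feed each chunk into one running digest as it arrives or as it is assembled, rather than re-reading the finished file. Shown here over in-memory buffers for brevity.

```ts
import { createHash } from "node:crypto";

// One running SHA-256 over all chunks in order; equivalent to hashing
// the assembled file, without buffering it twice.
function hashChunksStreaming(chunks: Buffer[]): string {
  const hash = createHash("sha256");
  for (const chunk of chunks) hash.update(chunk); // incremental update
  return hash.digest("hex");
}
```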

Security recommendations

  • Require authenticated requests or one-time signed upload tokens.
  • Validate file metadata server-side (size, type limits).
  • Scan final files for malware via antivirus or sandboxing if files are user-uploaded.
  • Rate-limit initiation endpoints to prevent resource abuse.
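
A fixed-window limiter is enough to sketch the last point. This is illustrative per-process state; production systems would usually rely on a shared store (e.g. Redis) or existing middleware instead.

```ts
// Allow at most `limit` calls per client per window (e.g. for
// POST /uploads/init).
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(clientId: string, now = Date.now()): boolean {
    const entry = this.counts.get(clientId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(clientId, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count++;
      return true;
    }
    return false;
  }
}
```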

Conclusion

A robust resumable upload flow with SharpUploader combines client-side chunking and checksum verification, server-side session tracking and chunk validation, and clear resume logic that queries the server for received chunks. With proper session lifecycle management, retries, and user-friendly UI, resumable uploads become reliable and efficient for large files and unreliable networks.
