In 2018, a fintech startup whose backend we later helped rewrite had an internal API where every endpoint returned HTTP 200, even on failure. Errors lived in a success: false JSON field. Three years and four client libraries later, every consumer had its own custom retry logic, half of them wrong, and one critical webhook handler silently dropped 12% of transactions for six months before anyone noticed.
That’s what bad API design costs. Not elegance points. Revenue.
Good API design isn’t about following a religion. It’s about making deliberate choices your team can live with three years from now, when you have five consumer applications, two mobile apps, and a partner integration that can’t be broken on a whim. This guide covers what actually matters: the patterns that hold up under load, the trade-offs worth making, and the mistakes we see most often in custom software development projects that inherit legacy APIs.
REST vs GraphQL: Stop Treating This as a Religion
The honest answer: use REST unless you have a specific reason not to.
REST is well-understood, cacheable at the HTTP layer, debuggable with curl, and supported by every tool and proxy you’ll touch. GraphQL solves real problems (over-fetching on mobile, coordinating dozens of micro-frontends, giving consumers query flexibility), but it introduces real costs: N+1 problems, harder caching, authorization complexity at the field level, and a learning curve for every new engineer.
Use REST when:
- You have a stable set of resources and operations
- Your consumers are mostly server-to-server or predictable clients
- You need HTTP caching, CDN integration, or aggressive rate limiting
- You want debuggability from the browser’s Network tab
Use GraphQL when:
- Mobile clients need to minimize payload size and round trips
- You have many front-end teams consuming the same backend
- Your data model is deeply nested and clients need arbitrary projections
- You’re building a public developer platform where query flexibility is a feature
We’ve built both. For most B2B products and internal platforms we build for clients, REST wins on operational simplicity. For consumer apps with aggressive mobile performance targets, GraphQL starts to pay for itself.
There’s no universally correct answer. There is a correct answer for your team, your consumers, and your operational maturity. Pick deliberately.
Resource Naming: Plural Nouns, No Verbs, Shallow Paths
Resource URLs are the shape of your API. If they’re inconsistent, every consumer pays for it forever.
The rules that hold up:
GET /v1/invoices # List
POST /v1/invoices # Create
GET /v1/invoices/inv_7A2k9 # Retrieve
PATCH /v1/invoices/inv_7A2k9 # Update
DELETE /v1/invoices/inv_7A2k9 # Delete
POST /v1/invoices/inv_7A2k9/void
Plural nouns for collections. Never verbs in paths. Actions that don’t fit CRUD get their own sub-resource endpoint with a POST. That last line, POST /v1/invoices/inv_7A2k9/void, is how Stripe’s API handles state transitions that don’t map cleanly to PATCH. Copy them. They’ve iterated on this for fifteen years and it works.
Keep nesting shallow. Two levels is the practical ceiling:
GET /v1/customers/cus_xyz/invoices # Fine
GET /v1/customers/cus_xyz/invoices/inv_abc/line_items # Too deep
The second one forces clients to track parent IDs they don’t care about. Flatten it to /v1/line_items?invoice=inv_abc and let query parameters do the filtering work.
Use opaque string IDs, not sequential integers. inv_7A2k9PQrL tells attackers nothing about volume. /invoices/14827 tells them exactly how many invoices exist and lets them scrape by incrementing the number.
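A minimal sketch of generating prefixed opaque IDs, assuming Python’s standard secrets module. The new_id helper, the base62 alphabet, and the 16-character length are illustrative assumptions, not a scheme any particular provider is known to use.

```python
import secrets
import string

# Base62 alphabet keeps IDs URL-safe without percent-escaping.
_ALPHABET = string.ascii_letters + string.digits


def new_id(prefix: str, length: int = 16) -> str:
    """Return an ID such as 'inv_<random>' that reveals nothing about volume.

    `prefix` and `length` are illustrative choices, not a standard.
    """
    # secrets.choice draws from a cryptographically strong source,
    # so IDs cannot be guessed by incrementing a known value.
    token = "".join(secrets.choice(_ALPHABET) for _ in range(length))
    return f"{prefix}_{token}"
```

The prefix costs a few bytes and makes IDs self-describing in logs: a support engineer can tell an invoice ID from a customer ID at a glance.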
HTTP Status Codes: Use the Ones the Spec Gave You
This is where most homegrown APIs go wrong. The HTTP spec already defined a precise vocabulary for what happened to a request. Use it.
The codes that matter in practice:
- 200 OK: Successful GET, PUT, PATCH, DELETE
- 201 Created: Successful POST that created a resource, with Location header pointing to it
- 204 No Content: Successful operation with nothing to return (DELETE often)
- 400 Bad Request: Malformed request (bad JSON, wrong type)
- 401 Unauthorized: Missing or invalid auth credentials
- 403 Forbidden: Authenticated but not allowed to do this action
- 404 Not Found: Resource doesn’t exist (or caller isn’t allowed to know it exists)
- 409 Conflict: State conflict (duplicate email, concurrent edit)
- 422 Unprocessable Entity: Validation error on a well-formed request
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Something blew up on your side
- 503 Service Unavailable: Down for maintenance or overloaded
The two most-confused codes: 401 means “I don’t know who you are.” 403 means “I know who you are, and you can’t do this.” Getting these right means debuggable production logs.
Return 422 for validation errors, not 400. 400 means the request itself is malformed and could not be parsed (invalid JSON, a body of the wrong type). 422 means “I parsed it fine, but the field email isn’t a valid email.” Downstream retry logic behaves differently based on which one you return, so get it right.
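The 400 versus 422 split can be sketched as a single handler function. This is a minimal illustration for a hypothetical signup endpoint; check_signup, the error codes, and the deliberately loose email pattern are all assumptions made for the example.

```python
import json
import re

# Intentionally loose pattern, for illustration only.
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_signup(raw_body: str) -> tuple[int, dict]:
    """Return (status_code, response_body) for a hypothetical signup request."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        # The body cannot be parsed at all: malformed request.
        return 400, {"error": {"type": "invalid_request", "code": "malformed_json"}}

    if not isinstance(payload, dict):
        # Valid JSON, but not the object the endpoint expects: still malformed.
        return 400, {"error": {"type": "invalid_request", "code": "expected_object"}}

    email = payload.get("email")
    if not isinstance(email, str) or not _EMAIL.match(email):
        # Parsed fine, but the content fails validation.
        return 422, {
            "error": {"type": "validation_error", "code": "invalid_email", "field": "email"}
        }

    return 201, {"email": email}
```

Parsing comes first and validation second, so a client can never receive a 422 for a body the server could not read.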
Versioning: Pick a Strategy and Stick to It
Every API needs a versioning strategy before the first public consumer integrates. Retrofitting one costs 10x more than deciding upfront.
Three approaches that actually work:
1. URL path versioning (most common)
GET /v1/invoices
GET /v2/invoices
Simple, cacheable, visible in every log line. The downside: changing major versions means every consumer rewrites their URLs.
2. Header versioning (Stripe’s approach)
curl https://api.stripe.com/v1/charges \
-H "Stripe-Version: 2024-04-10"
Stripe pins API changes to a date. Consumers specify the version they built against, and that version keeps working forever. New accounts default to the latest version, but existing integrations don’t break on Stripe’s release schedule. This is the gold standard for public APIs, and it’s worth studying.
3. Content negotiation
Accept: application/vnd.mycompany.v2+json
Works but is harder to debug in logs and confuses CDNs. We rarely recommend it unless you have a strict hypermedia-driven architecture.
Whichever you pick, three non-negotiables:
- Never break an existing version silently. Breaking changes get a new version. Always.
- Deprecate publicly, with timelines. Tell consumers in response headers (Sunset: Fri, 01 Jan 2027 00:00:00 GMT) six to twelve months before removing a version.
- Never version endpoints individually. v1/users alongside v3/orders is chaos. Version the API, not the routes.
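For the deprecation rule above, a small sketch of building the response headers. The Sunset header field (RFC 8594) carries an HTTP-date rather than a bare ISO date, which the standard library can format directly. The deprecation_headers helper and the documentation URL are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset_at: datetime, info_url: str) -> dict[str, str]:
    """Build headers announcing when an API version stops working.

    `info_url` is a hypothetical link to your migration guide.
    """
    if sunset_at.tzinfo is None:
        raise ValueError("sunset_at must be timezone-aware")
    return {
        # usegmt=True yields the HTTP-date form, e.g. 'Fri, 01 Jan 2027 00:00:00 GMT'.
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
        # RFC 8594 also defines a 'sunset' link relation for further information.
        "Link": f'<{info_url}>; rel="sunset"',
    }
```

Attaching these headers to every response from the deprecated version lets consumers detect the deadline in monitoring instead of in an email they never read.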
Versioning discipline matters even more when your API is consumed by multiple systems. Our system integration checklist before you scale covers the broader coordination problems that surface when APIs grow beyond a single consumer.
Additive changes (new fields, new endpoints, new optional parameters) never require a version bump. Breaking changes (renamed fields, removed endpoints, changed validation rules) always do.
Error Responses: Make Them Useful
The worst API error response in production, right now, somewhere: {"error": "something went wrong"} with a 500.
Good error responses give the consumer enough information to handle the error correctly without needing to open a support ticket.
A solid error envelope:
{
"error": {
"type": "validation_error",
"code": "invalid_email",
"message": "The email address format is not valid.",
"field": "user.email",
"doc_url": "https://docs.example.com/errors/invalid_email",
"request_id": "req_9Pq2kL8mNx"
}
}
Every field earns its place:
- type groups errors into classes the client can branch on (validation_error, auth_error, rate_limit_error)
- code is the specific machine-readable identifier
- message is a human-readable explanation (never show this directly to end users; it’s for developers)
- field points at the specific input that failed, essential for form UIs
- doc_url links to documentation with remediation steps
- request_id is the log correlation ID, the single most valuable debug field you can include
Include the request_id on every response, success or failure. When a customer reports an issue, the first question from your support team should be “what was the request ID?” If you can’t pull the full request/response cycle from logs given that ID, fix your logging before shipping the API.
For validation errors with multiple problems, return them all at once:
{
"error": {
"type": "validation_error",
"message": "Multiple fields failed validation.",
"errors": [
{"field": "email", "code": "invalid_format"},
{"field": "password", "code": "too_short", "minimum": 8}
],
"request_id": "req_9Pq2kL8mNx"
}
}
One round trip, all errors. Client-side forms can render every field issue without multiple submits.
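A minimal sketch of collecting every validation failure before responding, rather than returning on the first one. The validate_signup function, its two rules, and the minimum password length of 8 mirror the example envelope above but are otherwise assumptions.

```python
from typing import Any, Optional

MIN_PASSWORD_LENGTH = 8  # mirrors the "minimum": 8 in the example envelope


def validate_signup(payload: dict[str, Any], request_id: str) -> Optional[dict[str, Any]]:
    """Return an error envelope listing all failures, or None if the payload is valid."""
    errors: list[dict[str, Any]] = []

    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append({"field": "email", "code": "invalid_format"})

    password = payload.get("password")
    if not isinstance(password, str) or len(password) < MIN_PASSWORD_LENGTH:
        errors.append(
            {"field": "password", "code": "too_short", "minimum": MIN_PASSWORD_LENGTH}
        )

    if not errors:
        return None

    # Every check has run by this point, so the client gets all problems at once.
    return {
        "error": {
            "type": "validation_error",
            "message": "One or more fields failed validation.",
            "errors": errors,
            "request_id": request_id,
        }
    }
```

The key design choice is that no rule returns early; each one appends to the list and the envelope is built once at the end.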
Pagination: Cursor, Not Offset
Offset-based pagination (?page=3&per_page=50) is the default everyone reaches for. It’s also wrong for any dataset that changes while being paginated.
The problem: if a new record gets inserted at position 5 after a client has fetched page 1 (positions 0-49) and before it fetches page 2 (positions 50-99), every later record shifts down by one, so position 50 now holds what was position 49 on the previous page. The client sees that record twice. A deletion shifts records the other way, and the client misses one entirely. This also gets slower at scale because most databases evaluate OFFSET 10000 by reading and discarding 10,000 rows.
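The drift is easy to reproduce with an in-memory list standing in for the table. This is an illustrative simulation only; offset_page and the inv_NNN IDs are made up for the example.

```python
def offset_page(rows: list[str], page: int, per_page: int) -> list[str]:
    """Return one 1-based page, the way ?page=N&per_page=M would."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


# 100 invoices, newest first: inv_100, inv_099, ..., inv_001.
ORIGINAL = [f"inv_{n:03d}" for n in range(100, 0, -1)]

# Case 1: a record is inserted between the two page fetches.
rows = list(ORIGINAL)
page_1 = offset_page(rows, 1, 50)
rows.insert(5, "inv_new")
page_2 = offset_page(rows, 2, 50)
seen_twice = set(page_1) & set(page_2)

# Case 2: a record is deleted between the two page fetches.
rows = list(ORIGINAL)
page_1 = offset_page(rows, 1, 50)
del rows[5]
page_2 = offset_page(rows, 2, 50)
never_seen = set(rows) - set(page_1) - set(page_2)
```

After the insert, the last record of page 1 reappears as the first record of page 2. After the delete, the record that used to open page 2 slides back into page 1’s range and is never returned.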
Cursor pagination solves both:
GET /v1/invoices?limit=50&starting_after=inv_7A2k9PQrL
{
"data": [...],
"has_more": true,
"next_cursor": "inv_K3nQm7Xr"
}
The cursor is an opaque identifier the server understands. Clients only need to pass back what the server returned. New records don’t shift the window, and the database query is indexed on a single key instead of counting from zero.
The limit should be bounded (we usually cap at 100) and default to a sensible value (usually 20-50). Return has_more explicitly so clients don’t have to guess when they’ve reached the end.
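A sketch of the server side of cursor pagination, with a sorted in-memory list standing in for an indexed column. In SQL the same idea is roughly WHERE id > :cursor ORDER BY id LIMIT :limit + 1. The list_page function and its default values are assumptions chosen to match the numbers above.

```python
from bisect import bisect_right
from typing import Any, Optional

DEFAULT_LIMIT = 20
MAX_LIMIT = 100


def list_page(
    sorted_ids: list[str],
    starting_after: Optional[str] = None,
    limit: int = DEFAULT_LIMIT,
) -> dict[str, Any]:
    """Return one page of IDs that sort strictly after the cursor."""
    # Clamp the requested limit into [1, MAX_LIMIT].
    limit = max(1, min(limit, MAX_LIMIT))

    # Seek straight to the cursor position instead of counting from zero.
    start = 0 if starting_after is None else bisect_right(sorted_ids, starting_after)

    # Fetch one extra row: its presence tells us whether another page exists.
    window = sorted_ids[start:start + limit + 1]
    data = window[:limit]
    has_more = len(window) > limit

    return {
        "data": data,
        "has_more": has_more,
        "next_cursor": data[-1] if has_more else None,
    }
```

Fetching limit + 1 rows answers has_more without a second query, and because the cursor is a value rather than a position, records inserted ahead of it do not shift the next page.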
For APIs that need both (total counts for dashboards, cursors for iteration), offer a separate /count endpoint. Don’t compute counts on every paginated response.
Rate Limiting: Communicate It Clearly
Every public API needs rate limiting. Not because you’re stingy, but because one buggy client shouldn’t be able to take down your service for everyone else.
Three pieces of information every rate-limited response must return:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1712900400
When a client exceeds the limit, return 429 Too Many Requests with a Retry-After header telling them exactly when to try again:
HTTP/1.1 429 Too Many Requests
Retry-After: 32
Content-Type: application/json
{
"error": {
"type": "rate_limit_error",
"message": "Rate limit exceeded. Retry after 32 seconds.",
"request_id": "req_9Pq2kL8mNx"
}
}
Let consumers build proper retry logic with exponential backoff. If they have to guess, they’ll hammer you with retries and make the problem worse.
Two tiers usually work: a per-second burst limit (say, 25 requests per second) and a per-minute sustained limit (say, 1000 per minute). Legitimate consumers stay well under both. Misbehaving ones hit the burst limit first and get pushed back before they can do damage.
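The two tiers can be sketched with a pair of fixed-window counters. This is a simplified single-process illustration: fixed windows are used here for brevity, where production systems often use token buckets or sliding windows backed by a shared store. TwoTierLimiter and its method names are assumptions.

```python
import math


class TwoTierLimiter:
    """Per-second burst limit plus per-minute sustained limit for one client."""

    def __init__(self, burst_per_second: int = 25, sustained_per_minute: int = 1000):
        self._tiers = [
            {"limit": burst_per_second, "window": 1, "bucket": None, "count": 0},
            {"limit": sustained_per_minute, "window": 60, "bucket": None, "count": 0},
        ]

    def check(self, now: float) -> tuple[bool, dict[str, str]]:
        """Return (allowed, headers) for a request arriving at Unix time `now`."""
        retry_after = 0
        for tier in self._tiers:
            bucket = int(now // tier["window"])
            if bucket != tier["bucket"]:
                # A new window has started: reset this tier's counter.
                tier["bucket"], tier["count"] = bucket, 0
            if tier["count"] >= tier["limit"]:
                window_end = (bucket + 1) * tier["window"]
                retry_after = max(retry_after, math.ceil(window_end - now))

        if retry_after > 0:
            # Rejected requests are not counted against either tier.
            return False, {"Retry-After": str(retry_after)}

        for tier in self._tiers:
            tier["count"] += 1
        minute = self._tiers[1]
        return True, {
            "X-RateLimit-Limit": str(minute["limit"]),
            "X-RateLimit-Remaining": str(minute["limit"] - minute["count"]),
            "X-RateLimit-Reset": str((minute["bucket"] + 1) * minute["window"]),
        }
```

Both tiers are checked before either counter is incremented, so a request blocked by the burst tier does not also consume sustained quota.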
Authentication: Match the Method to the Use Case
Three auth methods, three use cases. Pick based on the consumer.
API keys (server-to-server)
Authorization: Bearer sk_live_51L...
Long-lived secrets for server-side integrations. Scope them as narrowly as the use case allows: read-only keys for analytics, write keys for provisioning, separate keys per environment. Rotatable. Revocable. Never exposed to browsers or mobile apps.
JWT (stateless user auth)
Authorization: Bearer eyJhbGciOiJSUzI1...
Short-lived (15 minutes to 1 hour) access tokens carrying user identity and scope claims. Paired with refresh tokens for session continuity. Signed by your auth service, verified by your API without a database hit. Good for: high-traffic user-facing APIs where auth checks per request are a bottleneck.
The common JWT mistakes: making them too long-lived (an hour maximum for access tokens), storing sensitive data in the payload (it’s base64, not encrypted), or rolling your own signing (use a library, RS256 or ES256, never HS256 with a weak secret).
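To make the “base64, not encrypted” point concrete, the sketch below reads the claims out of a token using nothing but the standard library and no key. It is a demonstration of why payloads must not hold secrets, not a way to validate tokens; real verification belongs in a maintained JWT library. The helper names and the sample claims are made up.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_claims_unverified(token: str) -> dict:
    """Decode the payload segment WITHOUT checking the signature.

    Anyone holding the token can do this, which is the point of the demo.
    """
    _header, payload, _signature = token.split(".")
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Build a token-shaped string; the signature is a placeholder, not a real one.
header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
claims = b64url(json.dumps({"sub": "user_123", "scope": "read:invoices"}).encode())
sample_token = f"{header}.{claims}.placeholder-signature"
```

Calling read_claims_unverified(sample_token) returns the original claims even though the signature is meaningless, so treat everything in the payload as public.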
OAuth 2.0 (third-party integrations)
Use OAuth 2.0 when external applications need to access user data on the user’s behalf: the authorization code flow with PKCE for public clients, and the authorization code flow with a client secret for confidential clients. Never implement the implicit flow; it’s deprecated for good reasons.
Scopes matter. read:invoices and write:invoices aren’t just documentation; they’re a contract you enforce on every request. If a token has only read:invoices, a POST to /invoices returns 403 Forbidden. Granular scopes let users grant minimal permissions and let your audit logs answer “who did what with which credential.”
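Scope enforcement can be sketched as a lookup from route to required scope, checked on every request. The REQUIRED_SCOPE table and the authorize function are hypothetical; a real service would hang this off its routing or middleware layer.

```python
from typing import Optional

# Hypothetical route table: (method, path) -> scope the token must carry.
REQUIRED_SCOPE = {
    ("GET", "/v1/invoices"): "read:invoices",
    ("POST", "/v1/invoices"): "write:invoices",
}


def authorize(method: str, path: str, token_scopes: Optional[set[str]]) -> int:
    """Return the status the auth layer should produce; 200 means 'proceed'.

    `token_scopes` is None when no valid credential was presented.
    """
    if token_scopes is None:
        return 401  # we don't know who the caller is
    required = REQUIRED_SCOPE.get((method, path))
    if required is None:
        return 404  # no such route
    if required not in token_scopes:
        return 403  # known caller, insufficient scope
    return 200
```

In this sketch write:invoices does not imply read:invoices. Whether scopes nest is a product decision, and it is worth documenting either way.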
For most SaaS products we help clients build, the combination is API keys for server integrations, JWTs for the web and mobile app, and OAuth 2.0 for partner and third-party apps. See our system integration services page for how we approach multi-tenant auth in production systems.
Documentation: OpenAPI or It Doesn’t Exist
An API without documentation is a private API, no matter what you call it. An API with documentation that doesn’t match the current implementation is worse: it actively misleads.
The only reliable approach: generate documentation from an OpenAPI specification, and let the specification be the source of truth.
paths:
/v1/invoices/{id}:
get:
summary: Retrieve an invoice
parameters:
- name: id
in: path
required: true
schema:
type: string
example: inv_7A2k9PQrL
responses:
'200':
description: Invoice retrieved
content:
application/json:
schema:
$ref: '#/components/schemas/Invoice'
OpenAPI gives you three things for free:
- Interactive documentation (Swagger UI, Redoc, Stoplight)
- Client SDKs in a dozen languages generated automatically
- Request/response validation at the gateway or middleware layer
GitHub publishes their full REST API as OpenAPI. Stripe generates their SDKs and docs from internal specifications. Every serious public API does this. Yours should too.
Beyond the spec, document:
- Getting started: curl a real endpoint and see a real response in under five minutes
- Authentication: step by step, with sample code in at least three languages
- Error reference: every code you return, what it means, how to fix it
- Changelog: dated, with breaking changes called out in red
- Idempotency: how it works, when to use it, example headers
Performance Patterns: The Three That Move the Needle
Idempotency keys
For any non-idempotent operation (POST, typically), accept an Idempotency-Key header. If a client retries the same operation with the same key, return the original response instead of creating a duplicate:
POST /v1/charges
Idempotency-Key: chrg_retry_4f8a2k9p
This is how Stripe prevents double-charges on network timeouts. The server stores the first response against the key for 24 hours. Retries return the cached response. Critical for payment APIs, critical for anything that touches money or triggers side effects.
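A minimal in-memory sketch of the server side, assuming a 24-hour retention window as described above. IdempotencyStore is a hypothetical name, and this is not a description of how any provider implements it. A production version would need a shared store with an atomic insert, and would usually also reject a reused key whose request body differs.

```python
import time
from typing import Callable


class IdempotencyStore:
    """Replay the first response for a key instead of re-running the operation."""

    def __init__(
        self,
        ttl_seconds: float = 24 * 60 * 60,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._responses: dict[str, tuple[float, dict]] = {}

    def run(self, key: str, operation: Callable[[], dict]) -> dict:
        now = self._clock()
        cached = self._responses.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            # Retry of a request we already processed: no side effects.
            return cached[1]
        response = operation()
        self._responses[key] = (now, response)
        return response
```

A handler would wrap the charge creation in store.run(idempotency_key, create_charge), so a client that times out and retries gets the original charge back rather than a second one.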
Partial responses with sparse fieldsets
Let clients specify which fields they need:
GET /v1/customers/cus_xyz?fields=id,email,created
Cuts response size on mobile, reduces database work on the server, and costs almost nothing to implement. JSON:API formalizes this with the fields[resource]=... syntax. The JSON:API spec is worth reading even if you don’t adopt the full standard.
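A sketch of applying a ?fields= parameter to an already-loaded resource. The select_fields helper is illustrative; it trims the response only, and the database savings mentioned above come from pushing the same field list down into the query.

```python
from typing import Any, Optional


def select_fields(resource: dict[str, Any], fields_param: Optional[str]) -> dict[str, Any]:
    """Return only the top-level fields named in a comma-separated parameter."""
    if not fields_param:
        # No parameter supplied: return the full representation.
        return dict(resource)

    requested = [name.strip() for name in fields_param.split(",") if name.strip()]
    unknown = [name for name in requested if name not in resource]
    if unknown:
        # Surface typos instead of silently dropping them.
        raise ValueError(f"unknown fields: {', '.join(unknown)}")

    return {name: resource[name] for name in requested}
```

Rejecting unknown field names is a design choice in this sketch; a handler would typically translate the ValueError into a 400 response so a misspelled field fails loudly.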
HTTP caching with ETags
For resources that don’t change often, return an ETag header. Clients send it back with If-None-Match on subsequent requests, and you return 304 Not Modified with an empty body if nothing changed:
GET /v1/invoices/inv_7A2k9
→ 200 OK, ETag: "abc123"
GET /v1/invoices/inv_7A2k9
If-None-Match: "abc123"
→ 304 Not Modified
Free bandwidth savings for consumers, lower database load for you. CDNs can honor this automatically.
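The exchange above can be sketched as a conditional GET handler. Deriving the ETag from a hash of the serialized resource is one common choice, assumed here; a version column or updated-at timestamp works too. make_etag and conditional_get are hypothetical names, and the comma split is a simplification that is only safe because these tags are plain hex.

```python
import hashlib
import json
from typing import Optional


def make_etag(resource: dict) -> str:
    """Derive a quoted ETag from a canonical serialization of the resource."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'


def conditional_get(
    resource: dict, if_none_match: Optional[str]
) -> tuple[int, dict[str, str], Optional[dict]]:
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(resource)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if "*" in candidates or etag in [c.removeprefix("W/") for c in candidates]:
            return 304, headers, None
    return 200, headers, resource
```

Sorting keys before hashing keeps the tag stable regardless of dictionary ordering, so the ETag changes only when the content does.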
Common API Design Mistakes
Every API review we do as part of a legacy modernization engagement surfaces the same recurring patterns. If you’re doing any of these, stop:
Returning 200 on errors. HTTP has 60+ status codes for a reason. Use them.
Verbs in URLs. /getUsers, /deleteInvoice. The HTTP method is the verb. The URL is the noun.
Nested resources five levels deep. If you wrote /organizations/:org/teams/:team/members/:user/permissions/:perm, something went wrong in your data model.
Snake_case in some places, camelCase in others. Pick one per language convention (snake_case for Python/Ruby, camelCase for JavaScript) and enforce it with a linter.
Dates without timezones. Every timestamp in every response should be ISO 8601 with UTC: 2026-04-13T14:32:07Z. No exceptions. Not “2026-04-13 14:32:07” (what timezone?). Not epoch (unreadable in logs).
Passwords, tokens, or PII in query parameters. They end up in server access logs, browser history, and proxy caches. Always in request bodies or headers.
Forgetting to paginate. A list endpoint without pagination works fine until the list hits 10,000 records and your API starts returning 30MB responses.
No request IDs in responses. When things break, you’ll be debugging blind.
Breaking changes in a minor version. Consumers trust your version contract. Violate it once and you’ve taught them never to trust it again.
No rate limiting. One buggy client loop will take you down.
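For the timestamp rule in the list above, a small sketch of a serializer that refuses naive datetimes, so an ambiguous value cannot reach a response. The to_api_timestamp name is an assumption.

```python
from datetime import datetime, timezone


def to_api_timestamp(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC, e.g. 2026-04-13T14:32:07Z."""
    if moment.tzinfo is None:
        # A naive datetime is exactly the "what timezone?" problem; fail loudly.
        raise ValueError("naive datetime: timezone unknown")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Routing every timestamp through one function like this makes the format enforceable in code review and in tests, rather than by convention.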
Putting It Together
The APIs that last a decade (Stripe, GitHub, Twilio, AWS) share a pattern. They’re opinionated about the small stuff (naming, status codes, pagination), generous with information (request IDs, detailed errors, documented scopes), and ruthless about versioning discipline (additive changes forever, breaking changes never without a new version).
Good API design isn’t about being clever. It’s about making your API boring in exactly the right places (predictable status codes, consistent naming, reliable error shapes) so the interesting work happens at the application level, not in translating between your API’s idiosyncrasies and what consumers actually need.
We build APIs that support production mobile apps, partner integrations, and internal systems as part of our web application development and SaaS development practices. If you’re designing a new API or inheriting one that’s making your team miserable, we can help you decide what to fix, what to version, and what to leave alone. Schedule a free 30-minute consultation and bring a sample of your current API surface. We’ll give you a concrete assessment, not a sales pitch.
You can also explore our full services or read more on our approach to the custom software development process.