This handbook is the foundation for everyone joining Shipgate. It explains what we are building, why it matters, who it serves, and how it works. Engineering, marketing, sales, and design should all start here before anything else.
Shipgate is an AI-native safety layer that sits between a developer's code change and the production codebase. It understands the entire repository, evaluates risk, detects security issues, identifies low-quality AI-generated code, and stops dangerous changes from shipping.
Why the way we review code today is fundamentally broken
For most of software history, code review was a human-scale problem. A developer writes a change, a colleague reads it, and together they decide whether it is safe to ship. This works when teams are small and the pace of changes is manageable. It breaks down fast when both of those things stop being true.
AI coding tools have fundamentally changed the equation. Developers using tools like Cursor, GitHub Copilot, or Lovable can now produce ten times more code than before. That sounds like a win. The problem is that the reviewers are still human. The volume of code going out has exploded, but the human capacity to carefully evaluate it has not moved at all.
Meanwhile, the tools that are supposed to catch problems were built for a different era. They check for syntax errors (whether your code is grammatically correct) and style violations (whether you used the right formatting). They do not understand what a change actually does to a running system, how far its effects reach, or whether the AI that generated it made up a function that does not actually exist anywhere.
AI tools mean developers can propose code changes far faster than before. More PRs per day, more files changed per PR, and more surface area for mistakes to hide. Human reviewers simply cannot keep up with the pace. Something has to give, and it is usually the thoroughness of the review.
PR = Pull Request. This is the package of code changes a developer proposes to add to the shared codebase. Think of it as a formal request to merge your edits into the main project.

Most reviewers, human or AI, spend their attention on formatting, variable naming, and minor style preferences. The real question is whether this change could break something important or open a security vulnerability. That question rarely gets a serious answer, because answering it requires understanding the entire codebase, not just the few lines that changed.
Traditional security scanners are typically run after code is already merged into the main branch and deployed to production, meaning it is live on servers that real users are hitting. By the time they find a problem, the vulnerability is already in the wild. Fixing it then requires pulling engineers away from other work, writing a patch, going through the entire release process again, and hoping no one found the hole in the meantime. Catching the same issue at review time costs a tiny fraction of that.
Merged = approved and permanently added to the shared codebase. Deployed = shipped to live production servers that real users interact with.

AI coding assistants produce code that looks correct on the surface but frequently is not. They confidently invent the names of libraries and functions that do not exist anywhere. They copy patterns from their training data without understanding whether those patterns fit the specific codebase they are writing for. They introduce subtle logic errors that every syntax checker will happily pass. And no existing review tool was built to detect any of this, because they all predate the era of AI-generated code.
AI slop = code produced by AI tools that looks plausible but contains hidden errors, invented references to things that do not exist, or patterns copied without understanding the context they are being pasted into.

A two-line change to an authentication function can silently break ten other parts of the system that depend on it. A reviewer looking only at the changed lines has no way to know that without a comprehensive map of the entire codebase. No human reviewer consistently holds that map in their head for every file in every project they review.
Authentication = the part of a software system responsible for verifying who a user is before letting them in. Vulnerabilities here are among the most serious in any application.

CTOs and team leads have no consistent, data-driven view of how risky their incoming code changes are, which contributors are introducing the most problems, or whether quality is trending up or down across their repositories over time. Every decision they make is based on instinct and individual memory rather than reliable, systematic data.
There is no fast, PR-level system that answers the question engineering teams actually need answered: "If this change is wrong, what breaks, how far does the damage spread, and how risky is it to ship right now?" That is exactly what Shipgate answers, automatically, on every single pull request.
What Shipgate does and why the approach is different
Shipgate installs on your GitHub repository and becomes a required check on every pull request. When a developer opens a PR, Shipgate runs automatically. It analyses the proposed change against the full context of the codebase, measures how far the change's effects reach, scans for security vulnerabilities, checks for the telltale signs of low-quality AI-generated code, and produces a clear risk verdict before any human reviewer has even opened the diff.
The critical framing here is risk engine. Shipgate is not a linter (a tool that enforces formatting rules). It is not a spellchecker for code. It exists to answer one question: is this change safe to ship? Everything it does flows from that single goal.
The result for reviewers is a structured risk summary that replaces 40 minutes of uncertain manual analysis with 5 minutes of confident, data-backed decision-making. They see exactly what the change does, how far it reaches, what security issues were found, and a clear recommendation on whether to block or allow the merge.
Most review tools only read the lines you changed. Shipgate first reads and maps the entire repository so it understands what those changed lines affect across the whole system. It knows how every file relates to every other file before it evaluates anything.
Before you merge, Shipgate maps exactly how far the change ripples outward: which other files depend on it, which services are affected, and whether any critical parts of the system sit in the blast zone. This is the core differentiator no competitor offers.
Runs security checks on every PR against the OWASP Top 10, which is the global standard checklist of the ten most critical and commonly exploited security risks in software. Catches SQL injection, exposed passwords, vulnerable dependencies, and more before they reach users.
Identifies the distinctive failure patterns of AI-generated code: invented library names that do not exist, meaningless placeholder variables, excessive copy-pasted boilerplate, and code that is structurally inconsistent with how the rest of the codebase was written. No other product does this.
Integrates as a required check on GitHub. When a PR crosses a risk threshold, the merge button is physically blocked until the issues are resolved. Security becomes a non-negotiable gate, not a guideline that gets skipped when deadlines tighten.
How Shipgate works from first pull request to final verdict
Shipgate is installed as a GitHub App. A GitHub App is a piece of software that you connect to your repository in a single click, the same way you might install a browser extension. Once installed, it automatically receives a notification every time a developer opens, updates, or closes a pull request on that repository. No developer needs to change their workflow. No new commands to run. Shipgate operates silently in the background and posts its findings directly onto the PR.
Here is the complete step-by-step of what happens from the moment a PR is opened to the moment a reviewer sees the results.
A developer proposes a code change by opening a pull request on GitHub. This action immediately triggers a webhook notification to Shipgate. A webhook is an automatic notification sent by GitHub the instant a specific event occurs, without anyone having to press a button or run a command. Think of it as a doorbell that rings automatically when a package arrives.
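To make the trigger concrete, here is a minimal sketch of how a receiving service verifies that a webhook delivery genuinely came from GitHub, by checking the X-Hub-Signature-256 header against the shared webhook secret. The variable names and framing are illustrative, not Shipgate's actual service code:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body.

    GitHub signs every webhook delivery with HMAC-SHA256 using the shared
    webhook secret; constant-time comparison prevents timing attacks.
    """
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Simulate a "pull_request opened" delivery (PR number is made up):
secret = b"webhook-secret"
body = b'{"action": "opened", "number": 42}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_github_signature(secret, body, header))  # True for a genuine delivery
```

Only deliveries that pass this check are trusted; everything downstream of the doorbell starts here.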
Shipgate reads the full codebase and builds an internal map of how everything connects. It identifies which files import (use code from) which other files, parses the code structure using a parser library called tree-sitter (which reads code the same way a compiler does, understanding the logical structure of the code as a tree of relationships rather than just as raw text), and assembles a dependency graph showing how every part of the system is connected to every other part.
This map is kept continuously updated, not rebuilt from scratch every time. When new code is pushed, only the changed portions of the map are refreshed, which is what keeps analysis fast even on large codebases.
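As an illustration of what refreshing only the changed portions of the map means, here is a minimal sketch of a repository map with incremental updates. A real implementation would extract the import lists with tree-sitter; here they are supplied by hand to keep the sketch self-contained, and the file names are hypothetical:

```python
from collections import defaultdict

class DependencyGraph:
    """Toy repository map: file -> files it imports, plus the reverse index."""

    def __init__(self):
        self.imports = {}                  # file -> set of files it uses
        self.importers = defaultdict(set)  # file -> set of files that use it

    def update_file(self, path, imported_files):
        """Incremental refresh: rewire only the edges touching one file."""
        for old in self.imports.get(path, set()):
            self.importers[old].discard(path)
        self.imports[path] = set(imported_files)
        for new in imported_files:
            self.importers[new].add(path)

graph = DependencyGraph()
graph.update_file("checkout.py", ["auth.py", "db.py"])
graph.update_file("admin.py", ["auth.py"])
print(sorted(graph.importers["auth.py"]))  # ['admin.py', 'checkout.py']
```

When a push changes only `checkout.py`, only that file's edges are rewired; the rest of the map is untouched, which is what keeps analysis fast on large repositories.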
Shipgate traces the full downstream impact of the proposed changes. It identifies every file, function, and service that is directly or indirectly affected by what the PR modifies. It classifies the change by type: feature addition, refactor, security patch, configuration change, or dependency update. It specifically flags if any sensitive areas of the system are in the blast zone, including authentication (the login and access control system), payment processing, database schema (the structure of how data is stored), or public-facing API endpoints (the connection points that other systems or users call from outside).
All of this feeds into a blast radius risk level of Low, Medium, High, or Critical. This level is used as a multiplier when computing the Security Score.
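A sketch of how tracing downstream impact and assigning a risk level might work, assuming a reverse-dependency index like the repository map described above. The numeric thresholds are illustrative, not Shipgate's actual cut-offs:

```python
from collections import deque

def blast_radius(importers, changed_files):
    """Breadth-first walk over reverse dependencies: everything that
    directly or indirectly uses a changed file is in the blast zone."""
    affected, queue = set(), deque(changed_files)
    while queue:
        current = queue.popleft()
        for dependent in importers.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected

def risk_level(affected, sensitive_files):
    """Illustrative thresholds; real cut-offs are a product decision."""
    if affected & sensitive_files:
        return "Critical"
    if len(affected) > 20:
        return "High"
    if len(affected) > 5:
        return "Medium"
    return "Low"

importers = {"utils.py": {"auth.py", "billing.py"}, "auth.py": {"api.py"}}
hit = blast_radius(importers, ["utils.py"])
print(sorted(hit))                                   # ['api.py', 'auth.py', 'billing.py']
print(risk_level(hit, sensitive_files={"auth.py"}))  # Critical
```

Note how a change to one shared utility reaches the authentication layer two hops away, which is exactly the kind of indirect exposure a line-by-line reviewer never sees.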
Three categories of security checks run simultaneously on the changed code. First, static analysis using a tool called Semgrep scans the code for patterns that match the OWASP Top 10, the globally accepted list of the ten most critical and most commonly exploited security vulnerabilities in web software. This covers SQL injection (tricking a database with malicious commands), cross-site scripting (injecting code that runs in another user's browser), broken authentication (flaws in the login and session management system), and seven other equally serious categories.
Second, secrets detection scans every line for accidentally committed credentials: API keys (passwords used by software to access external services), database connection strings (the address and password for connecting to a database), and private keys (cryptographic credentials that, if exposed, give an attacker the ability to impersonate the server).
Third, dependency scanning checks every new or updated package in the project against OSV, an open-source vulnerability database maintained by Google and continuously updated with newly discovered security problems in popular libraries.
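To make the secrets-detection step concrete, here is a simplified sketch using a handful of illustrative patterns. A production scanner uses a far larger, entropy-aware rule set; these three rules are examples, not Shipgate's actual ones:

```python
import re

# Illustrative patterns only; real scanners combine hundreds of rules
# with entropy checks to catch high-randomness strings.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_api_key":   re.compile(r"(?i)\b(?:api[_-]?key|secret)\b\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
}

def scan_for_secrets(diff_lines):
    """Return (line_number, rule_name) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(diff_lines, start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings

diff = [
    'db_url = os.environ["DATABASE_URL"]',          # safe: read from env
    'API_KEY = "sk_live_abcdefgh12345678901234"',   # hardcoded credential
]
print(scan_for_secrets(diff))  # [(2, 'generic_api_key')]
```

The key property is that the credential read from an environment variable passes while the hardcoded one is flagged with its exact line number.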
Shipgate compares the code in the PR against the established conventions, style, and patterns of the rest of the repository. Several specific checks run in parallel: hallucinated import detection cross-references every library and function name in the changed code against public package registries to verify they actually exist. Placeholder detection flags variables with meaningless names like "data", "temp", "result", or "handler" that suggest the code was generated rather than written with intent. Style deviation analysis flags sections of code that are structurally inconsistent with how the rest of the project was written.
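A simplified sketch of the two simplest checks described above, with the live registry lookup replaced by a local allowlist so the example stays self-contained. The package and variable names here, including the invented "fastjsonx", are hypothetical:

```python
PLACEHOLDER_NAMES = {"data", "temp", "result", "handler", "foo", "value"}

# In production this question is answered by querying the package registry
# (PyPI, npm, etc.); a fixed allowlist keeps the sketch offline.
KNOWN_PACKAGES = {"requests", "numpy", "flask", "sqlalchemy"}

def flag_hallucinated_imports(imported_packages):
    """Imports that match no known package are likely AI hallucinations."""
    return sorted(set(imported_packages) - KNOWN_PACKAGES)

def flag_placeholder_variables(variable_names):
    """Meaningless names suggest generated rather than intentional code."""
    return sorted(name for name in variable_names if name.lower() in PLACEHOLDER_NAMES)

print(flag_hallucinated_imports(["requests", "fastjsonx"]))     # ['fastjsonx']
print(flag_placeholder_variables(["user_id", "temp", "data"]))  # ['data', 'temp']
```

A hallucinated import is a hard finding (the code cannot run), while placeholder names are a soft signal that feeds the overall slop assessment rather than blocking on their own.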
Shipgate combines three inputs into a single Security Score from 0 to 100. The raw findings from the security engine provide the base. The exposure level of the affected code adjusts this: a vulnerability on a public-facing payment page is weighted much more heavily than the same vulnerability in an internal admin utility that only three people access. Finally, the blast radius multiplier scales the score based on how widely the vulnerable code is used across the system. A Security Score above the team's configured threshold triggers an automatic block on the merge.
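The scoring arithmetic can be sketched as follows. The exposure weights and blast multipliers here are illustrative assumptions, not Shipgate's actual calibration:

```python
# Illustrative weightings; real calibration is a product decision.
EXPOSURE_WEIGHT = {"internal": 0.5, "authenticated": 1.0, "public": 1.5}
BLAST_MULTIPLIER = {"Low": 0.8, "Medium": 1.0, "High": 1.3, "Critical": 1.6}

def security_score(base_findings_score, exposure, blast_level):
    """Combine raw findings, exposure, and blast radius into a 0-100 score."""
    raw = base_findings_score * EXPOSURE_WEIGHT[exposure] * BLAST_MULTIPLIER[blast_level]
    return min(100.0, round(raw, 1))

# The same raw findings produce very different verdicts depending on where they sit:
print(security_score(40, "internal", "Low"))     # 16.0
print(security_score(40, "public", "Critical"))  # 96.0
```

This is the mechanism behind the payment-page example in the text: identical findings, an order-of-magnitude difference in the final score.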
Shipgate posts a structured summary comment directly on the pull request. The reviewer sees the Impact Risk level, the Security Score, every finding with the precise file name and line number, a plain-English explanation of why the finding matters, and a suggested code fix formatted as a GitHub suggestion block (meaning the contributor can apply the fix with a single click without leaving the browser). The GitHub check status updates simultaneously, which is what physically controls whether the merge button is enabled or blocked.
This structured summary is what reviewers see on every PR.
Every major capability, explained without jargon
Shipgate has six core capabilities that work together as a unified risk system. Each one addresses a specific gap that no other tool in the market adequately fills. They are not independent features bolted together. They compound: blast radius makes security scoring more precise, slop detection improves review accuracy, and contributor intelligence makes the whole system smarter over time.
Shipgate does not just read the lines you changed. Before reviewing any PR, it builds a complete internal model of the repository: every file, every function, every dependency relationship, every framework in use, and every sensitive area of the codebase including authentication, payment flows, admin routes, and database access patterns.
It uses a parsing library called tree-sitter, which reads code the same way a compiler does. A compiler is the software that translates code from human-readable form into instructions a computer can execute. Tree-sitter reads the logical structure of the code as a tree of relationships rather than treating it as raw text. This means Shipgate understands what code does, not just what it says. The model is kept current through incremental updates every time new code is pushed, so it always reflects the live state of the project.
Every code change creates a ripple effect through the system. A change to a shared utility function might be used in 40 other places across the codebase. A change to the database schema (the structure that defines how data is stored and organised) might affect every query in every service that reads from or writes to that database. Blast Radius is Shipgate's measurement of how wide that ripple spreads.
Shipgate counts the downstream files, services, and public interfaces that are affected by what the PR modifies. It produces a risk level of Low, Medium, High, or Critical. This level is then used as a multiplier in the Security Score calculation, because the same vulnerability in a widely-used, critical component is orders of magnitude more dangerous than the same vulnerability sitting in an isolated corner of the codebase that very few things depend on.
Three parallel security checks run on every PR. Static analysis via Semgrep scans the changed code against the OWASP Top 10. OWASP stands for the Open Web Application Security Project, a nonprofit that publishes the definitive list of the ten most critical and most commonly exploited security risks in software. These include SQL injection (manipulating a database by embedding commands in user input), broken authentication (flaws in how users log in and stay logged in), cross-site scripting (injecting code that runs in other users' browsers), and seven other equally serious categories that account for the vast majority of real-world security breaches.
Secrets detection scans every line for accidentally committed credentials. API keys are passwords that software uses to call external services. A leaked Stripe API key means an attacker can charge your customers. A leaked AWS key means they can spin up servers on your bill. A leaked database password means they can read and delete your data directly. This happens constantly, and most teams only discover it after the damage is done.
Dependency scanning checks every new or updated library added to the project against OSV, a continuously maintained database of known vulnerabilities in open-source code. Using a library with a known vulnerability is one of the most common ways production systems get compromised.
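For reference, OSV exposes a simple query API: one POST per package version to its `/v1/query` endpoint, which returns any known advisories for that exact version. A minimal sketch of building the request body (the network call and scan logic around it are omitted):

```python
import json

def osv_query_payload(ecosystem: str, name: str, version: str) -> str:
    """Build the JSON body for OSV's /v1/query endpoint.

    An empty response from OSV means no known vulnerabilities are
    recorded for this package version.
    """
    return json.dumps({
        "version": version,
        "package": {"ecosystem": ecosystem, "name": name},
    })

# One query per new or updated dependency in the PR:
payload = osv_query_payload("PyPI", "requests", "2.19.0")
print(payload)
# This body is POSTed to https://api.osv.dev/v1/query
```

Because the lookup is keyed on exact ecosystem, name, and version, a PR that bumps a dependency gets checked against exactly the version it introduces, not whatever happened to be installed locally.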
AI coding assistants produce a set of distinctive failure patterns that traditional review tools were never designed to detect, because those tools all predate the era of AI-generated code. Hallucinated import detection cross-references every library and function name in the PR against public package registries, the official directories of available software packages, to verify they actually exist. Code that references a non-existent library will compile and pass every syntax check, then crash the moment it runs, which is the worst possible time to discover the problem.
Placeholder variable detection flags code using meaningless names like "data", "result", "temp", or "handler" that suggest the code was generated to look functional rather than written to solve a specific problem. Style deviation analysis compares the structure and patterns of the new code against the established conventions of the rest of the repository, flagging sections that appear inconsistent with how the project was written by humans over time.
Shipgate integrates as a required status check on GitHub. A required status check is a specific GitHub feature that physically prevents the merge button from activating until a designated check passes. This is not a warning. This is not a comment that can be dismissed. This is the merge button being greyed out and non-functional until the issues are resolved.
Teams configure the thresholds for what triggers a block. A hardcoded secret always blocks, with no exceptions. A Security Score above 60 always blocks. A Critical blast radius combined with any unresolved High-severity finding always blocks. Everything below those thresholds passes with a clear summary for the reviewer to act on using their judgment.
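The example policy above can be sketched as a single decision function. The rules mirror the thresholds named in the text; in practice every one of them is configurable per team:

```python
def merge_verdict(has_hardcoded_secret, security_score, blast_level,
                  has_high_finding, score_threshold=60):
    """Apply the blocking rules described above; returns 'block' or 'pass'."""
    if has_hardcoded_secret:
        return "block"  # a committed secret always blocks, no exceptions
    if security_score > score_threshold:
        return "block"
    if blast_level == "Critical" and has_high_finding:
        return "block"
    return "pass"  # below every threshold: summarise and let the reviewer decide

print(merge_verdict(False, 35, "Medium", False))   # pass
print(merge_verdict(False, 35, "Critical", True))  # block
print(merge_verdict(True, 5, "Low", False))        # block
```

Note the ordering: the secret rule fires first regardless of score, which is what makes it a true non-negotiable rather than one input among many.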
Shipgate builds a running quality profile for every contributor to a repository. It tracks slop rate (the percentage of that contributor's PRs that contain AI-generated code quality issues), security issue introduction rate (how frequently their changes introduce vulnerabilities), rework rate (how often a PR requires multiple rounds of revision before it can merge), and review-to-merge time. This data gives engineering leaders something they have never had before: reliable, systematic evidence for decisions about who to trust with sensitive parts of the codebase and where to invest review attention.
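A minimal sketch of how such a profile could be aggregated from per-PR records. The record fields here are illustrative assumptions, not Shipgate's actual schema:

```python
def contributor_profile(prs):
    """Aggregate per-contributor quality metrics from a list of PR records."""
    total = len(prs)
    return {
        "slop_rate": sum(p["had_slop"] for p in prs) / total,
        "security_issue_rate": sum(p["introduced_vuln"] for p in prs) / total,
        "rework_rate": sum(p["revisions"] > 1 for p in prs) / total,
    }

# Hypothetical history for one contributor:
history = [
    {"had_slop": True,  "introduced_vuln": False, "revisions": 3},
    {"had_slop": False, "introduced_vuln": False, "revisions": 1},
    {"had_slop": False, "introduced_vuln": True,  "revisions": 2},
    {"had_slop": False, "introduced_vuln": False, "revisions": 1},
]
print(contributor_profile(history))
# {'slop_rate': 0.25, 'security_issue_rate': 0.25, 'rework_rate': 0.5}
```

Tracked over time per repository, these rates are what turn "I have a feeling about this contributor" into a trend a team lead can actually act on.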
For enterprise customers, this data can be formatted into exportable security posture reports. These reports are structured specifically to be submitted during vendor procurement reviews, SOC 2 audits (a formal certification process that large companies require before buying software), ISO 27001 assessments (another internationally recognised security certification framework), and PCI-DSS compliance checks (the security standard required for any software that handles credit card payments).
The combination is the product. Any individual capability above exists in some form somewhere in the market. The thing no competitor offers is all of them operating together as a unified risk engine, with AI slop detection as the connective thread that makes all the others more accurate. AI-generated code is harder to review, more likely to contain subtle errors, and more likely to slip past existing security scanners precisely because it was not written with intent. Shipgate is the only platform designed from the ground up for this reality.
The scale and shape of the opportunity we are building into
Shipgate operates at the intersection of two large, fast-growing markets: developer tooling and application security testing. The shift to AI-assisted development is creating a new sub-category within both markets that did not meaningfully exist three years ago. The companies that define that sub-category now will own it for a long time. That is the window Shipgate is building into.
The most important market dynamic is timing. AI coding tools crossed mainstream developer adoption in 2023. The trust gap, the moment where teams realise they are routinely shipping AI-generated code they cannot properly evaluate, is becoming acutely painful in 2025 and 2026. The tools built to address that gap are being defined right now. Shipgate is among the first to build specifically for this problem.
The paying customers are not individual developers. They are the engineering teams, platform teams, and organisations responsible for the quality and security of the codebases those developers contribute to. A single enterprise customer deploying Shipgate across 50 repositories is worth dramatically more than 50 individual subscribers. The business model targets the team and organisation tier from day one.
| Capability | SonarQube / CodeClimate | CodeRabbit | cubic.dev | Shipgate |
|---|---|---|---|---|
| Static security analysis (code scanning) | Yes | Yes, via integrations | Partial | Yes, OWASP-aligned |
| Dependency vulnerability scanning | Yes | Yes | Partial | Yes, via OSV database |
| Secrets and credential detection | Some | Yes | Partial | Yes |
| Blast Radius / system-wide impact analysis | No | No | No | Yes, core feature |
| AI slop and hallucination detection | No | No | No | Yes, core feature |
| Native merge blocking enforcement | Via CI pipeline only | No | No | Yes, native |
| Multi-platform (GitHub, GitLab, Bitbucket) | Yes | Yes | GitHub only | Yes |
| Vibe coding platform integrations (Lovable, Bolt, Cursor) | No | No | No | Yes, purpose-built |
| Enterprise security reports for compliance and procurement | Limited | No | No | Yes, exportable PDF |
On the competition: CodeRabbit raised $88M at a $550M valuation in September 2025 and is growing at 20% month-over-month. This confirms the market exists and that budgets are allocated for this category. It does not mean the category is decided. Independent benchmarks score CodeRabbit 1 out of 5 on review completeness. In January 2025, CodeRabbit's own AI system executed a malicious instruction embedded in a PR comment, exposing over one million repositories to potential attack. cubic.dev is GitHub-only, has three employees, and no independently verified accuracy data. The gap Shipgate targets is real, unoccupied, and validated by the capital already flowing into the space.
The timing argument for building Shipgate today and not two years from now
There have always been code review tools. There have always been security scanners. The question worth asking clearly is why a new entrant focused specifically on AI-generated code risk has a meaningful window right now. The answer is a convergence of three things happening simultaneously: the pain is established and real, the budget is allocated in engineering and security teams, and the standards for what a solution looks like are still being written by the market. When all three are true at the same time, that is the window.
GitHub Copilot launches publicly. ChatGPT arrives in November. For the first time in history, AI-generated code becomes a realistic part of everyday developer workflow at meaningful scale. The tools are exciting. Engineers begin using them for the productivity gains. The specific failure modes that emerge at scale are not yet understood, and nobody is thinking about what happens when AI-generated code starts reaching production in volume.
Cursor, Lovable, Bolt, and Replit launch and gain significant traction. These vibe coding platforms lower the barrier to contributing code so far that non-engineers begin pushing code directly to repositories. AI-assisted development goes from a productivity trick used by experienced engineers to the default workflow for developers at every skill level. The proportion of AI-generated code in production codebases grows rapidly, and most engineering teams have no reliable way to measure it or review it differently.
Engineering teams begin experiencing the failure modes at scale. AI-generated code introduces subtle bugs that pass code review. Security incidents linked to AI-generated vulnerabilities start appearing in the industry. In January 2025, CodeRabbit, the largest AI code review platform, is found to have a critical vulnerability: its own AI executed a malicious instruction embedded in a PR review comment, exposing over one million connected repositories. CodeRabbit raises $88M at a $550M valuation in September 2025. The market is actively looking for a more serious solution and capital is available to buy it.
This is the window. The pain is established. Engineering budgets for developer tooling and security tooling are approved. The buying criteria for what "AI-aware code review" actually means are still being written by customers making their first purchasing decisions. The companies that build the right solution during this window will set the standard that everyone else has to compete against for the next five years. First-mover advantage in a category that is still forming is one of the most durable positions in enterprise software.
Lovable, Bolt, Cursor, and Replit will account for a large and growing share of new code contributions to both private and open source repositories within 18 months. Any review platform that lacks specific detection capabilities for what these tools produce will be operating blind on its fastest-growing input stream. Shipgate's integrations with vibe coding platforms, built now while the market is forming, will become table stakes that competitors will scramble to build from a trailing position.
The people who buy Shipgate and the problems they are trying to solve
Shipgate is bought by engineering teams and organisations, not individual developers. The person who installs it is rarely the person writing the code. They are the person responsible for what happens when that code reaches production. This distinction is fundamental to how we design the product, how we write about it, and which features we build first.
Every feature decision at Shipgate should be evaluated through the lens of the person paying the invoice. That person owns a codebase they need to protect. They manage a team they need to keep safe. They have a security posture to maintain in front of customers and auditors. They are not looking for a smarter autocomplete tool. They are looking for a system that catches what humans miss, at scale, without requiring their team to slow down.
Responsible for everything the team ships. Developers are using AI tools and moving fast. Manual reviews are getting harder to do thoroughly because the volume is high and AI-generated code is structurally harder to evaluate than code a human wrote with specific intent. They need a system they can trust to catch what the team misses, and they need clear visibility into where risk is accumulating across their repositories before something goes wrong.
Maintains a public repository with dozens or hundreds of external contributors they do not know personally. Receiving more PRs than ever before, many generated by AI tools with widely varying quality and intent. No budget to hire dedicated reviewers. Needs automation that is smart enough to separate the small number of PRs requiring genuine attention from the large number that are safe to approve quickly, and to block the ones that are clearly not safe to merge at all.
Responsible for the security posture of engineering output across the entire organisation. Currently running security scans after deployment, which means finding vulnerabilities in code that is already live on production servers being accessed by real users. Looking for a way to shift security left, which is the industry term for moving security checks earlier in the development process, to the PR review stage rather than after code has shipped.
Running a small team using every AI tool available to ship features at maximum speed. Understands the team is accumulating technical debt and security risk but cannot afford to slow down for thorough manual review on every change. Needs a safety net that reliably catches the genuinely dangerous issues without adding friction to the parts of the development process that are working well and moving fast.
Who Shipgate is not for: Individual developers looking for a personal productivity tool or AI autocomplete. Teams whose primary pain is formatting inconsistency or style enforcement rather than security and risk. There are many tools for those use cases. Shipgate exists to protect codebases at the organisational level, for the people who own and are responsible for what ships, not the people who submit the changes.
The values that shape every decision, feature, and tradeoff
These are not aspirational values on a poster. They are active constraints. When a feature decision is unclear or two priorities conflict, these principles are what resolve the disagreement. Every person on the Shipgate team should be able to cite them and apply them to their own work.
If PR analysis takes longer than 10 seconds on changed files, developers route around it. Speed is not a feature. It is the precondition for everything else we do. We run analysis only on changed files. We parallelise every check that can be parallelised. We cache the repository index aggressively. We do not trade speed against thoroughness; the architecture is designed to deliver both.
A tool that raises too many false alarms gets turned off quickly. Every finding Shipgate surfaces must be a genuine, confirmed issue with a clear explanation of why it matters and a suggested path to fixing it. A false positive rate above 10 percent is a product failure, not an acceptable tradeoff. We would rather surface 5 real issues than 50 uncertain ones.
Every finding Shipgate raises includes what the problem is, why it matters in plain language a non-security expert can act on, and exactly how to fix it with a code suggestion attached. Surfacing a problem without a path to resolution is not a helpful feature. It is a source of frustration that erodes trust and leads reviewers to dismiss findings. We do not ship findings without fixes.
A security check that can be ignored when a sprint deadline arrives is not a security check. It is advice. Shipgate integrates as a required status check and blocks the merge button. This is intentional and non-negotiable. Organisations pay Shipgate for enforcement. The value disappears the moment it becomes optional.
Every other product in this space was designed before AI-generated code was a meaningful proportion of what ships. We are not retrofitting slop detection onto a legacy architecture. The reality that a significant portion of submitted code was generated by an AI tool is the starting assumption of every design decision we make. This is not a feature we added. It is the lens through which the entire product was conceived.
We build for the people who own and protect codebases, not the people who contribute to them. Features that reduce maintainer burden ship first. Features that only benefit contributors ship when they also make maintainers' lives easier. When any decision is unclear, ask the question: does this make the person responsible for the codebase more effective at their job?
A note for every new team member: This handbook will evolve as the product evolves and the market evolves. What will not change is the foundational insight that created Shipgate: AI coding tools are making it far easier to push code that looks correct but is not, and nobody has built a serious, purpose-built system for catching that at the pull request layer before it reaches production. Everything Shipgate does flows from solving that problem completely, for the people whose job it is to ensure what ships is safe. Welcome to the team.