How diff algorithms work
Every diff view on this page is produced by the Myers algorithm, a technique published by Eugene W. Myers in 1986 that finds the shortest edit script between two token sequences in O((N+M)D) time, where N and M are the lengths of the two sequences and D is the number of insertions and deletions in that script. Finding the shortest edit script is equivalent to finding the Longest Common Subsequence (LCS) of the two inputs. The engine runs entirely in your browser using the open-source jsdiff library.
- Tokenise the inputs — Before comparing, the algorithm splits each input into a sequence of tokens. Line granularity splits on newlines; word granularity splits on whitespace and punctuation boundaries; character granularity treats every Unicode code point as its own token.
- Build the edit graph — The Myers algorithm models the comparison as a path through a 2D grid where moving right means “delete from original”, moving down means “insert from changed”, and moving diagonally means “token matches in both”. Diagonal moves cost nothing, so the algorithm looks for the path from the top-left corner to the bottom-right corner that uses the fewest right and down moves.
- Extract the LCS — The diagonal moves in the shortest path trace the Longest Common Subsequence — the tokens that appear in both inputs in the same relative order. Every token in the LCS is “unchanged”; everything else is either an addition or a deletion.
- Apply preprocessing options — If you enable “Ignore case”, both inputs are lowercased before the LCS pass so “HELLO” and “hello” count as identical. “Ignore whitespace” collapses multiple spaces into one. “Trim each line” strips leading and trailing whitespace per line before comparison.
- Render the selected view — The output is the same LCS result displayed three ways: Side-by-side shows original on the left and changed on the right in a two-column grid with red and green row highlights. Unified shows a single column with − and + prefix lines, like the output of git diff. Inline shows deletions as red strikethrough and additions as green underline within the same text flow.
- Compute the summary strip — After rendering, the tool counts how many tokens were added, removed, and unchanged, then calculates similarity as the ratio of unchanged tokens to the larger of the two input lengths. A similarity of 100% means the inputs are identical after preprocessing.
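The tokenise, edit-graph, and LCS steps above can be sketched in a few dozen lines. This is an illustrative Python version of the greedy Myers search, not the jsdiff source: the function names (`tokenise`, `myers_diff`, `_backtrack`) are hypothetical, and the word-splitting regular expression is an assumption about where word boundaries fall.

```python
import re


def tokenise(text: str, granularity: str = "line") -> list[str]:
    """Split text into tokens. The word rule is an assumed approximation."""
    if granularity == "line":
        return text.split("\n")
    if granularity == "word":
        # Runs of word characters, runs of whitespace, or one punctuation mark.
        return re.findall(r"\w+|\s+|[^\w\s]", text)
    return list(text)  # a Python str iterates by Unicode code point


def myers_diff(a: list[str], b: list[str]) -> list[tuple[str, str]]:
    """Return a shortest edit script as (op, token) pairs."""
    n, m = len(a), len(b)
    v = {1: 0}   # v[k] = furthest x reached so far on diagonal k = x - y
    trace = []   # snapshot of v taken before each round d
    for d in range(n + m + 1):
        trace.append(dict(v))
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # move down: insert from changed
            else:
                x = v[k - 1] + 1      # move right: delete from original
            y = x - k
            while x < n and y < m and a[x] == b[y]:
                x, y = x + 1, y + 1   # free diagonal: token matches in both
            v[k] = x
            if x >= n and y >= m:
                return _backtrack(a, b, trace)
    raise AssertionError("unreachable: d = n + m always suffices")


def _backtrack(a, b, trace):
    """Walk the snapshots backwards to recover the path, then reverse it."""
    x, y = len(a), len(b)
    script = []
    for d in range(len(trace) - 1, -1, -1):
        v, k = trace[d], x - y
        if k == -d or (k != d and v[k - 1] < v[k + 1]):
            prev_k = k + 1
        else:
            prev_k = k - 1
        prev_x = v[prev_k]
        prev_y = prev_x - prev_k
        while x > prev_x and y > prev_y:
            x, y = x - 1, y - 1
            script.append(("equal", a[x]))      # diagonal step: part of the LCS
        if d > 0:
            if x == prev_x:
                script.append(("insert", b[prev_y]))
            else:
                script.append(("delete", a[prev_x]))
        x, y = prev_x, prev_y
    script.reverse()
    return script


script = myers_diff(tokenise("a\nb\nc"), tokenise("a\nc"))
lcs = [token for op, token in script if op == "equal"]
```

The tokens tagged `equal` form the LCS, and the number of `insert` plus `delete` entries is the edit distance D. Storing a snapshot per round keeps the sketch short at the cost of memory; production implementations usually use a more compact representation.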
Why use a diff checker
- Code review without a Git client — Paste two versions of a config file, a SQL migration, or a shell script and see what changed without cloning a repo, switching branches, or waiting on a CI pipeline. The tool is handy for quick reviews during pair programming, for contractor handoffs where the other side hasn’t shared their Git history, and for legacy codebases that pre-date version control entirely. The unified view produces output you can copy straight into a chat thread or a ticket.
- Contract and document redlines — Word-level diff shows which terms shifted between contract drafts faster than Word’s Track Changes panel. Paste clause A from the first draft and clause B from the executed copy and the substitution lights up red-on-green at the exact phrase that moved. Paralegals and procurement teams use this to verify last-minute redlines didn’t sneak past review before a contract gets signed.
- Essay and draft revisions — Writers comparing a first draft against an edited version can flip to word granularity to see every substitution, insertion, and cut without rereading both copies. The same workflow works for translators auditing changes against the source text, for editors checking that a copy edit preserved the author’s voice, and for journalism teams reconciling a published article against the filed draft.
- Log and config comparison — Sysadmins comparing two server config snapshots, two cron schedules, or two ps aux outputs can use line granularity to locate the single changed parameter in a 200-line file in seconds. Pair it with the Ignore-whitespace option and a noisy alignment-only diff collapses to the parameter changes that actually matter.
Common applications
Text diff shows up at the end of most edit cycles in writing, development, and operations work.
- Pull request review: paste two function implementations side by side to understand the logic change before approving, without the overhead of checking out the branch.
- Internationalisation QA: compare an English source string against its translated equivalent at word level to detect insertions, omissions, or terminology swaps the translator may have introduced.
- Incident analysis: diff two Kubernetes manifest snapshots or two “docker inspect” outputs at line level to isolate the configuration change that preceded an outage.
A worked example
Take a five-line server config. Original: host=localhost, port=5432, dbname=app_db, user=app, password=secret. Changed: host=db.prod.example.com, port=5432, dbname=app_db, user=app_prod, password=secret. With line granularity and Side-by-side view, line 1 shows red on the left (host=localhost) and green on the right (host=db.prod.example.com), line 4 shows red (user=app) and green (user=app_prod), and lines 2, 3, and 5 stay unchanged on both sides. The summary strip reports 2 additions, 2 deletions, 3 unchanged, and a similarity of 60%, because three of the five lines are kept. Switch to word granularity and the diff tightens: only the values to the right of = on lines 1 and 4 light up, the keys stay unchanged, and the similarity climbs to about 85% because the LCS now counts host, user, and the = signs as kept.
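The line-level numbers in this example can be reproduced with a short script. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in for the page's engine; it is a different matching algorithm from Myers, but on this input it yields the same counts.

```python
from difflib import SequenceMatcher

original = [
    "host=localhost",
    "port=5432",
    "dbname=app_db",
    "user=app",
    "password=secret",
]
changed = [
    "host=db.prod.example.com",
    "port=5432",
    "dbname=app_db",
    "user=app_prod",
    "password=secret",
]

added = removed = unchanged = 0
matcher = SequenceMatcher(None, original, changed, autojunk=False)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag == "equal":
        unchanged += i2 - i1
    else:
        # "replace", "delete" and "insert" all reduce to these two counts.
        removed += i2 - i1
        added += j2 - j1

similarity = unchanged / max(len(original), len(changed))
print(added, removed, unchanged, f"{similarity:.0%}")
```

Each changed line counts once as a deletion and once as an addition, which is why two edited lines produce 2 additions and 2 deletions.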
Does this run in my browser?
Yes. The entire diff computation runs client-side using the open-source jsdiff library loaded with the page. Nothing you type, paste, or compare is sent to any server. You can verify it yourself: open browser DevTools, switch to the Network tab, clear the log, click Compare, and confirm that zero network requests fire for the comparison step.
What does the similarity percentage mean?
Similarity is calculated as unchanged tokens / max(total tokens in original, total tokens in changed). A score of 100% means the two inputs are identical after applying your preprocessing options (case folding, whitespace collapsing, line trimming). A score of 0% means no token is shared between the inputs. The metric is a rough gauge of overlap that falls as the edit distance grows. It is useful for a quick read, not as a plagiarism or originality score.
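A minimal sketch of how preprocessing feeds into the score, assuming line-level tokens. The helper names (`preprocess`, `lcs_length`, `similarity`) are hypothetical, the unchanged count comes from a textbook dynamic-programming LCS rather than Myers (both give the same LCS length), and returning 100% for two empty inputs is an assumption.

```python
import re


def preprocess(text, ignore_case=False, ignore_whitespace=False, trim_lines=False):
    """Apply the options per line and return the list of line tokens."""
    lines = text.split("\n")
    if trim_lines:
        lines = [line.strip() for line in lines]
    if ignore_whitespace:
        lines = [re.sub(r"[ \t]+", " ", line) for line in lines]
    if ignore_case:
        lines = [line.lower() for line in lines]
    return lines


def lcs_length(a, b):
    """Length of the longest common subsequence, one DP row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Unchanged tokens divided by the larger token count."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty inputs count as identical
    return lcs_length(a, b) / longest


raw_a = "HELLO   world\nport=5432"
raw_b = "hello world\nport=5432"

strict = similarity(preprocess(raw_a), preprocess(raw_b))
relaxed = similarity(
    preprocess(raw_a, ignore_case=True, ignore_whitespace=True),
    preprocess(raw_b, ignore_case=True, ignore_whitespace=True),
)
```

Here `strict` is 0.5 because only the port line matches, while `relaxed` is 1.0 because case folding and whitespace collapsing make the first lines identical.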
Can I diff JSON / YAML / XML semantically?
Not in this tool. This is a text-level diff, so whitespace-only reformatting of JSON or XML still shows many changes even when the data is logically identical. Reordering object keys in JSON also shows as changes even though most parsers treat key order as insignificant. For a true semantic diff that compares parsed object trees and ignores key order and formatting, we’re planning a dedicated JSON Diff tool. For now, normalise both inputs to the same indentation and key order before pasting them here.
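One way to do that normalisation for JSON is to parse each input and re-serialise it with sorted keys and fixed indentation. The sketch below uses Python's standard `json` module; the function name `normalise_json` is hypothetical, and the two-space indent is an arbitrary choice.

```python
import json


def normalise_json(text: str) -> str:
    """Re-serialise JSON with sorted keys and a fixed two-space indent."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True, ensure_ascii=False)


compact = '{"b": 1, "a": [1, 2]}'
pretty = '{\n    "a": [1, 2],\n    "b": 1\n}'

# Both inputs normalise to the same text, so a text diff reports no changes.
same = normalise_json(compact) == normalise_json(pretty)
```

Array order is preserved, so reordered array elements still show up as changes after normalising.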
How do unified vs side-by-side views differ?
Side-by-side renders two columns: the original on the left and the changed version on the right, with removed lines highlighted red on the left and added lines highlighted green on the right. Unchanged lines appear in both columns aligned at the same row. Unified renders a single column with a − prefix and red background for removed lines and a + prefix and green background for added lines — the same layout git diff prints to your terminal. Use unified when you want to copy the result as a patch file or paste it into a code review thread. Use side-by-side when the visual alignment of what replaced what matters more than the raw patch text.
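For a sense of what the unified layout looks like as text, here is a small sketch using Python's standard `difflib.unified_diff` on a five-line config. It illustrates the format only and is not the tool's own renderer; the file labels `original` and `changed` are placeholders.

```python
from difflib import unified_diff

original = [
    "host=localhost",
    "port=5432",
    "dbname=app_db",
    "user=app",
    "password=secret",
]
changed = [
    "host=db.prod.example.com",
    "port=5432",
    "dbname=app_db",
    "user=app_prod",
    "password=secret",
]

# lineterm="" because the input lines carry no trailing newlines.
patch = list(
    unified_diff(original, changed, fromfile="original", tofile="changed", lineterm="")
)
print("\n".join(patch))
```

The output starts with two file header lines and an `@@` hunk header, followed by the body in which removed lines begin with `-`, added lines with `+`, and unchanged context lines with a space.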
Paste the original on the left, the changed version on the right, pick a view and a granularity, and the comparison appears in milliseconds. Switch on Live mode and the diff reruns on every keystroke as you edit either side. Download the result as a standard unified .patch file that git apply consumes directly. No upload, no account, no vendor API key, no quota.