DermDx — software-only AI skin-cancer diagnosis, FDA-pending
Built a software-only AI for melanoma diagnosis exceeding the best-dermatologist baseline, then took it through FDA submission — the first of its kind. Paired with MoleSafe, the imaging-workflow redesign that took total-body photography from 2 hours to 20 minutes and surfaced DermDx’s 7-second per-lesion diagnosis at clinic scale across 3.5M patients in 40+ countries.
- Sensitivity: 97%
- Specificity: 67%
- Patients on platform: 3.5M
- Imaging workflow: 2hr → 20min
Problem
There was no FDA-cleared software-only AI for melanoma diagnosis. Existing approved devices (e.g. DermaSensor) were hardware-based and traded away specificity to chase sensitivity: 97% sensitivity but only 25% specificity, which in practice means the device flags roughly three out of every four benign lesions as suspicious. Useful as a never-miss screen; useless as a triage tool. Dermatologists were the gold standard at roughly 95% sensitivity / 60% specificity, but they don’t scale to 3.5M patients across 40+ countries, which is the platform we already operated.
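To make the specificity gap concrete, the sketch below converts each specificity figure quoted above into false alarms per 1,000 benign lesions. The function name and the per-1,000 framing are illustrative choices, not part of the product.

```python
def false_positives_per_1000(specificity: float) -> int:
    """Benign lesions flagged as suspicious per 1,000 benign lesions examined."""
    return round((1.0 - specificity) * 1000)


# Specificity values taken from the figures quoted in the text.
operating_points = {
    "hardware predicate (97% / 25%)": 0.25,
    "dermatologist baseline (95% / 60%)": 0.60,
    "DermDx (97% / 67%)": 0.67,
}

for name, specificity in operating_points.items():
    count = false_positives_per_1000(specificity)
    print(f"{name}: {count} false alarms per 1,000 benign lesions")
```

Read this way, the difference between 25% and 67% specificity is the difference between a queue dominated by false alarms and one a specialist can actually work through.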
The strategic question wasn’t "can we build a better model?" It was twofold: can we run a multi-year regulatory and clinical-data strategy in parallel with shipping a global imaging platform without slowing either down, and can we land the result inside complex multi-actor clinical workflows (above) without forcing the hospital to redo its own plumbing?
Approach
I designed the work as two parallel tracks: a clinical + regulatory submission for DermDx, and continuous improvements to the underlying DermEngine imaging platform. Both tracks shared the same data, the same labeling pipeline, and the same engineering team. The model itself is an ensemble of two CNNs and a ViT, with the transformer catching cases the CNNs miss; inference cost is roughly $0.60 per case at 97% sensitivity / 67% specificity. DermaSensor, as an already-approved device, became the predicate for our 510(k): equal-or-better performance against an approved device puts the FDA timeline at 90 days, not three years.
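The text does not say how the three models' outputs are combined, so the following is only a minimal sketch assuming plain soft voting (averaging the per-model malignancy probabilities) and a hypothetical 0.5 decision threshold. It illustrates how a confident transformer score can pull a case over the line when both CNNs sit below it.

```python
from statistics import fmean


def ensemble_probability(cnn_a: float, cnn_b: float, vit: float) -> float:
    """Soft voting: average the three per-model malignancy probabilities.

    Assumption: equal weights. The real combination rule is not documented here.
    """
    return fmean([cnn_a, cnn_b, vit])


def flag_lesion(cnn_a: float, cnn_b: float, vit: float, threshold: float = 0.5) -> bool:
    """Flag the lesion when the averaged probability reaches the threshold.

    Assumption: the 0.5 threshold is a placeholder, not the deployed value.
    """
    return ensemble_probability(cnn_a, cnn_b, vit) >= threshold


# Both CNNs are below the threshold; the ViT is confident enough to flag the case.
print(flag_lesion(cnn_a=0.35, cnn_b=0.40, vit=0.90))
```

In practice the threshold would be tuned on a validation cohort to hit the target sensitivity, and the weights need not be equal.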
- Anchor the clinical bar above dermatologists.
Set the model target above the best-dermatologist baseline (95% sensitivity / 60% specificity). Anything less wasn’t worth the regulatory cost. The FDA submission is a multi-year commitment — only worth making if the model genuinely beats the human gold standard.
- Use the global platform as the data flywheel.
DermEngine already served 5,000 providers in 40+ countries: 17M images, +60% YoY image growth. Dataset size wasn’t the bottleneck; clinical-grade labeling was. WHO-Africa work and MoleSafe US data filled the gaps in skin types V/VI and US-resident patients that would otherwise have drawn a rejection in FDA review.
- Field work first, in Australia.
Travelled to Australian clinics — the world’s highest melanoma incidence per capita — to map the existing imaging workflow before proposing architectural changes. The redesign came out of a stopwatch and a notepad, not a PowerPoint.
- Run regulatory and product in lockstep.
FDA submission docs, clinical validation studies, and product roadmap shared the same source of truth so a model improvement could be reflected in submission packets within days — not quarters.
- Ship the platform improvements anyway.
While the FDA path ran, the platform more than doubled imaging throughput (patient imaging time fell from 2 hours to 20 minutes), which fed back into a better dataset for the AI work. Regulatory wasn’t allowed to block product velocity.
In regulated AI, owning the data and the workflow is a more durable moat than owning the model. The model gets caught up by the open-source frontier within a year. The data pipeline and the clinical relationships don’t.
What shipped
- DermDx model — 97% / 67%.
97% sensitivity / 67% specificity on the validation cohort — exceeds the best-dermatologist baseline on both axes (95% / 60%).
- FDA 510(k) submission.
Complete and pending. First software-only AI for skin-cancer diagnosis to be submitted; DermaSensor used as predicate device.
- Imaging workflow — 2hr → 20min.
7-second per-lesion diagnosis loop. Capacity unlock made the rollout pace possible.
- Platform-wide impact.
+40% DAU, +30% consults YoY, NPS rose to 70+ (from 50+ baseline) on the underlying DermEngine product.
- National integration playbook.
The IslaCare UK plan above became the template for landing DermDx + DermEngine inside national-scale clinical networks without the hospital re-doing its own imaging plumbing.
- Recognition.
1st place in a dermatology-focused hackathon during the build — concrete external validation while the FDA path was still pending.
Sub-workflows in production
The OSF integration above didn’t ship as one monolithic flow; it was composed of separate, individually deployable workflows that each landed against the same platform. Three are worth featuring:
AI Assist triage. Patient submits images, the platform runs a real-time quality check on capture, then DermDx categorizes the case — dermoscopic-and-cancer cases route to the Skin Cancer Specialist, clinical-and-general-dermatology cases route to the General Specialist, clinical-and-cancer cases close out automatically. The dermatologist’s queue is therefore pre-prioritized rather than first-in-first-out — the AI’s value lands in routing, before review even starts.
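The routing rule in AI Assist triage can be sketched as a lookup table. The enum and function names are hypothetical, and two behaviors are assumptions the text does not specify: a failed quality check asks for a recapture, and any pairing not listed (for example dermoscopic plus general dermatology) falls back to manual review.

```python
from enum import Enum


class ImageType(Enum):
    DERMOSCOPIC = "dermoscopic"
    CLINICAL = "clinical"


class Category(Enum):
    CANCER = "cancer"
    GENERAL_DERMATOLOGY = "general dermatology"


class Route(Enum):
    SKIN_CANCER_SPECIALIST = "skin cancer specialist queue"
    GENERAL_SPECIALIST = "general specialist queue"
    AUTO_CLOSE = "closed automatically"
    RECAPTURE = "ask patient to recapture"  # assumption: not stated in the text
    MANUAL_REVIEW = "manual review"         # assumption: fallback for unlisted pairs


# The three pairings named in the triage description.
ROUTING_TABLE = {
    (ImageType.DERMOSCOPIC, Category.CANCER): Route.SKIN_CANCER_SPECIALIST,
    (ImageType.CLINICAL, Category.GENERAL_DERMATOLOGY): Route.GENERAL_SPECIALIST,
    (ImageType.CLINICAL, Category.CANCER): Route.AUTO_CLOSE,
}


def route_case(image_type: ImageType, category: Category, quality_ok: bool = True) -> Route:
    """Pick a destination for a submitted case before any human review."""
    if not quality_ok:
        return Route.RECAPTURE
    return ROUTING_TABLE.get((image_type, category), Route.MANUAL_REVIEW)


print(route_case(ImageType.DERMOSCOPIC, Category.CANCER).value)
```

Keeping the rule in a table rather than nested conditionals means a new pairing is a one-line change that clinical staff can review.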
Patient ↔ specialist chat. Patient questions land in a nurse queue. Nurse triages: if the answer is clear they respond directly; if not, they tag a specialist in Spot Notes and either request a follow-up image through the SkinApp or escalate to a dermatologist. The chat surface uses the same case object as the diagnostic flow, so a question and an image submission are never separate threads — they share the patient timeline.
Payment + insurance verification. Cash patients go through Stripe in-app and land in the dermatologist queue immediately. Insurance patients upload the front and back of their insurance card with the imaging submission; the case parks in a pending state until an administrator verifies the card + ID, then it’s released. Rejected cases trigger an in-app notification with options. The financial layer is a workflow concern, not a UI concern — bolted into the same case-object pipeline.
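The payment gate described above behaves like a small state machine on the case object. The sketch below uses hypothetical state and function names and makes no real Stripe calls; the requirement that a cash payment be complete before queueing is an assumption.

```python
from enum import Enum


class CaseState(Enum):
    PENDING_VERIFICATION = "pending verification"
    IN_DERMATOLOGIST_QUEUE = "in dermatologist queue"
    REJECTED = "rejected, patient notified"


def state_after_intake(payment_method: str, stripe_paid: bool = False) -> CaseState:
    """Initial state of a case once images and payment details are submitted."""
    if payment_method == "cash":
        # Assumption: the in-app payment must complete before the case is queued.
        if not stripe_paid:
            raise ValueError("cash cases need a completed payment first")
        return CaseState.IN_DERMATOLOGIST_QUEUE
    if payment_method == "insurance":
        # Card front and back arrive with the images; an administrator verifies later.
        return CaseState.PENDING_VERIFICATION
    raise ValueError(f"unknown payment method: {payment_method!r}")


def state_after_admin_review(state: CaseState, card_and_id_verified: bool) -> CaseState:
    """Release or reject a parked insurance case after the administrator's check."""
    if state is not CaseState.PENDING_VERIFICATION:
        raise ValueError("only pending cases can be reviewed")
    if card_and_id_verified:
        return CaseState.IN_DERMATOLOGIST_QUEUE
    return CaseState.REJECTED


parked = state_after_intake("insurance")
print(state_after_admin_review(parked, card_and_id_verified=True).value)
```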
Workflow deep-dive: MoleSafe
DermDx’s data flywheel didn’t turn until the imaging workflow on top of it actually scaled. MoleSafe was the program that fixed that: a workflow rebuild for the largest US dermatology network (seven clinics across multiple states), which had deferred switching for years and stayed on a DSLR-camera predecessor (MoleMap) that charged $800 per 2-hour session and still missed lesions. Their objection to switching was specifically image quality. We benchmarked the predecessor by stopwatch in an Australian sister clinic, reproduced it in-office with a mannequin, and rebuilt the workflow with iPhone 15 Pro Max capture (48MP, prime lens, f/1.78 aperture), 3D-printed magnetic attachments, and a cross-polarized LED rig.
- For nurses — AI capture loop.
Image-quality check, lesion detection, lesion matching, dermoscopic-ring detection — all wired into the capture moment. Quality is enforced at the point of capture, not in QA later.
- SmartSnap auto-detection.
The operator doesn’t aim — they capture. The AI identifies and frames the lesion automatically; for follow-up visits, the model already knows the lesion and attaches the image to the right record.
- Follow Me cross-device sync.
Mobile capture syncs to web and Apple TV in real time. Dermatologist review happens on a different device while imaging continues — no waiting, no batching, no synchronous handoff.
- For dermatologists — diagnostic compression.
Image grouping per lesion (overview + magnified dermoscopic), AI overlays for new/missed/evolving lesions via change-detection, keyboard shortcuts to submit. Per-lesion decision loop drops to 7 seconds.
- Telemedicine dashboard, two time zones.
Reviewers in the Rocky Mountains cover a dozen clinics in real time without ever seeing a patient — the dashboard surfaces full-body images with all follow-up overlays and AI-assisted navigation.
Outcome on the program: 2 hours → 20 minutes per patient, 7 seconds per-lesion diagnosis, 3M+ images migrated through the cutover, clinic rollout 1 → 5 per month, pricing tiers restructured to $800 / $500 / $250 as time savings unlocked smaller-clinic accessibility. The 5 AI models running across the workflow — lesion detection, lesion matching, image-quality check, dermoscopic-ring detection, flicker for change-detection — produced the same labeled-data flywheel that fed DermDx’s FDA submission training set, on the same platform, with the same engineering team.
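As a quick check on what the 2-hour to 20-minute change means for capacity, the sketch below computes the time reduction and the number of imaging sessions that fit in a clinic day. The 8-hour day and back-to-back scheduling are assumptions for illustration only.

```python
def time_reduction(before_min: float, after_min: float) -> float:
    """Fraction of session time removed."""
    return (before_min - after_min) / before_min


def sessions_per_day(session_min: int, clinic_day_min: int = 480) -> int:
    """Whole sessions that fit in one imaging room per day (assumed 8-hour day)."""
    return clinic_day_min // session_min


print(f"time reduction: {time_reduction(120, 20):.1%}")
print(f"sessions per day: {sessions_per_day(120)} before, {sessions_per_day(20)} after")
```

Under those assumptions one imaging room goes from 4 sessions a day to 24, which is the capacity headroom behind the faster clinic rollout.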
Customer obsession is field work. You can’t redesign a clinical workflow from a Figma file — you redesign it with a stopwatch, a mannequin, and a willingness to feel every wasted second yourself.
The MoleSafe program is the operational half of the DermDx story: the AI model needed a workflow that could surface it at clinic scale, and the workflow needed an AI model worth surfacing. Building both in parallel — same data, same labeling pipeline, same engineering team — is what made the FDA submission a consequence of platform progress instead of a competing priority.
Outcome
Clinical bar
Beats dermatologists
97% sensitivity / 67% specificity vs the 95% / 60% best-dermatologist baseline. Concrete numbers for peer-reviewed publication and FDA reviewers.
Regulatory
First software-only submission
First software-only AI for skin-cancer diagnosis to enter the FDA process, with DermaSensor as predicate. A category position dermatologists, payers, and platform operators were actively looking for.
Platform growth
+40% DAU · NPS 70+
The underlying DermEngine platform saw +40% DAU and +30% consults YoY, and NPS rose to 70+ (from a 50+ baseline) across the multi-year work. The FDA narrative was an unlock, not a distraction.
Throughput
2hr → 20min imaging
Patient imaging time collapsed by more than 80% (120 → 20 minutes). 7-second per-lesion diagnosis. Clinic capacity unlock funded the next round of platform investment.
What I’d do differently
- Invest in synthetic data earlier. We were conservative with synthetic augmentation; I now think we left specificity points on the table.
- Engage FDA pre-sub conversations sooner. A 30-minute pre-submission meeting at month two would have saved a quarter of rework later.
- Treat the labeling vendor as a product, not a vendor. Annotator throughput, agreement, and bias were tracked as ops metrics — they should have been tracked as product metrics with a roadmap.
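One way to track annotator agreement as a product metric is a chance-corrected score per annotator pair. The text does not say which agreement measure was used; the sketch below assumes Cohen’s kappa on two annotators’ labels for the same lesions, with made-up example labels.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random with their own label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: "m" = melanoma, "b" = benign.
annotator_1 = ["m", "m", "b", "b"]
annotator_2 = ["m", "b", "b", "b"]
print(cohens_kappa(annotator_1, annotator_2))
```

Tracked per annotator pair and per skin type over time, a score like this gives the labeling pipeline a trend line that can sit on a roadmap next to model metrics.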