
PrivDNA: Biological Data Sovereignty Through Air-Gapped Whole Genome Sequencing

Technical Whitepaper


Prepared by: PrivDNA
Domain: privdna.com
Location: New York City, New York
Date: March 2026


"Your genome is the most personal data you will ever generate. It cannot be changed, cannot be revoked, and cannot be anonymized. It deserves infrastructure built from first principles around that reality."


TABLE OF CONTENTS

  1. Executive Summary
  2. The Problem: Genetic Data Exploitation
  3. The Solution: PrivDNA
  4. Market Analysis
  5. Technical Architecture
  6. Facility Design and Customer Experience
  7. Regulatory and Compliance Framework
  8. Operational Playbook
  9. Referral Partnership Model
  10. Appendix A: Glossary
  11. Appendix B: Regulatory Timeline
  12. Appendix C: References and Sources

I. EXECUTIVE SUMMARY

Status: Pre-launch (Phase 0). PrivDNA is in the design and capital-raise phase. No laboratory is operational, no certifications have been issued, and no customers have been sequenced. This whitepaper describes the planned architecture, business model, and regulatory path. All operational language below should be read as forward-looking.

PrivDNA is a privacy-sovereign whole genome sequencing (WGS) service in development, planned to operate from a physical storefront in New York City. Once operational, PrivDNA will sequence customers' complete genomes at clinical-grade accuracy (≥90% of bases above Q30), process all data on air-gapped servers that never touch the internet, deliver results on FIPS 140-3 certified encrypted hardware, and destroy all on-premise copies under NIST SP 800-88 Rev. 2 standards. Customers will visit the glass-walled laboratory twice -- once to hand over their sample, and again to receive their encrypted drive and witness the on-premise data being destroyed.

The core promise: an unbroken chain of custody from your sample to your hands. Your genome will be collected in person, processed on air-gapped hardware, handed to you on an encrypted drive, and destroyed in your presence. No copies. No cloud. No exceptions.

Why Now

The consumer genomics industry is in crisis. On March 23, 2025, 23andMe filed for Chapter 11 bankruptcy; on July 14, 2025, its approximately 15 million customers' genetic data was acquired by TTAM Research Institute for $305 million through a legal structure that bypassed re-consent requirements. A Nebula Genomics class-action lawsuit alleges the now-defunct "privacy-first" company shared genetic data with Meta, Google, and Microsoft via embedded tracking tools (ProPhase Labs, Nebula's parent, filed Chapter 11 for its lab subsidiaries in September 2025). Consumer trust in genetic testing services has collapsed at the exact moment when whole genome sequencing costs have fallen below $250 per genome at the laboratory level.

This confluence creates a market opening for a fundamentally different model -- one built on physical transparency, cryptographic verifiability, and zero retention of genomic data (operational records retained per CLIA 42 CFR 493.1105).

The Business


II. THE PROBLEM: GENETIC DATA EXPLOITATION

2.1 The Unique Nature of Genomic Data

Genomic data occupies a singular position in the hierarchy of personal information. Unlike a password, it cannot be changed. Unlike a Social Security number, it cannot be reissued. Unlike financial records, it does not expire or become irrelevant with time. A genome sequenced today will be re-analyzable with increasing precision for the lifetime of the individual and, by extension, their biological relatives.

This permanence creates an asymmetric risk profile that existing data protection frameworks were not designed to address. A breach of genomic data is irrevocable -- there is no "credit monitoring" equivalent for DNA.

2.2 The 23andMe Collapse: A Case Study in Systemic Failure

The trajectory of 23andMe illustrates the structural vulnerability of the centralized genomics model:

Timeline of Collapse:

The collapse from a peak implied market capitalization of $6 billion to a $305 million asset sale is significant not only as a financial event but as a structural demonstration: when a company holds centralized genetic data, that data becomes an asset in liquidation proceedings, subject to sale without meaningful individual consent.

Multiple state attorneys general urged consumers to delete their data before the sale closed. The incident prompted a 2025 article in Science examining the systemic fragility of consumer genetic privacy.

2.3 The Broader Pattern of Data Monetization

23andMe was not an anomaly. The centralized genetics business model is built on data monetization:

The pattern is consistent: companies that hold centralized genetic databases face irresistible economic pressure to monetize that data, regardless of initial privacy commitments.

2.4 Consumer Sentiment: Trust at Historic Lows

The erosion of trust is measurable:

The regulatory environment is responding. Multiple U.S. states enacted genetic privacy legislation in 2024-2025, including the Texas Genomic Act of 2025 and laws in Nebraska, Alabama, Montana, and Florida. In 2026 the pace accelerated: Utah HB 182 was signed into law on March 17, 2026 (effective January 1, 2028); South Dakota SB 49 was signed in late March 2026 ($5,000 per violation, effective July 1, 2026); Wisconsin AB 673 was vetoed on March 27, 2026; and West Virginia HB 5034 was introduced. Rhode Island and Vermont also have bills in progress. At the federal level, the Don't Sell My DNA Act has bipartisan sponsorship in both chambers (Reps. Lofgren D-CA and Cline R-VA, Sens. Cornyn R-TX, Klobuchar D-MN, and Grassley R-IA).

2.5 The Interpretation Gap

Despite this distrust, demand for genomic information continues to grow. As of early 2019, approximately 26 million people had undergone consumer genetic testing, primarily in the United States (MIT Technology Review), and the number has grown substantially since. The whole genome sequencing market was valued at $2.12 billion in 2024 and is projected to grow at a 22.17% CAGR, reaching $6.67 billion by 2030 (Grand View Research).

Consumers want access to their genomic data. They do not want the companies that generate it to keep copies.

This gap -- between demand for genomic information and distrust of genomic custodians -- is the market PrivDNA was designed to fill.


III. THE SOLUTION: PRIVDNA

3.1 The Core Model: A Secure Data Refinery

PrivDNA is not a medical clinic. It is not a diagnostics company. It is a secure data refinery -- a facility that takes a biological sample as input, produces structured genomic data as output, and retains nothing.

The business model rests on three pillars:

1. Physical Transparency. The laboratory is visible through a floor-to-ceiling glass wall. At the intake visit, customers watch their sample being collected and barcoded at the lab bench, logged into chain of custody in real time. At the delivery visit four to six business days later, they watch their encrypted drive being prepared and the on-premise data being cryptographically destroyed. The 38-hour sequencing run and downstream processing happen between visits and are verifiable via the open-source pipeline, not by live observation. A dedicated Technical Representative -- the "Genomic Concierge" -- narrates the lab activities visible during each visit and walks customers through what happened between them.

2. Cryptographic Verifiability. The entire bioinformatics pipeline is open source, published on GitHub under permissive licenses (MIT/BSD/Apache 2.0). Any customer, or any auditor acting on their behalf, can inspect the code to verify there are no telemetry endpoints, no cloud synchronization calls, and no data exfiltration channels. The pipeline is deterministic: given the same input, it produces the same output, verifiable via SHA-256 checksums.

3. Zero Data Retention. Upon delivery, all on-premise copies of the customer's data will be destroyed under NIST SP 800-88 Rev. 2 Purge standards using cryptographic erasure on self-encrypting NVMe drives. The customer will receive a Certificate of Destruction documenting the media serial numbers, sanitization method, timestamp, and technician identity. The customer will witness the destruction process through the glass wall.

Zero retention starts at the waitlist. Even at the pre-launch stage, PrivDNA applies its zero-data-retention discipline to the waitlist itself. Email addresses are encrypted at rest with AES-256-GCM, hashed with HMAC-SHA256 for duplicate detection, and stored in a SQLCipher-encrypted database (scrypt KDF) on infrastructure PrivDNA controls. No third-party email marketing platform receives the addresses. The waitlist source code will be published as part of the open-source codebase.
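The duplicate-detection step described above can be sketched with Python's standard library. This is a minimal illustration, not PrivDNA's published waitlist code: the `normalize_email` rule, the `Waitlist` class, and the in-memory set are hypothetical stand-ins, and the AES-256-GCM encryption and SQLCipher storage layers are omitted because they require third-party libraries.

```python
import hashlib
import hmac


def normalize_email(address: str) -> str:
    """Trim and lowercase so trivially different spellings hash identically.

    Assumption: the whitepaper does not specify a normalization rule.
    """
    return address.strip().lower()


def dedup_tag(address: str, key: bytes) -> str:
    """Return a keyed HMAC-SHA256 tag used only for duplicate detection."""
    message = normalize_email(address).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


class Waitlist:
    """Toy in-memory stand-in for the encrypted database (hypothetical)."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._tags: set[str] = set()

    def add(self, address: str) -> bool:
        """Record the address's tag; return False if it was already present."""
        tag = dedup_tag(address, self._key)
        if tag in self._tags:
            return False
        self._tags.add(tag)
        return True
```

Because the tag is keyed, someone who obtains the stored tags without the key cannot confirm a guessed address by hashing it, which a plain SHA-256 of the email would allow.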

3.2 What the Customer Receives

Each customer receives a FIPS 140-3 Level 3 certified encrypted USB drive (Kingston IronKey D500S or equivalent) containing:

| File | Format | Approximate Size | Description |
|---|---|---|---|
| Aligned reads | BAM + BAI index | ~80-100 GB | Complete sequencing reads aligned to GRCh38 reference genome |
| Variant calls | VCF | ~1 GB | All identified genetic variants (SNPs, indels) |
| Genomic VCF | gVCF | ~5-10 GB | Comprehensive variant data including non-variant positions |
| Quality report | HTML (MultiQC) | ~50 MB | Sequencing quality metrics, coverage statistics |
| Pipeline manifest | JSON + SHA-256 | <1 MB | Exact software versions, parameters, and checksums for reproducibility |
| Certificate of Destruction | PDF | <1 MB | Documented proof of on-premise data erasure |

Default deliverables: BAM + VCF + gVCF are delivered by default. FASTQ files are not included in the standard delivery to conserve USB drive space, but can be regenerated from the BAM file if needed (using samtools fastq). Customers who specifically require FASTQ files may request them at the time of order.

Total delivery size: approximately 100-120 GB per genome.

What happens if you lose your keys?

By design, there is no recovery path. PrivDNA retains no copy of your sequence and no copy of your keys. Key loss is data loss. This is the cost of guaranteed non-retention, and no alternative architecture would remove it without reintroducing a retained copy. Customers are encouraged to store the encrypted drive and key media in physically separate locations and to maintain their own offline backup if redundancy is desired.

3.2.1 What we retain, briefly

The customer's genome will be destroyed at Visit 2 under NIST SP 800-88 Rev. 2 (see §5.4 and §6.3). Operating a clinical laboratory, however, requires a small set of non-genomic operational records. To avoid ambiguity, those records are enumerated here.

PrivDNA's Laboratory Information Management System (LIMS) will maintain, for each customer sample:

These records are generated during the 4-6 business day processing window and retained after the customer receives their data only for the duration required by CLIA and CLEP record-keeping rules, then purged per documented SOPs reviewed during CLEP inspection and CAP accreditation.

Scheduling, billing, and customer contact information (name, email, appointment times) will live on a separate business system, encrypted at rest, and governed by the policy at privdna.com/privacy. That system will never hold genomic data.

None of the records retained above contain genomic sequence, variant calls, or any biological content derived from the customer's sample. They are the audit record that the sequence existed, was processed correctly, and was destroyed; they are not a copy of the sequence itself.

3.3 What PrivDNA Does Not Do

See §3.2.1 for the small set of operational records PrivDNA does retain during processing.

PrivDNA explicitly does not:

3.4 The Open Source Commitment

All custom software developed for PrivDNA's operations is published as open source:

This commitment serves dual purposes. First, it provides cryptographic assurance to customers -- they can verify that the code processing their DNA does exactly and only what it claims to do. Second, it builds community trust and positions PrivDNA as a public good contributor to the genomics ecosystem, a significant brand differentiator.

Publishing our pipeline deliberately eliminates the code itself as a competitive moat. Our moat is the planned physical infrastructure, brand trust, regulatory certifications (once earned), and the NYC location -- assets that cannot be cloned from a GitHub repository. A competitor can fork the pipeline; they cannot fork a CLIA/CLEP-certified glass-walled laboratory with an established customer base and referral network, which is what PrivDNA plans to build.

The upstream pipeline components are already public open-source projects (nf-core/sarek, BWA-MEM2, GATK, DeepVariant). PrivDNA's own orchestration, chain-of-custody, and destruction-verification code will be published to github.com/danthi123/PrivDNA at launch (the existing repository at that URL contains the live waitlist site code and architectural specifications).

3.5 Response to Legal Compulsion (Subpoenas, Warrants, Court Orders)

PrivDNA's architecture limits what can be compelled by limiting what exists. Customer DNA will exist on PrivDNA infrastructure only during the 4-6 business day processing window between Visit 1 and Visit 2. Outside that window, there is nothing to disclose because nothing is retained.

Within the processing window, PrivDNA will:

The witnessed-destruction architecture means there is no historical archive to subpoena.

3.6 Why the Destruction Ceremony is Verifiable, Not Just Promised

The principal trust failure mode of any "we destroy your data" claim is the possibility of an undisclosed copy. PrivDNA addresses this with four overlapping mechanisms:

  1. Open-source pipeline -- every line of code that touches your sample will be publicly auditable. There is no proprietary path that could secretly write to a hidden destination.
  2. Air-gapped infrastructure -- the sequencing workstation has no network interface. There is no exfiltration channel a hidden process could use.
  3. SHA-256 deliverable verification -- the files on the encrypted drive you receive are bit-for-bit reproducible from the documented inputs and pipeline versions, so there is no parallel "real" output retained elsewhere.
  4. Witnessed destruction with NIST SP 800-88 certificate -- the cryptographic erasure of working drives is observed by you in real time through a glass wall, and the certificate is signed and timestamped at Visit 2.

No single mechanism is sufficient on its own. The combination is intentional: SHA-256 verifies deliverable integrity, while the air-gap plus open-source pipeline is what prevents exfiltration. These are distinct guarantees working in concert.
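The deliverable-integrity check can be sketched as follows. This is a minimal illustration under stated assumptions: the whitepaper says only that the pipeline manifest is "JSON + SHA-256", so the manifest layout used here (a top-level `"files"` object mapping file names to hex digests) and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so a ~100 GB BAM never loads into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> dict[str, bool]:
    """Compare each listed file's digest to the manifest; True means it matches.

    Assumed (hypothetical) layout: {"files": {"<relative name>": "<sha256 hex>"}}
    Files are resolved relative to the manifest's own directory.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    base = manifest_path.parent
    results: dict[str, bool] = {}
    for name, expected in manifest["files"].items():
        target = base / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_file(target) == expected.lower()
    return results
```

A customer or auditor would run a check like this on their own machine after unlocking the drive. It confirms that the delivered files match the manifest; as the text above notes, it says nothing by itself about whether a copy exists elsewhere.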


IV. MARKET ANALYSIS

4.1 Total Addressable Market

Consumer Genomics (Global): $3.02 billion (2025), projected to reach $12.56 billion by 2032 at a 22.54% CAGR (360iResearch; cited figures from the 2025 edition, the live report is refreshed periodically).

Whole Genome Sequencing (Global): $2.12 billion (2024), projected to reach $6.67 billion by 2030 at a 22.17% CAGR (Grand View Research).

Direct-to-Consumer Genetic Testing (Global): $1.95 billion (2024), projected to reach $17.43 billion by 2034 at a 24.5% CAGR (Zion Market Research).
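As a sanity check on the three projections above, the compound annual growth rate implied by each pair of endpoints can be recomputed. This is a small sketch; the function names are illustrative, and the year counts assume growth is measured from the stated base year to the stated target year.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoints."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def projected_value(base: float, cagr: float, years: int) -> float:
    """Value after compounding `base` at `cagr` for `years` years."""
    return base * (1.0 + cagr) ** years


# (label, start $B, end $B, years between the stated base and target years)
MARKETS = [
    ("Consumer genomics, 2025-2032", 3.02, 12.56, 7),
    ("Whole genome sequencing, 2024-2030", 2.12, 6.67, 6),
    ("DTC genetic testing, 2024-2034", 1.95, 17.43, 10),
]

for label, start, end, years in MARKETS:
    print(f"{label}: implied CAGR {implied_cagr(start, end, years):.2%}")
```

The consumer genomics and DTC endpoints reproduce the cited rates to within about a tenth of a percentage point. The WGS endpoints imply roughly 21% over six years rather than the cited 22.17%; one plausible explanation is that the source report measures its CAGR from a later base year, but that is an assumption, not something stated in the cited figures.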

4.2 Serviceable Addressable Market

PrivDNA's initial SAM is defined by:

4.3 Serviceable Obtainable Market

Given PrivDNA's single-location capacity (~750 genomes/year at full utilization with realistic maintenance) and realistic market penetration for a new brand:

Target Customer Profiles

PrivDNA's addressable market is defined by four primary customer personas, validated through demographic analysis of the NYC metropolitan area (ACS 2024, Pew Research 2023, Deloitte 2025).

The Privacy Hawk. Age 35-50, household income $200,000-$500,000, graduate-educated professionals in technology security, law, or finance. This persona uses encrypted messaging, VPNs, and privacy-focused browsers -- and applies the same rigor to genomic data. The 23andMe bankruptcy was a catalytic event. They will audit PrivDNA's open-source pipeline on GitHub before booking an appointment. Price is not a significant barrier; the $3,500 is justified by conviction. Estimated addressable population in the NYC metro: approximately 2,600-5,200 annually (adults 35-50 in the top decile of privacy sensitivity, with household income above $200,000 and 5-10% annual WGS consideration). Expected referral rate: 2-4 per customer within 12 months.

The Health Optimizer. Age 30-50, household income $250,000-$1,000,000+, already spending $5,000-$25,000+ annually on concierge medicine, executive physicals, longevity clinics, and supplements. WGS is the foundational dataset for their health optimization stack -- pharmacogenomics, polygenic risk scores, carrier status, and preventive planning. Privacy is a value-add rather than the primary motivator. Discovery typically comes through a concierge physician, longevity podcast, or peer recommendation. Estimated addressable population: approximately 6,500-13,000 annually (10-20% consideration rate, higher than average due to active health optimization behavior). Expected referral rate: 3-5 per customer, with high family package conversion potential.

The Informed Parent. Age 28-42, household income $150,000-$400,000, dual-professional households planning pregnancy or with young children. Standard expanded carrier panels test 100-500 conditions via targeted sequencing; WGS captures the complete genome -- 6.4 billion base pairs per couple -- including rare and novel variants that panels miss. For couples investing $20,000-$50,000+ in IVF, adding $7,000 for the most comprehensive genetic picture is a rational decision. The privacy dimension is acute: GINA does not cover life, disability, or long-term care insurance, making a child's genome in a company database a quantifiable long-term risk. Estimated addressable population: approximately 3,900-7,800 annually (representing 2,000-4,000 purchasing units buying two genomes each). Expected referral rate: 3-5 couple referrals per unit within 12 months.

Genetic counseling advisory. PrivDNA does not provide genetic counseling, risk interpretation, or reproductive medical advice. Raw genomic data alone is insufficient for reproductive decision-making. Individuals considering using whole genome sequencing to inform pregnancy planning, carrier screening, IVF embryo decisions, or family medical history should consult a board-certified genetic counselor or clinical geneticist before making medical or family-planning choices. PrivDNA's referral packet (delivered at the customer's second visit) includes a curated list of independent counselors in the NYC metro and via telehealth; selecting and engaging a counselor is the customer's responsibility, and PrivDNA has no involvement in or visibility into any subsequent interpretation. See also the ACMG 2024 Points-to-Consider Statement on Polygenic Risk Scores for Embryo Selection.

The Tech Executive. Age 38-55, household income $500,000-$5,000,000+, CISOs, CTOs, managing directors, and general partners who apply enterprise-grade data hygiene standards to their personal information. This persona evaluates PrivDNA's FIPS 140-3 certification, air-gap architecture, and NIST destruction protocol with the same rigor they apply to vendor security audits. Price is immaterial at this income level; family packages of 2-4 genomes ($7,000-$14,000 at the $3,500 list price) are common. Estimated addressable population: approximately 6,000-12,000 annually (10-20% consideration rate reflecting above-average privacy awareness). Expected referral rate: 4-8 per customer -- the highest-leverage persona, operating in concentrated executive networks (YPO, board dinners, country clubs).

Demand funnel. Applying progressive filters to the NYC metro population: 20.1 million total population, narrowed to approximately 11.5 million adults aged 25-65, then to approximately 3.2 million with household income above $150,000, then to approximately 1.92 million who are privacy-sensitive (60%, Statista 2024), then to approximately 96,000 who are WGS-interested (5%, the low end of the 5-10% of the privacy-sensitive segment estimated to be aware of and considering WGS), then to approximately 14,400 willing to pay $3,500+ for privacy-sovereign WGS (15%), then to approximately 10,000 within practical travel distance of Manhattan (70%). A 2-5% first-year conversion rate on that final pool yields an estimated 200-500 annual purchasers.
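The funnel arithmetic can be reproduced in a few lines. This sketch starts from the 3.2 million high-income adults, because the earlier steps are given as absolute figures rather than rates. The 96,000 WGS-interested figure corresponds to the 5% low end of the stated 5-10% range, so 5% is the rate used here. Function and constant names are illustrative.

```python
HIGH_INCOME_ADULTS = 3_200_000  # adults 25-65 with household income above $150,000

# (stage label, fraction of the previous stage that remains)
FUNNEL_STAGES = [
    ("privacy-sensitive", 0.60),
    ("WGS-interested (low end of 5-10%)", 0.05),
    ("willing to pay $3,500+", 0.15),
    ("within travel distance of Manhattan", 0.70),
]


def run_funnel(start: float, stages: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Apply each stage's rate in order and return the pool after every stage."""
    remaining = start
    pools = []
    for label, rate in stages:
        remaining *= rate
        pools.append((label, remaining))
    return pools


def purchaser_range(final_pool: float, low: float = 0.02, high: float = 0.05) -> tuple[float, float]:
    """First-year purchasers at the low and high conversion rates."""
    return final_pool * low, final_pool * high


pools = run_funnel(HIGH_INCOME_ADULTS, FUNNEL_STAGES)
for label, size in pools:
    print(f"{label}: {size:,.0f}")
low, high = purchaser_range(pools[-1][1])
print(f"annual purchasers: {low:,.0f} to {high:,.0f}")
```

The final pool computes to slightly above the rounded 10,000 used in the text, which is why the purchaser range comes out marginally higher than 200-500 before rounding.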

B2B channels. Two institutional channels are projected to grow from approximately 25% of Year 1 volume to 35% by Year 3. Concierge medicine referrals are the highest-leverage channel: an estimated 200+ concierge practices in the NYC metro area, each with patient panels of 50-600 individuals in the target demographic; 50 active referral practices producing an average of 12 genomes per year would yield 600 genomes annually -- three times the Year 1 target of 200. Family offices (200+ single-family offices in the NYC metro, the highest concentration globally) represent a second high-value channel, with average engagements of 4-8 genomes per family at $12,800-$25,600 per engagement (at institutional pricing of $3,200/genome).

4.4 Competitive Landscape

The Market Has Two Tiers

Tier 1: SNP Genotyping (Low Resolution). 23andMe and Ancestry dominate this tier with SNP microarray tests at $99-$229. These tests examine 600,000-700,000 specific genetic markers -- roughly 0.02% of the genome's 3.2 billion base pairs. They serve casual ancestry curiosity and basic trait analysis.

Tier 2: Whole Genome Sequencing (Full Resolution). WGS reads the complete 3.2 billion base pairs of the human genome. This tier is served by Sequencing.com ($399), Nucleus Genomics ($399 + $39/year membership), and Dante Labs ($599, frequently discounted to ~$499). Nebula Genomics shut down its consumer service on February 5, 2025 and faces a federal class-action privacy lawsuit (Portillo v. Nebula Genomics) alleging it shared genetic data with Meta, Google, and Microsoft via embedded tracking tools; its parent company ProPhase Labs filed Chapter 11 for its COVID-19 testing laboratory subsidiaries in September 2025. Dante Labs' UAE subsidiary (Dante Labs Genomics FZE) was reportedly acquired by Bio Cell Tech FZCO in February 2025. All active competitors operate via mail-order.

Competitive Comparison Matrix

| Feature | 23andMe | Ancestry | Nebula (defunct) | Nucleus | Dante Labs | PrivDNA |
|---|---|---|---|---|---|---|
| Data type | SNP array | SNP array | 30x WGS | 30x WGS | 30x WGS | 30x WGS |
| Price | $99-$229 | $99-$119 | $249-$299 (closed Feb 2025) | $399 | $599 | $3,500 |
| Physical storefront | No | No | No | No | No | Yes |
| Glass-walled lab | No | No | No | No | No | Yes |
| Air-gapped processing | No | No | No | No | No | Yes |
| Open source pipeline | No | No | Partial | No | No | Yes |
| Data retention | Indefinite | User-controlled | Indefinite | 60-day sample | 10 years | Zero |
| Witnessed destruction | No | No | No | No | No | Yes |
| Live technical representative | No | No | No | No | No | Yes |
| FIPS 140-3 encrypted delivery | No | No | No | No | No | Yes |

Competitor Deep-Dive: Strengths and Weaknesses

The three active WGS competitors each present a distinct competitive profile -- and a distinct set of vulnerabilities.

Sequencing.com ($399, founded 2014, $5M Series A) is the most polished DTC WGS platform, with 688 Trustpilot reviews, an app marketplace of 100+ analysis tools, and turnaround options ranging from 10 weeks to 2-3 weeks (ultra rapid). Its primary weakness is a critical privacy gap: the "SequencingAI" feature shares customer genetic data with third-party AI services including OpenAI, which "may retain some data for an indefinite period of time" per the company's AI Use Policy -- directly contradicting its "Privacy Forever" branding. Recurring customer complaints center on auto-enrollment in a $30/month subscription plan without clear consent at purchase.

Nucleus Genomics ($399 + $39/year, founded 2020, $32M total raised including a $14M Series A in January 2025) reports aggressive month-over-month revenue growth [founder-cited, not independently verified] and has expanded into embryo selection ($5,999). However, the company faces serious credibility concerns: allegations of fabricating customer reviews, formal criticism from the American College of Medical Genetics and Genomics regarding polygenic-risk-score embryo products' evidence base, and a founder who has been compared to Elizabeth Holmes by critics. Independent research continues to find that polygenic scores -- central to the Nucleus Embryo product -- include significant statistical uncertainty and yield inconsistent predictions across methods (Namba et al., Nature Human Behaviour, 2024; see also Turley et al., NEJM, 2021). Physical DNA samples are destroyed within 60 days, but digital data retention periods are vaguely defined as "determined by business need."

Dante Labs ($599, frequently discounted to ~$499; its UAE subsidiary Dante Labs Genomics FZE was reportedly acquired by Bio Cell Tech FZCO in February 2025) holds an F rating from the Better Business Bureau with dozens of complaints logged in the prior 12 months (counts fluctuate). Customer reports consistently describe delivery failures (6-10+ months, with some customers never receiving results), non-responsive customer service, and hidden charges for raw data downloads. Labs operate in Italy and Dubai, not domestically. The 10-year data retention policy post-account-deletion and new UAE ownership raise data sovereignty concerns.

Lessons from failures. The DTC genomics sector has produced a consistent failure pattern. Nebula Genomics (shut down February 2025) marketed itself as "privacy-first" with blockchain-based data ownership, but a federal class-action lawsuit alleges it shared genetic data with Meta, Google, and Microsoft via embedded tracking pixels -- the ultimate privacy theater. Veritas Genetics ceased US operations in December 2019 after Chinese investors triggered CFIUS scrutiny; consumer volume remained low at $599 pricing, suggesting weak demand even at sub-$1,000 price points. Apart from the three active competitors profiled above, the prominent low-price DTC genomics companies have either shut down (Nebula, Veritas), gone bankrupt (23andMe, Invitae), or pivoted entirely to B2B (Helix, Color Health).

Adjacent competition. Concierge medicine and executive health programs offer WGS as a bundled component of comprehensive diagnostic packages: Fountain Life at $10,500-$21,500/year, and Human Longevity Inc. at $8,000 per assessment, recommended annually. This data is retained indefinitely in the clinical record. PrivDNA's $3,500 standalone service is priced below both of these programs while offering superior privacy guarantees, positioning it as the affordable premium option for customers who want genomic data outside their medical record.

PrivDNA will sequence on the Element Biosciences AVITI, which uses avidity sequencing (sequencing-by-binding) rather than Illumina's sequencing-by-synthesis. The output quality is comparable (≥90% bases above Q30) and produces identical standard file formats (FASTQ, BAM, VCF), making it fully interchangeable for downstream analysis. Element's publicly advertised reagent price guarantee, if extended in the supply contract, would eliminate the single largest variable cost risk for the business.

The Whitespace

No existing competitor offers a physical, in-person genomics experience. The entire DTC genomics market operates via mail-order saliva kits. There is no "retail DNA testing storefront" in operation from any established player. This is distinct from WGS offered as a component of concierge medicine or executive health programs, where sequencing is typically outsourced to external labs and data is retained in the clinical record.

PrivDNA is positioning itself as a premium, physically transparent, cryptographically verifiable, zero-retention genomics service. The $3,500 price point is justified not by the sequencing itself (commodity) but by the infrastructure, trust architecture, and experience surrounding it.

4.5 The Privacy Premium

Quantitative evidence supports premium pricing for privacy:

The top three policies that increase consumer willingness to share genetic data are: (1) ability to request data deletion, (2) assurance data would not be sold or shared, and (3) specific permissions required for reuse. PrivDNA's model satisfies all three by design.

Pricing Context

The $3,500 price point is best understood not as a premium over commodity WGS, but as the lowest-cost entry in the premium health services category over any meaningful time horizon.

Price anchoring. Executive physicals at major academic medical centers cost $5,000-$11,000 annually (Mayo Clinic, Cleveland Clinic). Concierge medicine retainers average $2,500-$3,000 per year, with premium practices charging $10,000-$50,000 per year. Fountain Life's longevity membership costs $10,500-$21,500 annually; Human Longevity Inc.'s executive health program starts at $8,000 per assessment (recommended annually). Unlike every service in this comparison set, PrivDNA is a single, non-recurring expenditure that produces a lifetime dataset. Over two or more years, $3,500 is the cheapest option in a premium health portfolio.
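The "two or more years" claim can be checked with a short cumulative-cost comparison. This is an illustrative sketch using the low end of each price range quoted above. Treating every service as an annual recurring cost is an assumption (the Human Longevity figure is per assessment, "recommended annually"), and the names are hypothetical.

```python
PRIVDNA_ONE_TIME = 3_500

# Low end of each annual range quoted in the paragraph above (USD per year).
ANNUAL_LOW_END = {
    "Executive physical": 5_000,
    "Concierge medicine retainer (average)": 2_500,
    "Fountain Life membership": 10_500,
    "Human Longevity Inc. assessment": 8_000,
}


def cumulative_cost(annual_fee: int, years: int) -> int:
    """Total spend on a recurring service after `years` years."""
    return annual_fee * years


def cheaper_than_privdna(years: int) -> list[str]:
    """Services whose cumulative low-end cost is still below the one-time price."""
    return [
        name
        for name, fee in ANNUAL_LOW_END.items()
        if cumulative_cost(fee, years) < PRIVDNA_ONE_TIME
    ]


print(cheaper_than_privdna(1))  # only the average concierge retainer is cheaper in year one
print(cheaper_than_privdna(2))  # nothing remains cheaper from year two onward
```

The comparison is about cumulative spend only; these services are not substitutes for one another in scope.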

The $399 vs. $3,500 reframing. These are not different prices for the same product; they are different products. At $399, a customer receives whole genome sequencing data that lives indefinitely on a company's servers, accessible to AI partners (Sequencing.com shares data with OpenAI per its AI Use Policy), third-party app developers, and future acquirers in bankruptcy proceedings. At $3,500, a customer receives the same sequencing data on an encrypted drive they physically possess, processed on an air-gapped server, delivered through a witnessed chain of custody, and destroyed on-premise under NIST SP 800-88 standards. The $3,100 difference is the cost of ensuring that the most sensitive data a person will ever generate exists in exactly one place: their hands.

Loss aversion. Kahneman and Tversky's prospect theory establishes that losses are psychologically approximately twice as powerful as equivalent gains, a finding replicated across 19 countries (Columbia University, 2020). Genomic data amplifies loss aversion through three properties: irreversibility (a compromised genome cannot be changed), familial scope (one person's data partially reveals the genomes of every biological relative), and temporal scope (future analytical capabilities will extract information from today's data that cannot currently be predicted). The loss of genomic privacy is permanent, generational, and expanding.

Pricing. PrivDNA's price is $3,500 per genome, a single price across all individual customers. Family packages at $3,100-$3,250 per genome (two or more genomes purchased together) and B2B institutional pricing at $3,000-$3,200 per genome (for concierge practices, family offices, and corporate programs) provide structured volume pricing without tiered privacy guarantees; the security architecture is identical regardless of price.

4.6 Unit Economics and Capital Plan

Each whole genome sequence is priced at $3,500. Per-genome contribution margin is approximately $2,484 (71% gross margin) after reagent, consumable, and direct labor costs. Cash break-even is reached at 29 genomes per month at full capacity utilization.
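The figures in this paragraph imply two numbers that are not stated directly: the per-genome variable cost and the monthly fixed cost covered at break-even. The sketch below derives them, assuming break-even means that monthly contribution exactly covers monthly fixed costs. The derived values are inferences from the stated figures, not numbers taken from PrivDNA's financial model.

```python
import math

PRICE_PER_GENOME = 3_500
CONTRIBUTION_PER_GENOME = 2_484   # stated contribution margin
BREAK_EVEN_PER_MONTH = 29         # stated break-even volume


def gross_margin(price: float, contribution: float) -> float:
    """Contribution as a fraction of price."""
    return contribution / price


def implied_variable_cost(price: float, contribution: float) -> float:
    """Reagent, consumable, and direct labor cost implied per genome."""
    return price - contribution


def implied_monthly_fixed_cost(contribution: float, break_even_units: int) -> float:
    """Fixed cost that the break-even volume would exactly cover (an inference)."""
    return contribution * break_even_units


def break_even_units(monthly_fixed_cost: float, contribution: float) -> int:
    """Smallest whole number of genomes per month that covers fixed costs."""
    return math.ceil(monthly_fixed_cost / contribution)


print(f"gross margin: {gross_margin(PRICE_PER_GENOME, CONTRIBUTION_PER_GENOME):.1%}")
print(f"implied variable cost: ${implied_variable_cost(PRICE_PER_GENOME, CONTRIBUTION_PER_GENOME):,.0f}")
print(f"implied monthly fixed cost: ${implied_monthly_fixed_cost(CONTRIBUTION_PER_GENOME, BREAK_EVEN_PER_MONTH):,.0f}")
```

Under that reading, 29 genomes per month is a little under half of the roughly 750 genomes per year (about 62 per month) cited as single-location capacity in §4.3.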

Initial capital requirement is approximately $880,000 in equipment and laboratory buildout (Element AVITI sequencer, GPU compute server, storage infrastructure, cryptographic destruction equipment, and lab fit-out). PrivDNA is raising a $1.25M seed round to fund equipment, lease, regulatory certification (CLIA + NY CLEP), and 18 months of operating runway to reach break-even.

No revenue is modeled from data sales, research partnerships, or pharma licensing. Sequencing fees are the entire business model.

4.7 Who We Are, Who Funds Us, Who We Partner With

Founder and leadership. PrivDNA was founded by a sole technical founder with approximately six years of systems administration and IT infrastructure experience in managed service provider and enterprise IT environments. That background is directly applicable to the air-gap, network-isolation, RAID, and physical-security architecture described in this whitepaper. PrivDNA will complement that expertise with hired clinical laboratory leadership at operational launch, per the staffing plan in §8.1: a board-certified CLIA/CLEP laboratory director (0.25 FTE contract), a molecular laboratory technician, a bioinformatics engineer, a genomic concierge, and an office manager / QA coordinator. This division reflects a deliberate choice to separate IT and infosec ownership (held in-house) from wet-lab and bioinformatics execution (hired into the roles where CLIA and CLEP regulation is non-negotiable).

Funding status. At the time of publication, PrivDNA is self-funded and pre-seed. The company is raising a $1.25M seed round to cover equipment, commercial lease, CLIA and CLEP certification, and approximately 18 months of operating runway to reach cash break-even at 29 genomes per month (see §4.6). No outside capital has been accepted as of this writing, and no investors, advisors, or board members are currently disclosed. Parties who join the company in any of those capacities will be listed here in subsequent revisions of this whitepaper and on privdna.com.

Vendors. PrivDNA's planned infrastructure relies on commercial products from the vendors listed below. Each is cited in the technical manifest alongside specific part numbers, datasheets, and pricing.

Category | Vendor | Role
Sequencing | Element Biosciences | AVITI sequencer, Cloudbreak flow cells, bases2fastq basecaller
Lab equipment | Thermo Fisher, Agilent, Bio-Rad, Eppendorf | DNA quantification, fragment analysis, thermal cycling, pipettes, centrifuges
Library prep | New England Biolabs | NEBNext Ultra II FS DNA Library Prep
Compute hardware | AMD, Supermicro, Samsung, NVIDIA | CPU (EPYC 9654), 2U chassis and TPM, DDR5 ECC RAM, U.2 NVMe storage (PM9A3), L40S GPU
Bioinformatics software | NVIDIA (Clara Parabricks, free in production); all other pipeline tools open source | GPU-accelerated alignment and variant calling; Nextflow, nf-core/sarek, BWA-MEM2, GATK, samtools, FastQC, MultiQC
Network isolation | Cisco, Netgate | Managed switch for VLAN isolation; pfSense+ firewall with no WAN
Delivery hardware | Kingston | FIPS 140-3 Level 3 IronKey D500S encrypted USB drives
Website and TLS | Cloudflare | DNS, edge caching, TLS 1.3 termination
Analytics | Rybbit (open-source, cookieless) | Privacy-respecting site analytics
Code hosting | GitHub | Open-source pipeline and waitlist site source code

No vendor in the list above has access to, receives, or processes customer genomic data. All genomic data handling will occur on air-gapped infrastructure PrivDNA controls, described in §V. Vendor relationships are limited to the purchase of commercial products and their associated support and maintenance contracts.

Financial conflicts of interest. PrivDNA:

These structural choices are what make the zero-retention commitment economically coherent: there is no business pressure to retain data because there is no line of revenue that depends on it.


V. TECHNICAL ARCHITECTURE

5.1 Sequencing Platform

Primary instrument: Element Biosciences AVITI

Specification | Value
Chemistry | Avidity sequencing (rolling circle amplification + sequencing-by-binding)
Flow cell | Cloudbreak 300-cycle kit (2x150 bp)
Output per run | ~300 Gb (gigabases; sufficient for ~3 genomes at 30x)
Quality | ≥90% bases above Q30
Run time (2x150 bp) | ~38 hours
Instrument price | $289,000
Basecalling / demux | bases2fastq (Element Biosciences), produces standard FASTQ
Company | Element Biosciences (founded 2017, San Diego; AVITI launched 2022)

The Element AVITI is the optimal platform for PrivDNA's volume tier. It delivers clinical-grade sequencing quality at a significantly lower reagent cost than competing platforms: the Cloudbreak 300-cycle kit costs $1,680 per run ($560 per genome at 3 genomes per run), and Element guarantees no reagent price increases for the instrument's lifetime, eliminating the single largest variable cost risk. At ~38-hour run cycles and realistic maintenance and library prep schedules, a single AVITI can process approximately 3-4 runs per week, yielding 9-12 genomes per week or 40-52 genomes per month. With overlapping batches (starting next library prep while the current run is active), throughput of 50-65 genomes per month is achievable.
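
The reagent cost and throughput figures in the paragraph above follow from simple arithmetic. The sketch below reproduces them; the variable names are hypothetical, and the weeks-per-month conversion (52 / 12) is an assumption, not a figure stated in this whitepaper.

```python
# Inputs taken from the paragraph above.
KIT_COST_PER_RUN = 1_680     # USD, Cloudbreak 300-cycle kit
GENOMES_PER_RUN = 3          # 30x genomes per flow cell
RUNS_PER_WEEK = (3, 4)       # low and high end of the realistic range

# Assumption: an average calendar month is 52 / 12 weeks long.
WEEKS_PER_MONTH = 52 / 12

reagent_cost_per_genome = KIT_COST_PER_RUN / GENOMES_PER_RUN
genomes_per_week = tuple(runs * GENOMES_PER_RUN for runs in RUNS_PER_WEEK)
genomes_per_month = tuple(round(g * WEEKS_PER_MONTH) for g in genomes_per_week)

print(f"Reagent cost per genome: ${reagent_cost_per_genome:,.0f}")
print(f"Genomes per week: {genomes_per_week[0]}-{genomes_per_week[1]}")
print(f"Genomes per month: {genomes_per_month[0]}-{genomes_per_month[1]}")
```

Under these assumptions the monthly range works out to roughly 39-52 genomes, consistent with the 40-52 figure quoted above once the low end is rounded.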

The AVITI produces standard FASTQ output, making it compatible with the entire downstream pipeline (BWA-MEM2, GATK, Parabricks). NVIDIA Clara Parabricks supports AVITI data as of v4.5 via standard FASTQ input.

Supporting Laboratory Equipment

Equipment | Model | Purpose | Estimated Cost
DNA quantification | Thermo Fisher Qubit 4 | Fluorometric DNA concentration measurement | $4,000
Fragment analysis | Agilent TapeStation 4150 | Library size distribution QC | $16,000
Thermal cycler | Bio-Rad T100 | PCR amplification during library prep | $3,500
Centrifuge | Eppendorf 5810R | Plate spinning, sample pelleting | $6,000
Pipettes (single-channel set + multichannel) | Eppendorf Research Plus | Manual liquid handling | $3,200
Vortex mixer | Various | Sample mixing | $500
Microcentrifuge | Eppendorf 5424R | Tube spinning | $3,500

5.2 Air-Gapped Compute Stack

Once operational, all bioinformatics processing will occur on a dedicated, air-gapped server that has no network interface capable of reaching the internet. The server will communicate only with the Element AVITI sequencer via a physically isolated local network segment.

Server Specifications

Component | Specification | Part Number
CPUs | 2x AMD EPYC 9654 (96-core/192-thread, 2.4/3.7 GHz, 384 MB L3) | 100-000000789
Chassis | Supermicro AS-2125HS-TNR (2U, 24x NVMe hot-swap) | AS-2125HS-TNR
RAM | 1 TB DDR5-4800 ECC RDIMM (16x 64 GB Samsung) | M321R8GA0BB0-CQK
Storage | 30 TB usable (8x Samsung PM9A3 7.68 TB NVMe, RAID-10) | MZQL27T6HBLA-00A07
GPU | NVIDIA L40S 48 GB PCIe (Parabricks acceleration) | L40S-48GB
TPM | Supermicro TPM 2.0 (Infineon SLB9670) | AOM-TPM-9670V

The NVIDIA L40S GPU enables NVIDIA Clara Parabricks acceleration, reducing the full WGS pipeline (alignment through variant calling) from 8-16 hours on CPU alone to approximately 60-90 minutes on GPU-accelerated paths (estimated based on Element Biosciences / NVIDIA published benchmarks; actual runtime to be validated on PrivDNA hardware once assembled). NVIDIA recommends the L40S for Parabricks workloads; while processing time is longer than the A100 (60-90 min vs. 30-45 min), it is well within SLA requirements and reduces GPU CAPEX from $13,000 to $7,500.

Network Isolation

Component | Model | Purpose
Managed switch | Cisco Catalyst 1000 C1000-8T-2G-L | VLAN isolation between sequencer and compute server
Firewall | Netgate 6100 MAX (pfSense+) | Configured with no default gateway; all WAN interfaces disabled

The air gap is enforced at multiple layers:

  1. Physical: No Ethernet cable connects the isolated network to any internet-connected network. Wi-Fi and Bluetooth adapters are not installed.
  2. Logical: The firewall has no WAN configuration. VLAN isolation separates the sequencer subnet from the compute subnet.
  3. BIOS-level: USB boot and PXE boot are disabled. BIOS is password-protected. Secure Boot is enabled.
  4. Port control: Physical USB port blockers (SmartKeeper) on all unused ports. A single controlled transfer workstation handles data export to encrypted USB drives.
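
Some of the layers above can be spot-checked in software on the compute server itself. The following is a minimal sketch, assuming a Linux host: it looks for an IPv4 default route in /proc/net/route and for wireless adapters in /sys/class/net. The function names are hypothetical, IPv6 routing is not covered, and a check like this would supplement physical inspection rather than replace it.

```python
from pathlib import Path


def default_route_interfaces(route_table: str) -> list[str]:
    """Interfaces carrying an IPv4 default route in /proc/net/route text."""
    found = []
    for line in route_table.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        # Column 2 is the destination in hex; 00000000 means 0.0.0.0 (default).
        if len(fields) >= 2 and fields[1] == "00000000":
            found.append(fields[0])
    return found


def wireless_interfaces(sys_net: Path = Path("/sys/class/net")) -> list[str]:
    """Interfaces that sysfs marks as wireless (Linux-specific heuristic)."""
    if not sys_net.is_dir():
        return []
    return sorted(
        p.name for p in sys_net.iterdir()
        if (p / "wireless").exists() or (p / "phy80211").exists()
    )


def audit_findings(route_table: str, wifi: list[str]) -> list[str]:
    """Collect human-readable violations; an empty list means no findings."""
    findings = [f"default route via {i}" for i in default_route_interfaces(route_table)]
    findings += [f"wireless adapter present: {i}" for i in wifi]
    return findings


if __name__ == "__main__":
    route_file = Path("/proc/net/route")
    table = route_file.read_text() if route_file.exists() else ""
    for finding in audit_findings(table, wireless_interfaces()):
        print("FAIL:", finding)
```

An empty result is the expected state for a host with no default gateway and no wireless hardware installed.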

AVITI Sequencer Isolation

The Element Biosciences AVITI is configured for fully local operation:

5.3 Open Source Bioinformatics Pipeline

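The pipeline processes raw sequencer output into analysis-ready genomic data. Apart from the vendor-supplied bases2fastq basecaller (Element Biosciences) and optional NVIDIA Clara Parabricks GPU acceleration, all tools are open source: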

Pipeline Stages

Stage 1: Basecalling & Demultiplexing
  AVITI run output -> bases2fastq (Element Biosciences) -> FASTQ (R1 + R2 per sample)
  -> FastQC v0.12.1 (per-sample quality control)

Stage 2: Alignment
  FASTQ -> BWA-MEM2 v2.2.1 (alignment to GRCh38 reference) -> Unsorted BAM

Stage 3: BAM Processing
  Unsorted BAM -> samtools v1.23.1 (coordinate sort)
  -> GATK v4.6.1.0 MarkDuplicates (duplicate marking)
  -> samtools index (.bai generation)

Stage 4: Base Quality Score Recalibration
  Sorted BAM + known variant sites (dbSNP, Mills, known indels)
  -> GATK BaseRecalibrator -> GATK ApplyBQSR -> Analysis-ready BAM

Stage 5: Variant Calling
  Analysis-ready BAM -> GATK HaplotypeCaller (-ERC GVCF mode)
  -> Per-sample gVCF

Stage 6: Genotyping & Hard Filtering
  gVCF -> GATK GenotypeGVCFs -> Raw VCF
  -> GATK VariantFiltration (hard filters per GATK best practices) -> Filtered VCF
  Note: VQSR (Variant Quality Score Recalibration) requires cohorts of ~30+
  samples to build a reliable statistical model and is not suitable for
  single-sample workflows. Hard filtering with GATK recommended thresholds
  (QD, FS, MQ, MQRankSum, ReadPosRankSum, SOR) is the standard approach
  for single-sample WGS. The nf-core/sarek pipeline handles this correctly.

Stage 7: Quality Aggregation
  All outputs -> MultiQC v1.33 -> Aggregate HTML quality report
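
The hard-filtering step in Stage 6 can be illustrated by building the GATK VariantFiltration command line for SNPs. This is a sketch, not the nf-core/sarek implementation: the file names are hypothetical, the thresholds are GATK's published generic starting points for the six annotations named above, and in GATK's documented workflow SNPs and indels are split and filtered separately with different thresholds.

```python
import shlex

# GATK's generic starting thresholds for SNP hard filtering.
# A site is flagged (not removed) when an expression evaluates to true.
SNP_FILTERS = {
    "QD2": "QD < 2.0",
    "FS60": "FS > 60.0",
    "MQ40": "MQ < 40.0",
    "MQRankSum-12.5": "MQRankSum < -12.5",
    "ReadPosRankSum-8": "ReadPosRankSum < -8.0",
    "SOR3": "SOR > 3.0",
}


def variant_filtration_cmd(reference, in_vcf, out_vcf, filters=SNP_FILTERS):
    """Build an argument list for `gatk VariantFiltration` (not executed here)."""
    cmd = ["gatk", "VariantFiltration", "-R", reference, "-V", in_vcf, "-O", out_vcf]
    for name, expression in filters.items():
        cmd += ["--filter-expression", expression, "--filter-name", name]
    return cmd


# Hypothetical file names, for illustration only.
cmd = variant_filtration_cmd(
    "GRCh38.fa", "sample.raw.snps.vcf.gz", "sample.filtered.snps.vcf.gz"
)
print(shlex.join(cmd))
```

Building the command as a list keeps each filter expression a single argument, so it can be passed to subprocess.run without shell quoting issues.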

Pipeline Orchestration

The pipeline is orchestrated via Nextflow using the nf-core/sarek framework, a production-validated, community-maintained WGS/WES analysis pipeline. Sarek is among the most widely used open-source WGS pipelines, with contributions from dozens of institutions worldwide.

For air-gapped deployment, all dependencies are pre-staged as Singularity container images (.sif files) on an internet-connected staging machine, then transferred to the air-gapped server via encrypted physical media. The total software + reference data package is approximately 80-100 GB.
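
Because the bundle crosses the air gap on physical media, it is worth confirming on arrival that every file matches what was staged. One way to do that, sketched below with only the Python standard library, is to record a SHA-256 manifest on the staging machine and re-verify it on the air-gapped server. The function names and manifest layout are assumptions for illustration, not part of the published pipeline.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte images do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to the bundle root) to its SHA-256 digest."""
    return {
        p.relative_to(bundle_dir).as_posix(): sha256_file(p)
        for p in sorted(bundle_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(bundle_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose digest differs."""
    problems = []
    for rel_path, expected in manifest.items():
        target = bundle_dir / rel_path
        if not target.is_file() or sha256_file(target) != expected:
            problems.append(rel_path)
    return problems
```

On the staging machine, build_manifest would be run over the directory of .sif images and reference files; on the air-gapped server, verify_manifest should return an empty list before anything in the bundle is used.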

Reference Genome and Resources

Resource | Size
GRCh38 reference FASTA | ~3.3 GB
BWA-MEM2 index | ~30 GB
GATK resource bundle (dbSNP, HapMap, 1000G, Mills) | ~15-20 GB
Total reference data | ~50-60 GB

Performance (Per 30x Genome)

Stage | Tool | Wall Time (CPU) | Wall Time (GPU-accelerated)
Basecalling & demux | bases2fastq | 30-60 min | N/A
QC | FastQC | 10-20 min | N/A
Alignment | BWA-MEM2 | 2-4 hours | ~15-20 min (Parabricks/L40S)
Sort + dedup | samtools + GATK | 1-2 hours | ~8-10 min (Parabricks/L40S)
BQSR | GATK | 1-2 hours | ~8-10 min (Parabricks/L40S)
Variant calling | HaplotypeCaller | 3-6 hours | ~15-20 min (Parabricks/L40S)
Total | - | 8-16 hours | ~60-90 min

Storage Requirements Per Genome

Stage | Size | Retention
Raw sequencer output (shared per run) | 200-300 GB per run | Deleted after FASTQ generation
FASTQ (compressed) | 60-90 GB | Deleted after BAM validation
Analysis-ready BAM + index | 80-100 GB | Delivered to customer, then destroyed
gVCF | 5-10 GB | Delivered to customer, then destroyed
Final VCF | ~1 GB | Delivered to customer, then destroyed
QC reports | ~50 MB | Delivered to customer, then destroyed
Peak working storage | ~400-500 GB | During processing only

With 30 TB usable storage (RAID-10), the server provides approximately 20x the peak working requirement for a single run (3 genomes at up to ~500 GB each, or ~1.5 TB peak). This is sized for roughly 3-5 concurrent or staggered runs plus reference data and software (~100 GB), with headroom for temporary pipeline intermediates. The right-sized storage array reduces CAPEX by ~$78,000 compared to a 24-drive configuration while maintaining ample margin for concurrent processing.
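
The sizing claim can be reproduced from the server specification and the per-genome storage table. The sketch below uses the upper end of the peak working storage range; the variable names are hypothetical.

```python
# Inputs from the server specification and storage tables.
DRIVE_COUNT = 8
DRIVE_CAPACITY_TB = 7.68
PEAK_PER_GENOME_TB = 0.5     # upper end of the ~400-500 GB peak working storage
GENOMES_PER_RUN = 3

raw_tb = DRIVE_COUNT * DRIVE_CAPACITY_TB

# RAID-10 mirrors every drive, so usable capacity is half of raw capacity.
usable_tb = raw_tb / 2

peak_per_run_tb = PEAK_PER_GENOME_TB * GENOMES_PER_RUN
headroom_factor = usable_tb / peak_per_run_tb

print(f"Raw: {raw_tb:.2f} TB, usable: {usable_tb:.2f} TB")
print(f"Peak per run: {peak_per_run_tb:.1f} TB, headroom: {headroom_factor:.1f}x")
```

Usable capacity comes to about 30.7 TB against a 1.5 TB peak per run, which is where the approximately 20x figure comes from.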

5.4 Data Destruction Protocol

PrivDNA will follow NIST SP 800-88 Revision 2 (September 2025), the current authoritative standard for media sanitization, which supersedes the legacy DoD 5220.22-M standard.

Method: Cryptographic Erasure (Purge Level)

All NVMe drives in the server array will be self-encrypting drives (SEDs) with AES-256 encryption enabled from initial deployment. Data destruction will proceed as follows:

  1. Pre-erasure verification: Confirm all deliverables have been transferred to the customer's encrypted USB drive and validated via SHA-256 checksums.
  2. Cryptographic erasure: The drive controller generates a new random Data Encryption Key (DEK), permanently discarding the old key. All previously written data becomes cryptographically irrecoverable.
  3. Completion time: Under 5 seconds per drive, regardless of capacity.
  4. Post-erasure verification: A read check confirms no recoverable data patterns remain.
  5. Documentation: Certificate of Destruction generated with media serial numbers, method, timestamp, technician ID, and verification results.
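
The documentation step lends itself to a small structured record. The following is a minimal sketch of what a Certificate of Destruction could look like as data, using the five fields listed in step 5. The class name, field names, and JSON layout are assumptions for illustration, not PrivDNA's actual certificate format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CertificateOfDestruction:
    media_serials: tuple[str, ...]
    method: str
    timestamp_utc: str
    technician_id: str
    verification_passed: bool

    def to_json(self) -> str:
        """Serialize the certificate with stable key order."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def issue_certificate(
    media_serials,
    technician_id,
    verification_passed,
    method="Cryptographic erasure (NIST SP 800-88 Rev. 2, Purge)",
):
    """Create a certificate only when post-erasure verification succeeded."""
    if not media_serials:
        raise ValueError("at least one media serial number is required")
    if not verification_passed:
        raise ValueError("post-erasure verification failed; no certificate issued")
    return CertificateOfDestruction(
        media_serials=tuple(media_serials),
        method=method,
        timestamp_utc=datetime.now(timezone.utc).isoformat(timespec="seconds"),
        technician_id=technician_id,
        verification_passed=True,
    )
```

Refusing to issue a certificate when verification fails keeps the record from ever asserting a destruction that was not confirmed.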

Cryptographic erasure is superior to overwrite-based methods for NVMe/SSD media because it reaches all data including wear-leveling reserves and over-provisioned blocks that overwrite methods cannot access.

End-of-Life Media Handling

When drives reach end of service life, they undergo physical destruction (NIST SP 800-88 "Destroy" level) via shredding through a certified media destruction vendor.


VI. FACILITY DESIGN AND CUSTOMER EXPERIENCE

6.1 Space Requirements

PrivDNA requires approximately 1,200 square feet of commercial space divided into four zones:

Zone | Size | Purpose
Customer reception and consultation area | ~350 sq ft | Walk-in reception, Genomic Concierge desk, seating, branding
Glass-walled laboratory | ~500 sq ft | Sequencer, sample prep benches, server rack, QC equipment
Secure server room (within lab) | ~100 sq ft | Air-gapped compute server, UPS, environmental controls
Staff area and storage | ~250 sq ft | Consumables storage, staff workspace, restroom

6.2 The Glass Wall

The defining physical feature of PrivDNA is a floor-to-ceiling tempered glass partition (~20 linear feet) separating the customer area from the laboratory. Design specifications:

The glass wall serves a dual purpose:

  1. Trust architecture: Customers see their sample being collected and barcoded at intake, and watch their data being destroyed in real time at the delivery visit. The intervening sequencing and bioinformatics happen on equipment visible through the glass, but at a timescale (hours to days) that is verified through the open-source pipeline rather than watched live.
  2. Marketing: The visible laboratory creates an experiential retail environment unlike any other genomics service, a "theater of science" that generates organic social media attention and word-of-mouth.

6.3 Customer Journey

The customer experience is two brief visits spanning 4-6 business days. Visit 1 (25 minutes total) covers intake and sample collection. The customer then departs. Between visits, the lab completes DNA extraction, library preparation, a ~38-hour sequencing run, and ~60-90 minutes of bioinformatics processing, all visible through the glass wall to anyone physically in the lobby, but not watched live by the customer. Visit 2 (30 minutes) covers data delivery and witnessed destruction.

Visit 1: Intake and Sample Collection (~25 minutes total)

Step 1: Walk-In or Appointment (~15 minutes)

The customer enters the storefront and is greeted by the Genomic Concierge. The concierge explains the process, answers questions about sequencing technology and privacy protocols, and guides the customer through consent documentation.

Step 2: Sample Collection (~10 minutes)

A trained laboratory technician collects a saliva or buccal swab sample using a standard collection kit. The sample is labeled with a unique barcode in view of the customer. The customer then departs; the remaining processing happens between visits.

Between Visits: Lab Processing (4-6 business days)

Step 3: DNA Extraction and Library Preparation (~2-4 hours)

The technician extracts DNA, checks quality (Qubit quantification, TapeStation fragment analysis), and prepares a sequencing library using NEBNext Ultra II FS DNA Library Prep chemistry. This process occurs after the customer departs and is visible through the glass wall to any customers physically present in the lobby.

Step 4: Sequencing (~38 hours)

The library is loaded onto an Element Biosciences AVITI Cloudbreak flow cell. The sequencing run takes approximately 38 hours. The customer is notified when the run completes.

Step 5: Bioinformatics Processing (~60-90 minutes)

The air-gapped server processes raw sequencer data through the open-source pipeline. With GPU acceleration (NVIDIA L40S), this step completes in approximately 60-90 minutes. A CPU-only fallback takes 8-16 hours. The customer does not need to be present.

Visit 2: Data Delivery and Witnessed Destruction (~30 minutes)

Step 6: Handoff and On-Site Destruction

The customer returns to the storefront. The concierge walks them through the results package on their encrypted USB drive, verifying SHA-256 checksums. The customer then witnesses the data destruction process through the glass wall. They receive:

Total Turnaround: 4-6 Business Days

6.4 The Genomic Concierge

The Technical Representative ("Genomic Concierge") is the face of PrivDNA's privacy guarantee and the most customer-facing hire.

Required Profile:

Responsibilities:


VII. REGULATORY AND COMPLIANCE FRAMEWORK

7.1 Federal Requirements

CLIA Certification

Clinical laboratory testing in the United States requires certification under the Clinical Laboratory Improvement Amendments (CLIA), administered by CMS under 42 CFR Part 493. Whole genome sequencing is classified as high-complexity testing, the highest CLIA category.

Process:

  1. Submit CMS-116 application to the NYSDOH (state survey agency for New York)
  2. Receive Certificate of Registration (temporary; allows testing to begin)
  3. Undergo state survey/inspection
  4. Receive Certificate of Compliance

Lab Director Requirements (42 CFR 493.1443): The lab director for a high-complexity CLIA lab must hold either:

Fees: ~$223 biennially for low-volume labs (Schedule A: 3 or fewer specialties, 2,001-10,000 tests/year)

Timeline: Certificate of Registration in 2-4 weeks; full Certificate of Compliance in 3-6 months.

HIPAA

PrivDNA's cash-pay, no-insurance-billing model may place it outside HIPAA's mandatory scope. Covered entity status under 45 CFR 160.103 depends on whether the entity conducts standard electronic transactions (claims, eligibility inquiries, etc.), not on whether it processes health-related data. Because PrivDNA does not bill insurance, does not submit claims, and does not conduct any HIPAA-defined standard transactions, it may not meet the formal covered entity threshold.

Regardless of formal classification, PrivDNA will voluntarily implement HIPAA-equivalent controls as a matter of policy and brand integrity. PrivDNA holds itself to a higher standard than the regulatory minimum:

PrivDNA's air-gapped, zero-retention model exceeds HIPAA requirements by design. HIPAA requires data protection for as long as data is held. PrivDNA eliminates the holding period entirely.

FDA Laboratory-Developed Tests (LDT) Rule

Whole genome sequencing offered as a clinical test by a single laboratory is a laboratory-developed test (LDT), which FDA has asserted falls within its in vitro diagnostic device authority. The FDA finalized its LDT rule in May 2024, amending the definition of in vitro diagnostic products at 21 CFR 809.3 to bring LDTs explicitly under FDA device regulation.

Current enforcement posture. The rule was challenged in American Clinical Laboratory Association v. FDA, consolidated with Association for Molecular Pathology v. FDA, in the U.S. District Court for the Eastern District of Texas, and the court vacated the rule on March 31, 2025. The phased enforcement schedule set out in the 2024 rule is therefore not currently in effect. PrivDNA monitors subsequent FDA action and any legislative proposals on LDT oversight and updates its compliance posture as the regulatory picture clarifies.

PrivDNA's position. PrivDNA plans to comply with FDA LDT requirements on the schedule FDA sets, in parallel with CLIA and NY CLEP certification. PrivDNA's privacy architecture (air-gapped processing, zero retention of genomic data, witnessed destruction) does not alter the test's LDT classification or exempt it from FDA oversight. The service is a clinical laboratory test sold to individuals in the United States; regulatory compliance is scoped accordingly.

Path to market. Depending on the FDA pathway required for consumer whole genome sequencing as an LDT (submission type, review timeline, de novo vs 510(k) considerations), FDA review may extend the time to operational launch beyond the CLIA and CLEP milestones described in Appendix B. PrivDNA's external timelines are communicated as ranges accordingly.

7.2 New York State Requirements

NYSDOH Clinical Laboratory Evaluation Program (CLEP)

New York State requires any laboratory testing specimens originating in New York to hold a NYS clinical laboratory permit, administered by the Wadsworth Center's Clinical Laboratory Evaluation Program. This is one of the most stringent state lab oversight programs in the United States and applies in addition to federal CLIA certification. (For a practical walkthrough of the LDT validation and approval process for NGS assays, see Galbo et al., J. Mol. Diagn., 2025.)

Key requirements:

Fees: $1,100 initial application + $100/year renewal

Timeline: 6-12+ months from application to permit issuance. The LDT review process alone can take several months.

Regulatory basis: NY Public Health Law Article 5, Title V; 10 NYCRR Subpart 58-1

NYC Zoning

NYC's "City of Yes for Economic Opportunity" zoning reform (adopted by the City Council June 6, 2024) significantly expanded where laboratories can operate:

NYC Business Registration

7.3 Voluntary Accreditation

CAP (College of American Pathologists) Accreditation

CAP accreditation is voluntary. CAP holds deemed status under CLIA, so CMS accepts a CAP inspection in place of a routine CLIA survey; CAP accreditation also serves as the industry gold standard for clinical genomics and is often required by referral partners and payers.

Process: Submit application, complete self-inspection against 3,000+ CAP checklist standards, undergo initial on-site peer inspection, then biennial re-inspections.

Cost: $2,000-$10,000/year depending on lab size and complexity.

Timeline: 6-12 months to initial accreditation.

Note: CAP accreditation does not replace CLEP in New York; a CLEP permit is required whether or not the laboratory holds CAP accreditation.

7.4 Liability Structure

PrivDNA's "raw data only" model significantly reduces medical liability exposure:

GINA Limitations Advisory

The Genetic Information Nondiscrimination Act (GINA) prohibits discrimination by health insurers and employers based on genetic information. However, per NHGRI guidance, GINA does not cover life insurance, disability insurance, or long-term care insurance. Customers are advised to consult with an attorney before undergoing WGS if they have pending applications for these types of insurance. PrivDNA includes this advisory in its pre-sequencing consent documentation.

7.5 Policy Registry

This document, the technical manifest, and the live privacy and security pages on privdna.com together constitute PrivDNA's written policy commitments. The table below is a single reference pointing to where each commitment lives and its current status. Customers, auditors, regulators, and journalists can use this table as a shortcut instead of searching the full whitepaper.

Topic | Status | Authoritative location
Waitlist data handling (pre-launch phase) | Live | privdna.com/privacy; covers waitlist email encryption (AES-256-GCM, HMAC-SHA256 duplicate hashing, SQLCipher-encrypted storage), Rybbit cookieless analytics, Cloudflare edge handling, GDPR and CCPA/CPRA and NY SHIELD Act compliance
Vulnerability disclosure and security contact | Live | privdna.com/security-policy plus privdna.com/.well-known/security.txt (RFC 9116)
Open-source waitlist site | Live | github.com/danthi123/PrivDNA
Open-source sequencing and chain-of-custody pipeline | Planned (at launch) | github.com/danthi123/PrivDNA, per §3.4
Companion open-source tool (genomevault): passphrase-based encryption wrapping GA4GH Crypt4GH | Live (standalone; not yet integrated into the PrivDNA customer-delivery workflow) | github.com/danthi123/genomevault · pypi.org/project/genomevault/
Zero-retention data destruction method | In whitepaper | §5.4 (cryptographic erasure, NIST SP 800-88 Rev. 2 Purge level); §6.3 (Visit 2 customer-witnessed ceremony); detailed SOP in the technical manifest §6
Response to subpoenas, warrants, and court orders | In whitepaper | §3.5
Customer sequencing consent form (incorporates GINA advisory and key-loss disclosure) | Planned | Published in the customer portal before the first operational appointment; incorporates §7.4 GINA advisory and §3.2 key-loss disclosure
Key-loss / no-recovery disclosure | In whitepaper | §3.2
GINA insurance-coverage limitations advisory | In whitepaper | §7.4
Anti-kickback and fee-splitting structure for referral partners | In whitepaper | §9.2
Customer consent withdrawal before delivery | In whitepaper | §8.1
Operational record retention (LIMS, QC, business systems) | In whitepaper | §3.2.1
Financial conflicts of interest | In whitepaper | §4.7

Versioning. This whitepaper is dated in its front matter. Material changes to any of the policy commitments above will be reflected in an updated whitepaper revision and, where a separate live page exists (privacy, security), in a "Last updated" date on that page. PrivDNA does not make substantive policy changes silently.


VIII. OPERATIONAL PLAYBOOK

8.1 Staffing Plan

PrivDNA operates with a lean team designed to fit a single-location, single-instrument throughput of roughly 750 genomes per year:

The Office Manager / QA Coordinator bridges two critical gaps: (1) front desk coverage so the Genomic Concierge can focus on in-lab interactions, and (2) compliance/operations documentation that CLIA and CAP require but that would otherwise fall on technical staff.

Customer consent and withdrawal: Customers may withdraw consent and request sample destruction at any time before delivery of final results. In such cases, all biological samples are destroyed, any in-process data is sanitized under the NIST SP 800-88 protocol described in §5.4, and the customer receives a Certificate of Destruction. No charge applies if withdrawal occurs before sequencing begins; a partial fee may apply after sequencing has commenced.

8.2 Daily Operations

Sample Processing Workflow

A single Element AVITI with Cloudbreak flow cells operates on a batch cycle:

  1. Days 1-2: Collect samples, extract DNA, prepare libraries (batch of up to 3 samples per flow cell)
  2. Days 2-4: Load flow cell and begin sequencing run (~38 hours)
  3. Days 4-5: Sequencing completes; bioinformatics pipeline begins (60-90 min with GPU, 8-16 hrs CPU)
  4. Day 5: QC review, data transfer to encrypted USB, customer notification
  5. Days 5-6: Customer pickup, witnessed destruction, certificate generation

At steady state, with overlapping batches (new library prep begins while current run is active), throughput reaches 3-4 runs per week or approximately 9-12 genomes per week (40-52 per month). With optimized batch scheduling, 50-65 genomes per month is achievable.

Equipment Maintenance

Task | Frequency | Responsible
Sequencer daily maintenance wash | After each run | Lab Technician
Flow cell inventory check | Weekly | Lab Technician
Server health monitoring (SMART, RAID status) | Daily (automated alerts) | Bioinformatics Engineer
UPS battery test | Monthly | Bioinformatics Engineer
Environmental monitoring check (temp, humidity) | Daily (automated alerts) | Bioinformatics Engineer
Calibration verification | Quarterly | Lab Technician + Director
Proficiency testing | Per CLEP/CAP schedule | Lab Director

8.3 Quality Control Program

Pre-Analytical QC

Analytical QC

Post-Analytical QC

8.4 Failure Protocols

Failure Type | Detection | Response | Customer Impact
Library prep failure | Low concentration or poor fragment profile | Re-extract and re-prep from stored sample | 2-3 day delay
Sequencing run failure | QC metrics out of spec | Re-sequence with new flow cell | 3-5 day delay
Server hardware failure | RAID degradation, SMART alerts | Hot-swap failed drive; rebuild RAID | No delay (RAID-10 tolerates 1 drive loss per mirror)
Power failure | UPS alarm | UPS provides 15-30 min runtime; graceful shutdown if extended | Minimal (runs resume)
Sample contamination | VerifyBamID or unexpected variants | Re-collect sample from customer | Full restart; no charge for re-run
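
The RAID-10 tolerance noted in the table above can be made concrete with a short check. The sketch below assumes the eight drives are mirrored as adjacent pairs (0,1), (2,3), and so on. The actual pairing depends on how the array is configured, and the function name is hypothetical.

```python
def raid10_survives(failed: set[int], n_drives: int = 8) -> bool:
    """Return True if no mirror pair has lost both of its members.

    Assumption: drives are mirrored as adjacent pairs (0,1), (2,3), ...
    """
    if n_drives % 2:
        raise ValueError("RAID-10 needs an even number of drives")
    return not any({first, first + 1} <= failed for first in range(0, n_drives, 2))


# One failure in every mirror pair: the array is degraded but still readable.
print(raid10_survives({0, 2, 4, 6}))

# Both members of one pair lost: the data on that mirror is gone.
print(raid10_survives({0, 1}))
```

The point of the example is that survivability depends on which drives fail, not only on how many: four failures spread across four mirrors are tolerable, while two failures within one mirror are not.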

IX. REFERRAL PARTNERSHIP MODEL

9.1 The Interpretation Problem

PrivDNA intentionally does not interpret genomic data. This is a strategic decision, not a limitation:

  1. Liability reduction: Medical interpretation triggers clinical diagnostic liability, malpractice insurance requirements, and potentially "covered entity" obligations beyond what raw data delivery requires.
  2. Regulatory simplification: Interpretive services face additional scrutiny from NYSDOH CLEP and may require additional personnel qualifications.
  3. Focus: PrivDNA's competitive advantage is in sequencing and privacy, not in genetic counseling. Attempting both would dilute both.

However, customers who receive raw genomic data naturally want to understand what it means. PrivDNA bridges this gap through a structured referral network.

9.2 Partner Network Structure

PrivDNA maintains a pre-vetted directory of independent genetic counselors, clinical geneticists, and interpretation service providers. Partners are selected based on:

Compensation and Anti-Kickback Structure

PrivDNA does not charge partners for referrals and does not receive referral fees from partners. This preserves the independence of the referral relationship and avoids potential regulatory complications under the federal Anti-Kickback Statute (AKS) and applicable New York fee-splitting and self-referral statutes, including NY Education Law §6530(19) for physicians, §6509-a for other licensed professions, and NY Public Health Law §238-a for clinical laboratory services. Because PrivDNA does not bill any federal health care program (Medicare, Medicaid, TRICARE, etc.), the federal AKS may not apply directly, but PrivDNA structures its referral relationships to comply regardless. Referral partners are responsible for their own AKS compliance with respect to their practices.

Instead, the referral network creates value through:

  1. Customer acquisition: Genetic counselors and clinical geneticists refer individuals who want sequencing to PrivDNA
  2. Customer satisfaction: Customers who can easily access interpretation are more satisfied and more likely to recommend PrivDNA
  3. Brand positioning: Association with licensed clinical professionals reinforces PrivDNA's credibility despite not offering interpretation directly

9.3 Referral Workflow

  1. Customer receives their encrypted USB drive from PrivDNA
  2. Customer receives a referral packet listing vetted interpretation partners
  3. Customer contacts partner directly and shares their data at their own discretion
  4. Partner provides interpretation under their own clinical license
  5. PrivDNA has no involvement in or visibility into the interpretation process

APPENDIX A: GLOSSARY OF KEY TERMS

Term | Definition
WGS | Whole Genome Sequencing: reading the complete 3.2 billion base pairs of the human genome
30x coverage | Each position in the genome is read an average of 30 times, ensuring high accuracy
BAM | Binary Alignment Map: a file format storing sequencing reads aligned to a reference genome
VCF | Variant Call Format: a file listing genetic variants (differences from the reference genome)
gVCF | Genomic VCF: a comprehensive VCF that includes non-variant positions for completeness
FASTQ | A text-based format for storing raw sequencing reads with quality scores
GRCh38/hg38 | The current human reference genome assembly (Genome Reference Consortium, build 38)
Air gap | A physical separation between a computer network and any external network, including the internet
SED | Self-Encrypting Drive: a storage device that automatically encrypts all data at the hardware level
FIPS 140-3 | Federal Information Processing Standard for cryptographic module security (Level 3 = highest practical for portable devices)
CLIA | Clinical Laboratory Improvement Amendments: federal laboratory certification program
CLEP | Clinical Laboratory Evaluation Program: New York State's laboratory oversight program
CAP | College of American Pathologists: voluntary laboratory accreditation program
LDT | Laboratory-Developed Test: a test designed, manufactured, and used within a single laboratory
HIPAA | Health Insurance Portability and Accountability Act: federal health data privacy law
NIST SP 800-88 | National Institute of Standards and Technology Special Publication on media sanitization
BSL-1 | Biosafety Level 1: the basic level of containment for work with well-characterized agents not known to cause disease in healthy adults

APPENDIX B: REGULATORY TIMELINE

Month | Action
0 | Entity formation: NJ LLC filed April 15, 2026
0 | Strategic commitment to in-house operation under full CLIA + NY CLEP (April 16, 2026)
1 | CLEP pre-application scheduling call with Wadsworth Center; identify lab director candidate
1-2 | Formal CLEP pre-application consultation (LDT scope, validation framework, CQ category)
2-3 | Submit Certificate of Qualification (CQ) application for lab director
3 | Submit CLIA CMS-116 application
3 | Submit CLEP initial permit application ($1,100)
3-8 | CLEP LDT validation data development (six-phase framework per Galbo et al. 2025)
3-4 | Execute commercial lease (NYC C-zone)
4-8 | Lab buildout (MEP, glass wall, equipment installation)
6 | Receive CLIA Certificate of Registration
8-12 | CLEP on-site survey
8-12 | CLEP LDT validation review
10-18 | CLEP permit issuance (first-time NGS applicant; variable)
10-12 | Submit CAP accreditation application
12-16 | CAP initial on-site inspection
12-18+ | Operational launch (pending CLIA Certificate of Compliance + CLEP permit)

Timeline assumes qualified lab director identified within 60-90 days of seed close. CLEP LDT review for first-time NGS applicants can extend the schedule; PrivDNA plans for a 12-18 month path to operational launch and communicates that range externally.


APPENDIX C: REFERENCES AND SOURCES

Every reference below is a live link to the primary source. A reference audit log is maintained internally; corrections applied during the April 2026 audit are noted parenthetically with each affected entry.

Market Data

Privacy and Consumer Sentiment

23andMe Collapse and Aftermath

Polygenic Risk Scores and Embryo Selection

Regulatory and Standards

Technology: Sequencing and Bioinformatics


END OF DOCUMENT

PrivDNA | privdna.com | New York, New York

Your genome. Your hands. No copies.
