CBSD · NGS IT Readiness Review

Unified IT Blueprint for MGISeq2K & NovaSeq 6000

Consolidates CBSD legacy deliverables with the latest rack, network, and storage inputs to keep facility, compute, backup, and cloud automation aligned before Q4 2024.

Monthly Throughput: ≥100 · ≈57 TB transfer / month · Report Infrastructure.docx
Sequencer VLAN Bandwidth: 10 Gb · Redundant fabric · VLAN 4 / 7
Thermal Load: ~38,000 BTU/h · Dedicated cold aisle + UPS backup
Go-Live Target: 2024-02-01 · Power + network checkpoints by Dec 20

Full Scenario Architecture

The layered diagram covers the access, control, data/compute, and cloud/DR zones, with VLAN IDs and expected bandwidth, to accelerate onboarding.

Access · VLAN4

User & Lab Edge

172.23.63.0/24 · dual 10Gb uplinks

  • Work Client 1–4, LIMS stations, and lab Wi-Fi carry run-preparation traffic.
  • Dedicated NE1072T uplinks anchor ACLs into the SR630 Crown gateway.
Control · Sequencer

Instrument Fabric

Dual 10Gb each for MGISeq2K/NovaSeq, PacBio, and Bionano

  • New core switches host VLAN4/7 because the NE10032 ports are unavailable.
  • Isolated ACLs bridge to Crown data domains only through SR630 policies.
Data · VLAN7

Storage & Compute Plane

10.0.6.0/24 · 40Gb GPFS fabric

  • New 500TB Storwize/Quantum pools plus a Cohesity tier form the landing zone.
  • SR630/R740xd, GPU, and RAM-heavy nodes deliver WGS/PacBio compute.
Cloud · DR

Hybrid & Resiliency

Crown 179.17.1.0/24 · AWS Direct Connect / VPN

  • Direct Connect, Snowball, and partner MPLS deliver global transport.
  • Cohesity cloud tier provides policy-based backup, extra landing capacity, and offline retention.
SEQ

Sequencers

  • MGISeq2K & NovaSeq stream raw reads and QC metrics.
  • Instrumentation APIs forward run manifests to SR630.
STG

Landing & Storage

  • 500TB staging pools automatically run a checksum and trigger a Cohesity snapshot.
  • Policy engine tags clinical vs. preclinical buckets.
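
The checksum step on the landing pool can be pictured with a minimal sketch. It assumes SHA-256 and a digest recorded at the source; the review does not name the algorithm or tooling, so the function names `sha256_of` and `verify_landing` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream a file through SHA-256 so large run files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_landing(path: Path, expected_hex: str) -> bool:
    """Return True when the landed copy matches the digest recorded at the source."""
    return sha256_of(path) == expected_hex.lower()
```

A snapshot would only be triggered once `verify_landing` returns True for every file in the run; how Cohesity is invoked is outside this sketch.
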
CTL

SR630 Control

  • IAM + ACL determine who can promote data upstream.
  • Automation gateway schedules transfers and writes an audit log for every hop.
CMP

Compute Cluster

  • R740xd / GPU nodes run WGS, WTS, and PacBio pipelines.
  • Signed results are staged for AWS export or local delivery.
AWS

AWS Analytics

  • Direct Connect/S3/Snowball pipelines hand off to global bioinformatics.
  • Results sync back to Crown LIMS and the Cohesity cloud tier.
The flow follows the automated sequencing → storage → control → compute → AWS pipeline validated in recent deployments.

Design Priorities

Scope
  • Fast-track VLAN4/7 commissioning for sequencers
  • Reserve 2×42U racks + UPS footprint
  • Align AWS/Direct Connect handoff with MPLS delivery

Delivery Window

Milestones
  • Dec 20 · Power + network baseline ready
  • Jan 15 · Storage and backup commissioning
  • Feb 1 · Sequencer & AWS automation online

Current State Snapshot

Data Hall Constraints
  • No spare racks; evaluate lab-adjacent space or reclaim archive aisle capacity immediately.
  • Existing NE10032/NE1072T capacity cannot be reused; plan for new core/ToR switches to host VLAN4/7.
  • Storwize pool is at capacity and no landing space remains, so procure an additional 500TB of staging storage immediately.
Environment & Power Risks
  • Confirm CRAC tonnage to cover 38k BTU/h in the server room; current HVAC capacity is unknown.
  • No dedicated 30A circuits in the lab; NovaSeq, PacBio, and tape feeds plus UPS capacity must be revalidated.
  • UPS coverage is missing for both server room and lab racks; budget for dual UPS strings and bypass panels.
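
To make the CRAC check concrete, the sketch below converts the 38k BTU/h load into kilowatts and tons of refrigeration using the standard conversion factors. The 20% safety margin and the function names are assumptions for illustration, not figures from the review.

```python
WATTS_PER_BTU_H = 0.29307107  # 1 BTU/h expressed in watts
BTU_H_PER_TON = 12_000        # 1 ton of refrigeration in BTU/h


def thermal_load(btu_per_hour: float) -> dict:
    """Convert a heat load in BTU/h to kilowatts and tons of refrigeration."""
    return {
        "kw": btu_per_hour * WATTS_PER_BTU_H / 1000,
        "tons": btu_per_hour / BTU_H_PER_TON,
    }


def crac_covers_load(btu_per_hour: float, crac_tons: float, margin: float = 0.2) -> bool:
    """True when the CRAC rating covers the load plus a margin (assumed 20%)."""
    return crac_tons >= thermal_load(btu_per_hour)["tons"] * (1 + margin)
```

For 38,000 BTU/h this gives roughly 11.1 kW, or about 3.2 tons before any margin is applied.
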

Readiness Checklist · 8 Domains

Server Room / Data Center
  • Finalize new rack locations and floor loading for 2×42U racks plus UPS blocks.
  • Deploy new core/ToR switches to deliver 10/40Gb with dual fiber paths from the lab to the server room.
  • Define 500TB clinical + 500TB preclinical zoning and tiering.
Power & UPS
  • Deliver 30A×1 (PacBio), 15A×1 (NovaSeq), 20–30A×2 (Tape) before Dec 20.
  • Provide redundant UPS strings and static bypass for sequencers, storage, and core network.
  • Lock ownership between facility staff and external electricians with clear outage windows.
Network & Security
  • Build redundant VLAN4 (172.23.63.0/24) fabric with ACLs toward Crown core services.
  • Extend 40Gb GPFS VLAN7 (10.0.6.0/24) to AFM/NSD/Protocol nodes.
  • Budget Direct Connect/S3/Snowball automation plus VPN to CBCN.
Storage & Backup
  • Provision dual 500TB landing zones with 40Gb storage fabric.
  • Use Cohesity clusters for both backup and extended-capacity tiers.
  • Confirm retention targets and DR metrics (RPO/RTO).
Compute & Workstations
  • On-prem HPC: PowerEdge R740xd compute + SR630 management.
  • Add GPU and high-memory nodes for WGS/PacBio (>3000 CPU hrs/sample).
  • Deploy two QC workstations + NAS on VLAN4 for run validation.
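
The 3000 CPU hrs/sample figure translates into a core count once a monthly sample volume is fixed. The sketch below is a rough sizing aid only: the 70% utilization default and the 30-day month are assumptions, and the sample count passed in is hypothetical because the review does not state a unit for its monthly throughput figure.

```python
import math


def cores_required(
    samples_per_month: int,
    cpu_hours_per_sample: float,
    utilization: float = 0.7,        # assumed average cluster utilization
    hours_per_month: float = 30 * 24,
) -> int:
    """Cores needed to finish the monthly CPU-hour demand within the month."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_cpu_hours = samples_per_month * cpu_hours_per_sample
    return math.ceil(total_cpu_hours / (hours_per_month * utilization))
```

With a hypothetical 100 samples per month at 3000 CPU hours each, the sketch returns 596 cores at 70% utilization and 417 cores at full utilization.
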
Data Flow & Workflow
  • Sequencer → NE1072T → SR630 gateway → Storwize/Quantum → analytics.
  • Automate AWS transfers (Direct Connect/S3/Snowball) budgeted at ≥12k USD/month.
  • Integrate LIMS and Cohesity for end-to-end audit trails.
Environmental
  • Validate HVAC capacity, temperature/humidity monitoring, and alarm points.
  • Engineer NovaSeq chimney exhaust and cold/hot aisle separation.
  • Add cable-management baffles to avoid hotspots.
Timeline & Budget
  • Milestones: 12/20 power/network, 01/15 storage/backup, 02/01 run.
  • Clarify CapEx/OpEx split and Phase-1 cloud fallback.

Sequencer Architecture

MGISeq2K

  • VLAN4 / 172.23.63.6, 10Gb into new NE1072T, dual uplink to core.
  • Writes to 500TB preclinical landing (Storwize → GPFS → Cohesity/Tape).
  • MGI Agent on SR630 gateway syncs to on-prem cluster or AWS buckets.
  • Power: UPS + dedicated 15A circuit with monitoring.

NovaSeq 6000

  • VLAN4 / 172.23.63.7 with isolated ACLs, redundant 10Gb paths.
  • Feeds 500TB clinical pool before sharing through 40Gb storage LAN.
  • ≥64GB RAM + 3000 CPU hrs/sample for WGS/SNV workloads.
  • 15A circuit + UPS; confirm chimney exhaust path.

Shared Components

  • SR630 management, SR650/PacBio nodes for future growth.
  • NAS/Protocol/NSD/AFM ride the 40Gb VLAN7 GPFS fabric.
  • Crown interface (179.17.1.0/24) and AWS Direct Connect traverse SR630 gateway.
  • Cohesity delivers cold tiering, air-gap snapshots, and export workflows.

Network & Address Plan

VLAN | Purpose | Subnet / Rate | Key Nodes
2 | IPMI / Management | 172.23.64.0/24 · 1Gb | Sw-Mgt-01/02, PDU, UPS, SR630 BMC
3 | Platform Services | 172.23.69.0/24 · 1Gb | AFM, Protocol, NSD, Archive Fabric
4 | Sequencer & Workstations | 172.23.63.0/24 · 10Gb | MGISeq, NovaSeq, PacBio, Bionano, QC clients
5 | AWS Transfer / Automation | 10.0.50.0/24 · 1Gb | SR630 Gateway, AWS Transfer Family, Automation Agents
6 | Crown Interface | 179.17.1.0/24 | SR630 Crown Data Gateway, VPN
7 | Data / GPFS | 10.0.6.0/24 · 40Gb | AFM/NSD/Protocol/Archive/SR630/SR650
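
Before the VLAN/IP plan is locked, the subnets can be sanity-checked with Python's standard `ipaddress` module. The sketch below is illustrative: the `VLAN_PLAN` dictionary simply restates the table, and the helper names are hypothetical.

```python
import ipaddress
from itertools import combinations

# Subnets copied from the address plan table, keyed by VLAN ID.
VLAN_PLAN = {
    2: "172.23.64.0/24",
    3: "172.23.69.0/24",
    4: "172.23.63.0/24",
    5: "10.0.50.0/24",
    6: "179.17.1.0/24",
    7: "10.0.6.0/24",
}


def find_overlaps(plan: dict) -> list:
    """Return every pair of VLAN IDs whose subnets overlap."""
    nets = {vlan: ipaddress.ip_network(cidr) for vlan, cidr in plan.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


def vlan_for(address: str, plan: dict):
    """Return the VLAN ID whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for vlan, cidr in plan.items():
        if ip in ipaddress.ip_network(cidr):
            return vlan
    return None
```

Here `find_overlaps(VLAN_PLAN)` returns an empty list, and the two sequencer addresses 172.23.63.6 and 172.23.63.7 both resolve to VLAN 4.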

Storage & Tiering Strategy

Landing & Primary

  • Greenfield build: deploy dual 500TB landing pods (Storwize + NVMe cache) with zero legacy storage.
  • Define clinical vs. preclinical zoning and growth policy before hardware arrives.
  • Commission new 40Gb fabric (replacement core + FC Switch 8969-F24) to link sequencer → compute → storage.

Backup & Archive

  • Cohesity acts as the backup + extra storage tier with immutable snapshots.
  • Policy-based tiering from Cohesity to AWS Glacier Deep Archive (0.002 USD/GB per month) or the N. Virginia promo rate (1 USD/TB).
  • Expansion shelves can be added to Cohesity for landing overflow prior to cloud tiering.
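
The two archive rates above can be compared with a short calculation. This sketch assumes both rates are monthly and that 1 TB is counted as 1024 GB; those assumptions reproduce the roughly 205 USD/month quoted for 100 TB in the transfer options table. The function names are illustrative.

```python
GB_PER_TB = 1024  # assumption: binary units, which matches the ~205 USD figure


def monthly_cost_per_gb_rate(terabytes: float, usd_per_gb: float) -> float:
    """Monthly archive cost when the rate is quoted per GB."""
    return terabytes * GB_PER_TB * usd_per_gb


def monthly_cost_per_tb_rate(terabytes: float, usd_per_tb: float) -> float:
    """Monthly archive cost when the rate is quoted per TB."""
    return terabytes * usd_per_tb
```

For 100 TB, the per-GB rate of 0.002 USD gives about 204.80 USD per month, while the 1 USD/TB rate gives 100 USD per month.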

Data Transfer & Cloud Connectivity

57 TB/month of cross-border traffic calls for ≥264 Mbps of sustained throughput to close within 30 days.
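
The bandwidth target can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 TB = 10^12 bytes) and a link that is busy around the clock; the function name is illustrative.

```python
def required_mbps(terabytes: float, days: float, bytes_per_tb: int = 10**12) -> float:
    """Sustained rate in Mbps needed to move the volume within the window."""
    bits = terabytes * bytes_per_tb * 8
    seconds = days * 86_400
    return bits / seconds / 1e6
```

Under these assumptions, 57 TB over a full 30 days works out to about 176 Mbps, and 264 Mbps matches the same volume moved in about 20 days. The 264 Mbps target is therefore consistent with roughly 1.5× headroom over the raw 30-day rate; the source does not say which interpretation was intended.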

Option | Description | Timeline / SLA | Reference Cost
AWS Direct + Public | Suzhou → HK POP → Glacier / Direct Connect → Zurich/Geneva. | Depends on 1Gbps internet; Snowball is limited; cross-border links are self-managed. | 100TB transfer ≈12,237 USD/month; Glacier Deep Archive 205 USD/month.

Short term: AWS S3 + Snowball; long term: blend dedicated circuits/VPN with AWS plus Cohesity tiering.

Cost Breakdown for PPT

Category | Scope / Key Actions | Estimate | Notes
Facility | New racks, floor reinforcement, HVAC, UPS, structured cabling | 45k – 80k USD | Power/CRAC upgrades due Dec 20
Network | New core/ToR, 40Gb storage switches, fiber plant | 35k – 60k USD (excl. circuits) | Direct Connect / partner VPN adds 12k–20k USD/month OpEx
Compute | SR630 management, R740xd compute, GPU/RAM nodes, workstations | 150k – 250k USD | Includes Crown interface/Agent gateway, NAS
Storage | New Storwize clusters, NVMe landing shelves, 40Gb FC, NAS | 300k – 1.7M USD | Variante 1–4 sourced from global vendors
Backup | Cohesity clusters, expansion shelves, Glacier tiering | 120k – 220k USD | Includes 5-year 24×7×4 intl. support
Data Transfer | AWS fees / circuits / VPN / Snowball | 12k USD+/month (cloud) + 15k–25k USD/month if dedicated circuits | Tie into Cohesity Cloud Tier automation
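
For the summary slide, the one-time and recurring figures in the table can be rolled up as follows. This is a sketch: values are in thousands of USD as listed, the five hardware and facility rows are treated as CapEx, and only the Data Transfer row is counted as monthly OpEx. That split is an assumption, since the review still lists the CapEx/OpEx split as open.

```python
# (low, high) estimates in thousands of USD, copied from the table.
CAPEX_KUSD = {
    "Facility": (45, 80),
    "Network": (35, 60),
    "Compute": (150, 250),
    "Storage": (300, 1700),
    "Backup": (120, 220),
}
CLOUD_FLOOR_KUSD = 12      # "12k USD+/month" is a floor, not a cap
CIRCUIT_KUSD = (15, 25)    # added only if dedicated circuits are ordered


def capex_range(ranges: dict) -> tuple:
    """Sum the low and high ends of every category."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


def monthly_transfer_opex(with_dedicated_circuit: bool) -> tuple:
    """Monthly data-transfer spend range, using the cloud floor as the base."""
    low = high = CLOUD_FLOOR_KUSD
    if with_dedicated_circuit:
        low += CIRCUIT_KUSD[0]
        high += CIRCUIT_KUSD[1]
    return low, high
```

The CapEx rows sum to 650k–2.31M USD, and data transfer comes to at least 12k USD/month, or 27k–37k USD/month with dedicated circuits.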

Global Pricing Benchmarks

Benchmarks sourced from U.S. / EU vendors to guide CBSD negotiations.

Category | Vendor / Model | Highlights | Ref. Price (USD)
HPC Node | Dell PowerEdge R740xd (US) | 2×Gold 6248R, 384GB RAM, 4×GPU ready, 25Gb NIC | ~34,500 w/ NBD ProSupport
Storage | Quantum QXS-4 (EU) | 24×16TB NL-SAS, dual controllers, 40GbE | ~92,000 per array
Backup | Cohesity C4000 Cluster (US) | 4×nodes, 120TB usable, CloudArchive/CloudTier licenses | ~185,000 incl. 3-year support
Cloud Network | AWS Direct Connect 1Gb (HK) | Cross connect + port + data transfer | 0.25 USD/h port + 0.025 USD/GB data
Dedicated Circuit | Equinix ECX + Partner MPLS | 1Gb dual sites + proactive monitoring | ~32,000/month (36 mo term)
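
As a rough illustration of how the Direct Connect benchmark scales with volume, the sketch below applies the two listed rates to a monthly transfer. The 720-hour month and 1024 GB per TB are assumptions, cross-connect fees are not modeled, and the function name is hypothetical.

```python
def direct_connect_monthly_usd(
    tb_per_month: float,
    port_usd_per_hour: float = 0.25,   # benchmark port rate from the table
    data_usd_per_gb: float = 0.025,    # benchmark data rate from the table
    hours_per_month: float = 720,      # assumed 30-day month
    gb_per_tb: int = 1024,             # assumed binary units
) -> float:
    """Port-hours plus per-GB transfer charges for one month."""
    port_cost = port_usd_per_hour * hours_per_month
    data_cost = tb_per_month * gb_per_tb * data_usd_per_gb
    return port_cost + data_cost
```

At the planned 57 TB/month this comes to about 1,639 USD, of which 180 USD is the port charge.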

Plan → Quote → Build → Run

Baseline

  • Inventory room/HVAC/network/storage assets.
  • Answer eight key questions (power, racks, network, storage, compute, flows, environment, budget).
  • Lock VLAN/IP/ACL plus AWS dataflow diagram.

Design & Pricing

  • Split budget by facility/network/compute/storage/backup/transfer.
  • Collect Quantum/Cohesity/circuit lead times.
  • Approve Phase-1 cloud vs. Phase-2 on-prem compute strategy.

Procure & Deploy

  • Execute room retrofit, power, UPS, cabling; stand up 10/40Gb network.
  • Rack SR630/R740xd/Storwize/Cohesity; finalize VLAN/ACL.
  • Test sequencer UPS, automation gateways, NAS/workstation connectivity.

Operate

  • Automate LIMS → dataflow → AWS Transfer → Cohesity tiering & CloudArchive.
  • Enable monitoring for HVAC, UPS, current, VLAN health, Spectrum Scale.
  • Publish SOPs for room checks, incident response, and cloud cost control.

Open Questions