Ptah: Secure Edge-AI & Post-Quantum Crypto in Space Systems

Dr. Mohamed El-Hadedy • RSCL @ Cal Poly Pomona

mealy@cpp.edu | 909-869-2594


NASA MINDS 2019 NASA MINDS 2020 NASA MINDS 2021 NASA MINDS 2022 NASA MINDS 2023 NASA MINDS 2024 NASA MINDS 2025

Project Description

The Ptah project is an innovative curriculum and reference implementation designed for the emerging field of secure edge-AI in space and terrestrial applications. Combining hardware diversity—RISC-V accelerators, Raspberry Pi clusters, and NVIDIA edge GPUs—with state-of-the-art cryptography (post-quantum lattice schemes and lightweight AEAD), Ptah demonstrates how to architect resilient, future-proof systems under stringent power, weight, and environmental constraints. Participants will learn to deploy containerized microservices across heterogeneous clusters, orchestrate workloads with K3s, instrument telemetry pipelines with PQC signatures, and perform real-time monitoring using Prometheus and Grafana. Over a 15-week course, students engage in hands-on labs, benchmarking, and system integration, culminating in a comprehensive final quiz covering cryptography, orchestration, hardware design, and performance evaluation.

🔐 Post-Quantum Cryptography (PQC)

Imagine a future where large-scale quantum computers break today’s public-key encryption. A sufficiently large quantum computer running Shor’s algorithm would defeat RSA and elliptic-curve schemes, so to safeguard critical spacecraft and edge-computing nodes against that threat, we turn to Post-Quantum Cryptography (PQC). Algorithms like CRYSTALS-Dilithium and CRYSTALS-Kyber (standardized by NIST as ML-DSA and ML-KEM) are built on lattice problems for which no efficient classical or quantum algorithm is known.

In the harsh environment of space, where remote satellites and deep-space probes cannot be patched on the fly, PQC ensures that firmware updates remain authentic and unforgeable for decades. On terrestrial edge systems—drones, unmanned rovers, and IoT sensors—“harvest-now, decrypt-later” attacks become futile because every telemetry packet, command stream, and key exchange is secured against future quantum decryption.

Why Lattices? Lattice-based schemes provide fast operations and moderate key and signature sizes (roughly one to a few kilobytes) without sacrificing security.
- Dilithium delivers digital signatures, so every software bundle, sensor reading, or inter-device handshake bears a quantum-resistant stamp of authenticity.
- Kyber provides key encapsulation, allowing ground stations to establish shared secrets with spacecraft or edge nodes in a way that remains confidential even under quantum attack.
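
To make the cost of lattice-based schemes concrete, the sketch below estimates the bandwidth needed when every telemetry packet carries its own signature. The byte counts are the published sizes for the Dilithium2 and Kyber512 parameter sets; the 128-byte payload and 10 Hz packet rate are illustrative assumptions, not Ptah measurements.

```python
# Published object sizes in bytes for the smallest parameter sets.
DILITHIUM2 = {"public_key": 1312, "signature": 2420}
KYBER512 = {"public_key": 800, "ciphertext": 768}


def signed_stream_bandwidth(payload_bytes, rate_hz, signature_bytes):
    """Bytes per second when each packet carries payload plus signature."""
    return (payload_bytes + signature_bytes) * rate_hz


def signature_overhead(payload_bytes, signature_bytes):
    """Fraction of each packet that is signature rather than payload."""
    return signature_bytes / (payload_bytes + signature_bytes)


def kem_handshake_bytes(kem):
    """Bytes exchanged in one key establishment: public key out, ciphertext back."""
    return kem["public_key"] + kem["ciphertext"]


PAYLOAD_BYTES = 128   # hypothetical telemetry payload
RATE_HZ = 10          # hypothetical packet rate

bandwidth = signed_stream_bandwidth(PAYLOAD_BYTES, RATE_HZ, DILITHIUM2["signature"])
overhead = signature_overhead(PAYLOAD_BYTES, DILITHIUM2["signature"])
handshake = kem_handshake_bytes(KYBER512)

print(f"signed stream: {bandwidth} B/s, signature share: {overhead:.0%}")
print(f"one Kyber512 handshake: {handshake} B")
```

With small payloads the signature dominates the packet, which is one reason a design might sign batches of readings rather than each reading individually.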

By integrating PQC into our Ptah framework, we not only future-proof critical systems but do so with performance tuned for power-and-weight-constrained platforms. The result is a security foundation that remains unshakable in the quantum era—because in space, tomorrow’s threats demand today’s unbreakable cryptography.

🔒 Lightweight Cryptography

While traditional ciphers like AES excel in data centers, where hardware acceleration is common, software implementations can be too heavy for tiny, battery-powered edge nodes. Lightweight cryptography fills that gap by delivering strong security with a minimal footprint in CPU cycles, RAM, and power.

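Lightweight AEAD vs. Block-Cipher AEAD Comparison

Note: AES-GCM is itself an AEAD mode, so the comparison is between a permutation-based design (Ascon) and a block-cipher-based one (AES-GCM).

Feature                                   | Ascon (permutation-based AEAD) | AES-GCM (block cipher + GHASH)
------------------------------------------|--------------------------------|-----------------------------------------
Encryption + Authentication               | Single pass, one primitive     | Two primitives (AES-CTR, then GHASH tag)
Code Size                                 | ≈ 2 – 5 kB                     | ≈ 10 – 20 kB
RAM Usage                                 | ≈ 200 – 500 bytes              | ≈ 1 – 2 kB
Throughput (cycles/byte, lower is better) | 2 – 5                          | 10 – 15
Security Goal                             | Confidentiality & Authenticity | Confidentiality & Authenticity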

Ascon Internals

[Figure: Ascon sponge diagram]

Property                | Value (Ascon-128)
------------------------|----------------------------------------------
Permutation Size        | 320 bits (5 × 64-bit lanes)
Rate                    | 64 bits / 8 bytes per absorption/squeeze
Initialization Rounds   | 12
Intermediate Rounds     | 6
Finalization Rounds     | 12
Key Size                | 128 bits (160 bits in the Ascon-80pq variant)
Nonce Size              | 128 bits
Tag Size                | 128 bits
Performance (Cortex-M4) | ≈ 1 MB/s

Ascon’s design is built around a sponge construction: the key and nonce initialize an internal state, data is absorbed into that state block by block, and the state is permuted between blocks. This single-pass approach (absorb, permute, squeeze) provides both encryption and authentication in one pass, cutting code size and RAM needs by up to 50% compared with AES-GCM on the same hardware.
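
The absorb, permute, squeeze flow can be illustrated with a toy duplex sponge. This is a structural sketch only: the state update `_mix` is a placeholder built from SHAKE-256, not the Ascon permutation, associated data is omitted, and the function names are hypothetical. It keeps Ascon-128's 320-bit state, 64-bit rate, and 128-bit key, nonce, and tag sizes so the data flow matches the description, but it must not be used to protect real data.

```python
import hashlib
import hmac

RATE = 8            # bytes absorbed or squeezed per step (64-bit rate)
STATE_BYTES = 40    # 320-bit state
KEY_BYTES = NONCE_BYTES = TAG_BYTES = 16


def _mix(state: bytes) -> bytes:
    """Placeholder state update. NOT the Ascon permutation."""
    return hashlib.shake_256(state).digest(STATE_BYTES)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _duplex(key: bytes, nonce: bytes, data: bytes, decrypt: bool):
    if len(key) != KEY_BYTES or len(nonce) != NONCE_BYTES:
        raise ValueError("key and nonce must be 16 bytes each")
    # Initialization: load key and nonce into the state, then mix.
    state = _mix(bytes(RATE) + key + nonce)
    out = bytearray()
    full = len(data) - len(data) % RATE
    for i in range(0, full, RATE):
        block_in = data[i:i + RATE]
        block_out = _xor(state[:RATE], block_in)   # squeeze keystream, absorb data
        out += block_out
        cipher_block = block_in if decrypt else block_out
        state = _mix(cipher_block + state[RATE:])  # ciphertext becomes the new rate
    # Final partial block: pad the plaintext with 0x80 followed by zeros.
    tail_in = data[full:]
    tail_out = _xor(state[:len(tail_in)], tail_in)
    out += tail_out
    plain_tail = tail_out if decrypt else tail_in
    padded = plain_tail + b"\x80" + bytes(RATE - len(plain_tail) - 1)
    state = _mix(_xor(state[:RATE], padded) + state[RATE:])
    # Finalization: mix the key in again and squeeze the tag.
    keyed = _xor(state, bytes(RATE) + key + bytes(STATE_BYTES - RATE - KEY_BYTES))
    return bytes(out), _mix(keyed)[-TAG_BYTES:]


def seal(key: bytes, nonce: bytes, plaintext: bytes):
    """Encrypt and authenticate in one pass; returns (ciphertext, tag)."""
    return _duplex(key, nonce, plaintext, decrypt=False)


def open_sealed(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Decrypt, and raise ValueError if the tag does not match."""
    plaintext, expected = _duplex(key, nonce, ciphertext, decrypt=True)
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return plaintext


key, nonce = bytes(range(16)), bytes(range(16, 32))
ciphertext, tag = seal(key, nonce, b"telemetry frame 0001")
print(open_sealed(key, nonce, ciphertext, tag))
```

Encryption and tag generation share one state and one pass over the data, which is where the code-size and RAM savings of a sponge design come from.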

Security Strength vs. Block Ciphers

Security Aspect         | Ascon (128-bit key)                | AES-128 (GCM)
------------------------|------------------------------------|--------------------------------
Bit-security            | ≥ 128 bits                         | 128 bits
Integrity Bound         | 2⁶⁴ forgery bound                  | 2⁶⁴ forgery bound
Side-Channel Resistance | Simple permutation, easier to mask | Complex S-boxes, harder to mask

By choosing ASCON for Ptah’s edge modules, we ensure each micro-controller—or even a small FPGA slice—can authenticate and encrypt telemetry with minimal overhead, leaving headroom for sensor processing and control loops.

⚙️ Orchestration Frameworks

Managing a distributed Edge-AI/PQC cluster requires a lightweight yet powerful orchestrator. Below we compare three leading container orchestration platforms on footprint, feature set, and resource utilization—then dive deeper into how GPU scheduling and CPU allocation work in K3s for drones and UGVs.

Cluster Topology Diagram

Feature & Footprint Comparison

Framework    | Binary Size | Memory Overhead [1] | Supported APIs         | Ideal Use Case
-------------|-------------|---------------------|------------------------|------------------------------------
Docker Swarm | ~200 MB     | ~150 MB             | Core Swarm, Stacks     | Simple clusters & rapid prototyping
K3s          | ~50 MB      | ~70 MB              | Kubernetes v1.x (core) | Edge/IoT & power-constrained nodes
Kubernetes   | ~1 GB+      | ~1 GB+              | Full k8s API           | Enterprise datacenters

[1] Memory measured as RSS of control-plane components on a baseline Pi 4.

CPU & GPU Resource Allocation

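In K3s, as in any Kubernetes distribution, each node reports its CPU and memory capacity automatically, and GPUs appear as the extended resource nvidia.com/gpu once the NVIDIA device plugin is running on the node. Pods then request these resources in their specs. Below is an example of how a PQC service and an AI inference service would request resources: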

# PQC signature service (runs on any CPU node)
resources:
  requests:
    cpu: "0.5"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"

# AI inference service (runs on GPU-enabled node)
resources:
  limits:
    nvidia.com/gpu: 1
    memory: "1Gi"
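
The scheduler places pods according to their requests, so a quick capacity estimate only needs to parse those quantity strings. The sketch below handles just the quantity forms used in this document (decimal cores, millicores, and Ki/Mi/Gi suffixes); the node's allocatable figures are hypothetical, and the helper names are not part of any Kubernetes API.

```python
_MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity to millicores: '0.5' -> 500, '250m' -> 250."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity to bytes: '256Mi' -> 268435456."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def max_replicas(allocatable: dict, requests: dict) -> int:
    """How many identical pods fit on one node, judged by requests only."""
    by_cpu = parse_cpu(allocatable["cpu"]) // parse_cpu(requests["cpu"])
    by_memory = parse_memory(allocatable["memory"]) // parse_memory(requests["memory"])
    return min(by_cpu, by_memory)


# Hypothetical allocatable resources for a 4-core node after system reservations.
cm4_node = {"cpu": "4", "memory": "3500Mi"}
# Requests from the PQC signature service shown earlier in this section.
pqc_signer = {"cpu": "0.5", "memory": "256Mi"}

print(max_replicas(cm4_node, pqc_signer))
```

When a container sets only limits, as the AI inference service does, Kubernetes uses the limit values as the requests, so the same calculation applies with the limit figures.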

Performance Estimates

Node Type              | CPU Cores | Clock (GHz) | GPU Cores     | Approx. Throughput
-----------------------|-----------|-------------|---------------|--------------------------------------------------------
Raspberry Pi CM4       | 4         | 1.50        | -             | ~200 Dilithium ops/sec
TRK1 (Rockchip RK3588) | 8         | 2.40        | -             | ~1,200 Dilithium ops/sec
Jetson Nano            | 4         | 1.43        | 128 (Maxwell) | GPU: ~500 Ascon ops/sec; CPU: ~400 Dilithium ops/sec
Jetson Orin NX         | 6         | 2.20        | 1024 (Ampere) | GPU: ~5,000 Ascon ops/sec; CPU: ~800 Dilithium ops/sec
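
These estimates translate directly into how many signed telemetry streams a node can sustain. The sketch below divides each node's approximate Dilithium throughput by the per-stream packet rate; the 50% headroom factor (capacity left for other workloads) and the 10 Hz stream rate are assumptions chosen for illustration.

```python
# Approximate CPU Dilithium throughput (ops/sec) from the table.
DILITHIUM_OPS_PER_SEC = {
    "Raspberry Pi CM4": 200,
    "TRK1 (Rockchip RK3588)": 1200,
    "Jetson Nano": 400,
    "Jetson Orin NX": 800,
}


def max_signed_streams(ops_per_sec, packets_per_sec, headroom=0.5):
    """Streams a node can sign if only `headroom` of its capacity is used."""
    return int(ops_per_sec * headroom // packets_per_sec)


STREAM_RATE_HZ = 10  # assumed packets per second per stream

capacity = {
    node: max_signed_streams(ops, STREAM_RATE_HZ)
    for node, ops in DILITHIUM_OPS_PER_SEC.items()
}
for node, streams in capacity.items():
    print(f"{node}: {streams} streams at {STREAM_RATE_HZ} Hz")
```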

Which to Choose?

By using K3s with fine-grained resource requests and node labels, you can orchestrate a heterogeneous cluster that maximizes both performance and power-efficiency—crucial attributes for computer architects designing next-generation edge-AI & space systems.

Hardware Architectures

TuringPi TRK1, Raspberry Pi CM4, Jetson Nano, Jetson Orin NX, ClusterHat 2.5

🛰️ Telemetry & GPS Integration

Robust, low-latency telemetry and precise positioning are critical for autonomous drones, rovers, and space systems. In Ptah, each node—whether a Pi Zero W, CM4, TRK1, or Orin NX—connects to a GNSS receiver (GPS+GLONASS+Beidou) via UART or USB. A dedicated telemetry pod under K3s executes this pipeline:

  1. Acquisition: Multi-constellation fixes at 1–10 Hz (HDOP ≤ 3).
  2. Parsing & Filtering: Normalize NMEA sentences, drop low-accuracy fixes, correct drift.
  3. Cryptographic Protection:
    • Dilithium signature (~2 ms on CM4, < 0.5 ms on Orin NX)
    • Kyber KEM encapsulation (~1 ms on CM4)
  4. Publication & QoS:
    • MQTT (QoS 2) for exactly-once delivery
    • HTTP/2 + TLS-PQC for sub-10 ms end-to-end latency
  5. Self-Healing: K3s probes restart any failed pod within seconds.
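
Steps 1 and 2 of the pipeline can be sketched as a small NMEA parser that validates the checksum of a GGA sentence and drops fixes whose HDOP exceeds the threshold. This is a minimal illustration using only the standard library; drift correction is not covered, the function names are hypothetical, and the sample sentence is constructed in the code rather than captured from a receiver.

```python
from functools import reduce
from typing import Optional

MAX_HDOP = 3.0  # accuracy threshold from step 1


def nmea_checksum(body: str) -> int:
    """XOR of every character between '$' and '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def frame(body: str) -> str:
    """Wrap a sentence body with '$', '*', and its two-digit hex checksum."""
    return f"${body}*{nmea_checksum(body):02X}"


def _to_degrees(value: str, hemisphere: str) -> float:
    """Convert (d)ddmm.mmmm plus hemisphere to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> Optional[dict]:
    """Return a fix dict, or None if the sentence is invalid or too inaccurate."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, _, given = sentence[1:].partition("*")
    try:
        if int(given, 16) != nmea_checksum(body):
            return None
        fields = body.split(",")
        # Accept any talker ID (GP, GN, GL, ...) as long as the type is GGA.
        if fields[0][2:] != "GGA" or fields[6] == "0":  # quality 0 means no fix
            return None
        hdop = float(fields[8])
        if hdop > MAX_HDOP:
            return None
        return {
            "lat": _to_degrees(fields[2], fields[3]),
            "lon": _to_degrees(fields[4], fields[5]),
            "satellites": int(fields[7]),
            "hdop": hdop,
        }
    except (ValueError, IndexError):
        return None


good = frame("GNGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,")
poor = frame("GNGGA,123520,4807.038,N,01131.000,E,1,04,4.2,545.4,M,46.9,M,,")
print(parse_gga(good))
print(parse_gga(poor))
```

Only fixes that pass both checks would move on to the signing and publication steps.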

Performance & Accuracy Metrics

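Metric                      | Pi Zero W | Compute Module 4 | TRK1 / Orin NX
----------------------------|-----------|------------------|---------------
GNSS Fix Rate (Hz)          | 1         | 5                | 10
Dilithium Sign Latency (ms) | 8.0       | 2.1              | <0.5
Kyber KEM Latency (ms)      | 6.3       | 1.2              | 0.3
End-to-End Delay (ms)       | 20.5      | 8.4              | 3.2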
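
A quick way to read these figures is to ask how much of the end-to-end delay is spent on cryptography and how much of each fix period the delay consumes. The sketch below does that arithmetic on the table's values, entering "<0.5" as its upper bound of 0.5 ms and assuming, which the table does not state, that one signature and one KEM operation run per packet.

```python
# Values from the metrics table; latencies in milliseconds.
PROFILES = {
    "Pi Zero W": {"fix_hz": 1, "sign_ms": 8.0, "kem_ms": 6.3, "e2e_ms": 20.5},
    "Compute Module 4": {"fix_hz": 5, "sign_ms": 2.1, "kem_ms": 1.2, "e2e_ms": 8.4},
    "TRK1 / Orin NX": {"fix_hz": 10, "sign_ms": 0.5, "kem_ms": 0.3, "e2e_ms": 3.2},
}


def crypto_share(profile: dict) -> float:
    """Fraction of the end-to-end delay spent signing and encapsulating."""
    return (profile["sign_ms"] + profile["kem_ms"]) / profile["e2e_ms"]


def period_share(profile: dict) -> float:
    """Fraction of one GNSS fix period consumed by the end-to-end delay."""
    period_ms = 1000 / profile["fix_hz"]
    return profile["e2e_ms"] / period_ms


for name, profile in PROFILES.items():
    print(f"{name}: crypto {crypto_share(profile):.0%} of delay, "
          f"delay {period_share(profile):.1%} of fix period")
```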

Deployment Profiles

📦 Pods & Container Deployment

In Ptah, every core function—post-quantum signing/encryption, telemetry acquisition, and monitoring—is packaged as a self-contained Docker image and deployed as a pod under K3s. This approach yields:

Example Pod Spec

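apiVersion: v1
kind: Pod
metadata:
  name: pqc-signer
  labels:
    app: pqc
spec:
  initContainers:
  # Block startup until the GNSS receiver appears as a character device.
  # The host's /dev directory is mounted read-only instead of the device
  # file itself: a hostPath to a missing file may be created as an empty
  # directory, which would make a plain existence test pass too early.
  - name: wait-for-gps
    image: busybox
    command: ["sh", "-c", "until test -c /host-dev/ttyUSB0; do sleep 1; done"]
    volumeMounts:
      - mountPath: /host-dev
        name: host-dev
        readOnly: true
  containers:
  - name: signer
    image: rscl/pqc-signer:latest
    # Assumption: opening a host device node mounted via hostPath generally
    # requires a privileged container (a device plugin is the alternative).
    securityContext:
      privileged: true
    resources:
      requests:
        cpu: "0.5"
        memory: "256Mi"
      limits:
        cpu: "1"
        memory: "512Mi"
    volumeMounts:
      - mountPath: /dev/ttyUSB0
        name: gps-device
    # Restart the container if the signer process disappears
    # (pgrep must be available in the image).
    livenessProbe:
      exec:
        command: ["pgrep", "signer"]
      initialDelaySeconds: 10
      periodSeconds: 30
  volumes:
    - name: gps-device
      hostPath:
        path: /dev/ttyUSB0
    - name: host-dev
      hostPath:
        path: /dev
  # Pin the pod to one specific node by hostname.
  nodeSelector:
    kubernetes.io/hostname: cm4-node-01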

This spec pins the signer pod to the node cm4-node-01, waits for its GPS device, requests half a CPU core (with a limit of one core), and restarts the container if the signer process dies, demonstrating K3s pod orchestration in Ptah’s heterogeneous cluster.

📈 Performance Monitoring

To maintain operational excellence across a heterogeneous Ptah cluster, we employ a best-in-class monitoring stack:

  1. Metrics Collection (Prometheus):
    • Node Exporter on each Linux node (CM4, TRK1, Jetsons, Pi Zeros) exposes CPU, memory, filesystem, and temperature metrics for Prometheus to scrape.
    • cAdvisor or kubelet metrics expose container-level stats: CPU throttling, memory usage, network I/O.
    • Custom PQC Exporter in each crypto pod emits counters (total signatures, total KEM operations, from which per-second rates are derived) and histograms (latency distribution).
  2. Storage & Retention:
    • Prometheus TSDB stores high-resolution (1 s scrape) data for 24 h. Prometheus does not down-sample on its own, so the 1 min resolution series kept for 30 days come from recording rules or a down-sampling layer such as Thanos.
    • Remote write to long-term storage (e.g., Thanos or Cortex) for 1 year of historical analysis.
  3. Visualization (Grafana):
    • Dashboards for each hardware class:
      - CPU & Memory Utilization vs. Crypto Throughput (ops/sec)
      - Network Bandwidth & Packet Loss for telemetry streams
      - GPU Utilization and Temperature on Jetson modules
    • Alert rules:
      - CPU >90 % for >1 min triggers High-Load alert
      - Signature latency >5 ms on CM4 triggers Performance-degradation alert
      - Missing telemetry heartbeat (>3 scrapes) triggers Pod-restart action
  4. Sample PromQL Queries:
    # CPU utilization (busy fraction, 0 to 1) on CM4 nodes
    1 - avg by (instance) (rate(node_cpu_seconds_total{instance=~"cm4-.*",mode="idle"}[1m]))

    # PQC signatures per second
    rate(pqc_signatures_total[30s])

    # 95th-percentile telemetry packet latency across all series
    histogram_quantile(0.95, sum by (le) (rate(telemetry_latency_seconds_bucket[5m])))
  5. Scalability & Federation:
    • Shard scraping across multiple Prometheus replicas for large swarms (>100 nodes).
    • Use Prometheus Federation to centralize critical metrics (e.g., overall cluster health) while preserving local dashboards.
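
The custom PQC exporter from step 1 could look roughly like the sketch below, which tracks a signature counter and a latency histogram and renders them in the Prometheus text exposition format. It uses only the standard library to stay self-contained; a real deployment would more likely use the `prometheus_client` package and serve the output over HTTP. The metric name `pqc_sign_latency_seconds`, the class name, and the bucket boundaries are assumptions; `pqc_signatures_total` matches the sample query above.

```python
# Histogram bucket upper bounds in seconds (hypothetical choice).
BUCKETS = (0.0005, 0.001, 0.002, 0.005, 0.01)


class SignerMetrics:
    """In-memory counter and histogram for signing operations."""

    def __init__(self, buckets=BUCKETS):
        self.buckets = tuple(buckets)
        self.bucket_counts = [0] * len(self.buckets)  # per-bucket, not cumulative
        self.count = 0
        self.total_seconds = 0.0

    def observe(self, seconds: float) -> None:
        """Record one completed signature and its latency."""
        self.count += 1
        self.total_seconds += seconds
        for i, upper in enumerate(self.buckets):
            if seconds <= upper:
                self.bucket_counts[i] += 1
                break  # slower observations are counted only in +Inf

    def render(self) -> str:
        """Return the metrics in Prometheus text exposition format."""
        lines = [
            "# HELP pqc_signatures_total Signatures produced since start.",
            "# TYPE pqc_signatures_total counter",
            f"pqc_signatures_total {self.count}",
            "# HELP pqc_sign_latency_seconds Signing latency in seconds.",
            "# TYPE pqc_sign_latency_seconds histogram",
        ]
        cumulative = 0
        for upper, n in zip(self.buckets, self.bucket_counts):
            cumulative += n  # Prometheus buckets are cumulative
            lines.append(f'pqc_sign_latency_seconds_bucket{{le="{upper}"}} {cumulative}')
        lines.append(f'pqc_sign_latency_seconds_bucket{{le="+Inf"}} {self.count}')
        lines.append(f"pqc_sign_latency_seconds_sum {self.total_seconds}")
        lines.append(f"pqc_sign_latency_seconds_count {self.count}")
        return "\n".join(lines) + "\n"


metrics = SignerMetrics()
for latency in (0.0021, 0.0004, 0.02):  # made-up sample latencies
    metrics.observe(latency)
print(metrics.render())
```

Exposing a histogram rather than a precomputed average lets `histogram_quantile` compute percentiles on the Prometheus side.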

This comprehensive monitoring framework not only provides real-time visibility into resource usage and cryptographic performance but also enables automated alerting and long-term trend analysis—ensuring that Ptah deployments remain robust, performant, and mission-ready.

🎥 Video Resources

For deeper insights and demonstrations, explore our curated video playlist. Each link includes an overview of Ptah concepts, hands-on labs, and system walkthroughs:

15-Week Course Flow

Final Quiz

When you’re ready, dive into the comprehensive 40-question quiz covering every module. You’ll get instant feedback on each answer—all on one page.

Take the Final Quiz

🤝 Acknowledgments

NASA MINDS
U.S. Navy
AMD/Xilinx
NVIDIA
AFRL
RSCL

Special thanks to all our partners for hardware, funding, and expertise that made this course possible.