Senior Platform Engineer

Cosine


Date Posted: 22 November, 2025
Salary Offered: $75,000 - $100,000 yearly
Job Type: Full Time
Experience Required: 3+ years
Remote Work: Not Allowed
Stock Options: No
Vacancies: 1 available


About the Role

We’re looking for a Senior Platform / Infra Engineer to own the core infrastructure that powers Cosine’s products — from Kubernetes and deployment pipelines to networking and platform services.

You’ll design and run the “paved road” that our engineers, researchers, and customers build on: reliable Kubernetes clusters, fast and safe CI/CD, solid observability, and hardened environments for demanding enterprise and on-prem deployments. You’ll also wear a classic “DevOps/SRE” hat: thinking in SLOs, running incident response, and keeping us up even as we move quickly.

This is a high-ownership role at a fast-paced, venture-backed Silicon Valley startup. You’ll work directly with founding engineers and leadership, and your decisions will materially shape how we build and ship products.


What You’ll Do

  • Own core infrastructure
    • Design, operate, and evolve our Kubernetes-based platform (EKS or similar), including cluster topology, node groups, autoscaling, and multi-environment isolation.
    • Manage supporting cloud resources: container registries, load balancers, queues, caches, and data infra needed to run our APIs and agents.
  • Build the deployment & tooling layer
    • Design and maintain CI/CD pipelines for image builds and infra rollouts (e.g. Pulumi/Terraform + Helm/Docker).
    • Implement safe rollout strategies (blue/green, canary, staged rollouts) and fast rollback paths.
    • Build internal tools and abstractions that make it easy for product teams to self-serve infra safely.
  • Own reliability & operations (SRE-ish)
    • Define and track SLOs/SLIs for key services (latency, error rates, availability).
    • Improve our observability stack (metrics, logs, traces, alerts) so issues are obvious, actionable, and debuggable.
    • Participate in the on-call rotation, lead incident response when needed, and drive blameless post-mortems and fixes.
  • Shape networking & security
    • Design and maintain networking: VPCs, subnets, ingress/egress, service meshes / L7 routing, DNS, and TLS.
    • Implement least-privilege access via IAM, secure secret management, and hardened configurations for multi-tenant and isolated customer environments.
    • Help design patterns for secure enterprise and on-prem / regulated deployments.
  • Partner with product & research
    • Work closely with application, ML, and research teams to understand their needs and translate them into reusable infra building blocks.
    • Provide guidance on “how to run this in production” — capacity planning, failure modes, and operational readiness reviews.

You Might Be a Great Fit If You

  • Have strong experience
    • 5+ years building and operating production infrastructure on a major cloud (AWS, GCP, or Azure).
    • Significant hands-on experience running Kubernetes in production (EKS/GKE/AKS or self-managed):
      • Cluster upgrades, autoscaling, node group design, and multi-env setups.
      • Helm or similar for packaging services.
  • Think in infrastructure-as-code
    • Deep experience with IaC tools (Pulumi, Terraform, CDK, or similar).
    • Comfortable managing infra changes via code review, CI, and automated rollouts.
  • Care deeply about reliability
    • Have owned the uptime and performance of user-facing systems.
    • Are comfortable participating in (and improving) on-call rotations and incident management.
    • Have experience setting up and tuning observability (Prometheus, Grafana, CloudWatch, OpenTelemetry, etc.).
  • Build great tooling & abstractions
    • You’ve built internal tools, libraries, or platforms on top of cloud providers so product teams can move faster with fewer foot-guns.
    • You think about developer experience and “golden paths,” not just raw infra.
  • Are comfortable in code
    • Strong scripting and programming skills in at least one modern language (e.g. TypeScript, Go, Python).
    • Happy to dive into app code when needed to debug a production issue or improve an integration.
  • Have the startup mindset
    • Enjoy working in a fast-moving environment with evolving priorities and incomplete specs.
    • Bias toward pragmatic solutions: ship something small, measure, iterate.
    • Communicate clearly, give/receive direct feedback, and collaborate across functions.

Nice to Have (Not Required)

  • Experience with:
    • AWS primitives like EKS, ECS/Fargate, ECR, SQS, ElastiCache/Redis.
    • Argo CD or other GitOps tools for Kubernetes.
    • On-prem, air-gapped, or regulated industry deployments (e.g. finance, healthcare).
    • AI/ML infrastructure (GPU workloads, model hosting, feature stores).
  • Prior experience as an early infra / platform hire at a startup.

About Cosine

Fully Agentic SWE

Company Size: 11 - 50 People
Year Founded: 2022
Country: United States
