Job Description:
• Define and evolve the end-to-end architecture for Unstructured’s data transformation and retrieval platform.
• Build and scale distributed systems that process massive volumes of unstructured data across diverse formats and sources.
• Serve as the company-wide authority on Kubernetes orchestration, cluster design, performance tuning, and reliability.
• Lead Python architecture and best practices—ensuring performance, modularity, and maintainability across services.
• Design and optimize Postgres schemas, queries, and indexing strategies to support large-scale metadata and retrieval pipelines.
• Mentor senior engineers through design reviews and code guidance, raising the bar for technical excellence across the org.
• Partner with the infrastructure and product teams to translate research prototypes into production-grade systems.
• Evaluate emerging technologies and open-source tools in LLM infrastructure, retrieval, and orchestration—deciding where and how to integrate them.
Requirements (you):
• Have 15+ years of software engineering experience with a focus on distributed systems, infrastructure, or data architecture.
• Are a Python expert—capable of building frameworks and performance-critical services from scratch.
• Have deep Kubernetes expertise; you can design, deploy, and debug clusters at scale and can teach others how to run them securely in production.
• Are fluent in Postgres—you understand query planning, partitioning, and tuning for high-throughput environments.
• Are obsessed with clean, scalable architecture and can lead design reviews that shape how entire systems evolve.
• Have experience in high-performance data or AI/ML systems—especially those involving retrieval pipelines, embeddings, or hybrid workloads.
• Thrive in fast-moving, ambiguous environments where technical depth and judgment matter more than process.
Benefits:
• Competitive salary + equity + full benefits package