Member of Technical Staff (AI Supportability)
Cockroach Labs
Category-defining tech. Career-defining work.
Lots of tech companies disrupt. But many fail when they try to scale. We're different. CockroachDB makes it easier for companies to build and scale apps. This is how and why we're helping some of the most innovative companies on the planet. We tackle problems head-on and focus on solutions that create lasting impact.
Because when our customers win, we all win.
The Role
Can an LLM accurately diagnose complex database issues from telemetry data? We're building a team to find out.
Supporting CockroachDB is inherently challenging. Our product exposes thousands of unique metrics, produces complex structured data, and emits detailed logs at high frequency, making support tickets expensive and time-consuming to resolve. We're launching an ambitious R&D initiative to explore how AI can reduce these support costs and improve diagnostic accuracy.
We're looking for engineers who thrive in exploratory, prototype-driven environments. You'll start with a concrete mission: validate whether LLMs can perform root cause analysis on real Zendesk escalations using CockroachDB telemetry data. You'll experiment with OpenAI/Anthropic APIs, prompt engineering techniques, and various data contexts to determine if AI can assist with diagnostic workflows and reduce the time technical support engineers spend on complex root cause analysis.
This isn't about executing a predetermined roadmap. It's about rapid experimentation, learning from real data, and building the foundation for AI-powered support workflows. If you're energized by building from scratch, working directly with end users, and technical challenges where the outcome isn't guaranteed, this role is for you.
You Will
- Start with focused validation: Design experiments to test whether LLMs can assist with root cause analysis on real Zendesk escalations, determining if AI can accelerate diagnostic workflows and reduce time-to-resolution
- Work embedded with our Bangalore technical support team: Partner with support engineers through regular collaboration sessions to understand diagnostic workflows, review experimental results, and gather insights
- Work with real CockroachDB data: Process diagnostic telemetry including cluster metrics, SQL traces, and system logs, plus observability data from Datadog
- Iterate on AI approaches: Experiment with OpenAI/Anthropic APIs, prompt engineering techniques, and different data contexts to optimize LLM diagnostic performance
- Build rapid prototypes: Transform promising validation results into proof-of-concept tools and agents, defining next steps based on what the data tells us
- Measure and document progress: Establish clear success metrics for LLM accuracy and track experimental results throughout the validation process
- Scale successful experiments: Use Go, Python, and cloud infrastructure to build scalable applications from validated prototypes
- Explore advanced AI patterns: As validation results warrant, investigate sophisticated approaches like agentic workflows, MCP integrations, and multi-step reasoning chains for complex diagnostic scenarios
The Expectations
In the first month, you will join the AI Supportability team and immerse yourself in CockroachDB's observability landscape and support workflows. You will embed with technical support engineers to understand diagnostic challenges firsthand while beginning LLM experimentation for supportability use cases.
After 3 months, you'll have validated core assumptions about LLM-powered diagnostics and be actively prototyping the next generation of AI support tools. You will have established strong collaborative relationships with technical support teams and identified the most promising areas for AI automation based on real experimental data.
As this initiative grows, you will have the opportunity to define the future of AI-driven customer support at Cockroach Labs. Your applied AI work will establish whether and how AI can scale our support organization, potentially influencing product development for improved customer self-service. This initiative will determine our long-term AI supportability strategy.
You Have
- Experimental mindset: You're energized by ambiguous problems, rapid prototyping, and discovering solutions through data-driven experimentation
- Applied AI/ML work: Experience with LLMs, prompt engineering, and model evaluation through professional work, significant side projects, or deep technical exploration
- Strong programming skills: Proficiency in Go, Python, or similar languages, with the ability to quickly learn new technologies and APIs
- Data and systems experience: Comfort working with large-scale structured data, APIs, observability tools, logs, metrics, and distributed systems concepts
- Greenfield experience: Background building new products, prototypes, or research projects where the solution path wasn't predefined
- Embedded collaboration experience: Experience working directly with internal stakeholders, gathering requirements, and iterating based on user feedback
- Technical communication: Ability to explain complex experimental results and technical concepts to diverse audiences
- Ideally 5+ years of software engineering experience with hands-on AI/ML work, but we prioritize your ability to learn, experiment, and deliver measurable results
- BE/B-Tech/M-Tech in Computer Science or equivalent experience
The Team
Abhishek Munnolimath - Senior Engineering Manager
You will report directly to Abhishek, who will be leading the execution of the AI Supportability roadmap, managing the team's technical direction, and ensuring focus on delivering measurable improvements to support efficiency.
Namrata Kodali - Director of Engineering
Namrata leads the AI Supportability initiative, defining the strategic roadmap and partnering with Abhishek to drive technical execution of this effort. Namrata joined Cockroach Labs in 2020, and leads the Observability teams. Previously, she spent several years at Yext in New York City, leading teams working on everything from their API and platform to billing and subscriptions. She was initially drawn to Cockroach Labs' culture and people, and has developed a passion for working in Observability, helping make CockroachDB more understandable. Outside of work, she enjoys spending time outdoors, climbing, biking, and hiking.
BabuSrithar - Site Lead, India
BabuSrithar is the Site Leader for India. He is responsible for our growth strategy and is a cultural champion in the region. He is passionate about building high-quality software products and lean teams by leveraging everyone's potential. He enjoys working with people and learning along the way. Before joining Cockroach Labs, BabuSrithar held senior leadership positions at companies like Nutanix and Clumio, and most recently he was VP of Engineering at Apty, where he led engineering globally. When not at work, he enjoys his time with his 3-year-old and family.
Cockroach Labs is proud to be an Equal Opportunity Employer building a diverse and inclusive workforce. If you need additional accommodations to feel comfortable during your interview process, please email us at accessibility@cockroachlabs.com.
Cockroach Labs has a hybrid work model, with Roachers who are local to one of our offices coming in on Mondays, Tuesdays, and Thursdays and working flexibly the rest of the week. While we’ve learned valuable lessons working remotely, nothing can replace the connection, creativity, and fun that occurs when Roachers get together, and we are committed to fostering a workplace that encourages collaboration and allows us all to do our best work.
Benefits
- Medical Insurance
- Flexible Time Off
- Paid Holidays
- Paid Parental Leave
- Mental Wellbeing Benefits
- And more!