Systems Engineer
Augment
Software Engineering
Palo Alto, CA, USA
Posted 6+ months ago
Job Overview:
As a Systems Engineer at Augment, you'll play a crucial role in bridging the gap between complex AI research and real-world SaaS applications. You'll design, implement, and maintain our infrastructure, ensuring high availability and optimal performance for AI-driven services that serve a global user base.
Key Responsibilities:
- Product Development: Help design, develop, and maintain product features and services.
- Infrastructure Development: Set up and maintain scalable and highly available systems that underpin our AI-driven SaaS platform.
- Performance Optimization: Monitor the health and performance of systems. Ensure high availability and optimal performance of resources crucial to our AI services.
- Security/Privacy Measures: Implement best practices to maintain the integrity, security, and privacy of our data and infrastructure.
- Collaboration: Work hand-in-hand with our AI researchers to understand their needs and optimize systems to handle intensive AI computations and deployments.
- Troubleshooting: Address and resolve technical issues, ensuring minimal disruption to our AI services. Participate in the on-call rotation.
- Automation: Streamline operations by identifying processes ripe for automation and deploying suitable solutions.
- Continuous Learning: Stay abreast of the evolving landscape of AI infrastructure needs and tools. Integrate best practices into our operations to enhance our services.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field. A Master's degree is an advantage.
- 5+ years of experience as a Systems Engineer; exposure to AI-driven environments or SaaS platforms is a significant plus.
- Expertise in software development in relevant languages (any of Python, Rust, C++, etc.).
- Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
- Proven problem-solving skills, especially in troubleshooting complex issues in distributed systems.
- Strong written and verbal communication abilities.
Desired but optional:
- Proficiency in cloud computing platforms such as AWS, Azure, or Google Cloud.
- Familiarity with containerization technologies like Docker and orchestration platforms like Kubernetes.