We are seeking a highly skilled and motivated System Ops Engineer with a strong background in computer science or statistics and at least 5 years of professional experience. The ideal candidate will possess deep expertise in cloud computing (AWS), data engineering, Big Data applications, AI/ML, and SysOps, along with a strong technical foundation, a proactive mindset, and the ability to thrive in a fast-paced environment.
Key Responsibilities:
Work hands-on with AWS services including EC2, VPC, Lambda, DynamoDB, API Gateway, EBS, S3, and IAM.
Design, implement, and maintain scalable, secure, and efficient cloud-based solutions.
Implement and tune optimized configurations for cloud infrastructure and services.
Design, build, test, and maintain data architectures such as databases and large-scale processing systems.
Write efficient Spark and Python code for data processing and manipulation.
Administer and manage multiple ETL applications, ensuring seamless data flow.
Lead end-to-end Big Data projects, from design to deployment.
Monitor and optimize Big Data systems for performance, reliability, and scalability.
Develop and deploy AI/ML models, with a focus on Natural Language Processing (NLP), Computer Vision (CV), and Generative AI (GenAI).
Collaborate with data scientists to productionize ML models and support ongoing model performance tuning.
Utilize DevOps tools for continuous integration and deployment (CI/CD).
Design and maintain Infrastructure as a Service (IaaS) environments, ensuring scalability, fault tolerance, and automation.
Manage servers and network infrastructure to ensure system availability and security.
Configure and maintain virtual machines and cloud-based system environments.
Monitor system logs, alerts, and performance metrics.
Install and update software packages and apply security patches.
Troubleshoot network connectivity issues and resolve infrastructure problems.
Implement and enforce security policies, protocols, and procedures.
Conduct regular data backups and disaster recovery tests.
Optimize systems for speed, efficiency, and reliability.
Collaborate with IT and development teams to support the integration of new systems and applications.
Qualifications:
Bachelor's degree in Computer Science, Statistics, or a related field.
5+ years of experience in cloud computing, data engineering, and related technologies.
In-depth knowledge of AWS services and cloud architecture.
Strong programming experience with Spark and Python.
Proven track record in Big Data applications and pipelines.
Applied experience building AI/ML models, particularly in the NLP, CV, and GenAI domains.
Skilled in managing and administering ETL tools and workflows.
Experience with DevOps pipelines, CI/CD tools, and cloud automation.
Demonstrated experience with SysOps or cloud infrastructure/system operations.