Date published: 7-Jul-2016
Our mission. As the world’s number 1 job site, our mission is to help people get jobs. We need talented, passionate people working together to make this happen. We are looking to grow our teams with people who share our energy and enthusiasm for creating the best experience for job seekers.
Site Reliability Engineering (SRE) applies software engineering techniques and discipline to production operations to attack major problems and fix them for good. SRE adds nines to our already well-engineered and highly reliable software products supporting job seekers, employers, and internal customers. Every month, over 200 million people count on us to help them find jobs, publish their resumes, process their job applications, and ultimately help them get hired at their next job.
SRE is always on call to keep our products available and fast. The team spans Austin, Dublin, Hyderabad, Seattle, San Francisco, and Tokyo to maintain a follow-the-sun on-call rotation. SRE is new at Indeed, and members of this team will have the chance to influence the direction of a critical, global SRE organization. There will be ample opportunities for growth in many areas: technology skills, leadership, mentorship, design, and more.
Participate in the entire software lifecycle including design, delivery, measurement, and learning.
Design, write, ship, and motivate the creation of software and systems to increase product reliability and organizational efficiency.
Support the software lifecycle through activities like reviewing designs, creating platforms and frameworks, capacity planning, and chaos testing.
Maintain service health through monitoring and follow-the-sun incident response.
Improve service reliability through root cause analysis, blameless postmortems, and using code to prevent or respond to problem recurrence.
- 4+ years of Linux experience as an administrator or power user
- 2+ years of scripting experience in Shell, Perl, or Python
- 2+ years of experience with configuration management in Puppet, Chef, Ansible, or Salt
- 2+ years of experience troubleshooting live TCP/IP, routing, and HTTP issues in production
- Hands-on operational experience managing JVM services
- Demonstrated ability to operate as a high-performing member of a small, cross-functional team
- Experience teaching more junior engineers
- Experience interviewing System Administrators or Site Reliability Engineers
- Able to troubleshoot TCP/IP, switching, routing & HTTP issues
- BS degree in Computer Science or related technical field, or equivalent practical experience
- Able to solve Linux troubleshooting challenges. You understand system limits and how to check if we’re hitting them.
- Able to write code. You use automation to make your job more efficient.
- Comfortable with managing multiple projects simultaneously.
- Meticulous and cautious. You consider edge cases and risk mitigation strategies for every change.
- Able to learn from mistakes
- Solid communicator with great customer service skills
- Extremely curious about how things work
Indeed is proud to be an equal opportunity employer, seeking to create a welcoming and diverse environment.
All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.