Design, develop and monitor system operations for digital wellness platforms and
applications for major client companies in the fitness and wellness industry.
Specific responsibilities include:
- Contribute to the design, development and procurement of cloud-based infrastructure
- Identify and direct solutions, anticipating challenges involving software and hardware interfaces
- Ensure the smooth operation of all systems, monitoring their performance and continually automating cleanup and improvement while ensuring all uptime thresholds are met.
- Prevent and, when necessary, solve problems stemming from inefficient CPU usage, disk space demand and/or queuing issues.
- Document key architectural developments, perform dependency upgrades, and audit current metrics and alarms to ensure they are sufficient to meet or surpass all operational goals and thresholds.
- Test key functions, replicate problems reported in trouble tickets and confirm that those problems can no longer be replicated once fixes are applied.
- Create reports and search logs as needed.
- Maintain good working relationships with fellow team members and client representatives.
QUALIFICATIONS AND REQUIREMENTS:
- At least 5 years' experience and familiarity with Node.js, Google Cloud Platform products and services, Kubernetes and SQL
- Solid understanding of message queuing, stream processing, and highly scalable “big data” stores
- Experience performing root cause analysis and testing
- Experience with system monitoring software tools such as Prometheus, Grafana, and Datadog
- Extensive experience documenting and testing system operations and architecture.
- Strong project management and organizational skills
- Looking for a “neat and tidy” person who likes to have things running smoothly.