RQ10203 - Systems Administrator/Operations Support Specialist - Intermediate
Maarut Inc, East York, Canada
Job Description
Deliverables:
- 24/7/365 Monitoring and Support
- Monitoring and support of systems and subsystems to ensure availability as per defined SLAs, by taking prompt and necessary actions or escalating to technical support as needed. These services include critical, essential, time-sensitive, public-facing applications, websites, systems, and subsystems hosted in OPS Guelph data centres.
- Incident Management and Escalation
- Rapid identification, analysis, and resolution of production issues. Timely escalation to technical support teams, clients, and management to minimize service disruptions.
- Operational Coverage for Mission-Critical Systems
- Maintaining availability and performance of essential public-facing applications and systems hosted in the Guelph Data Centre.
- Analyzing, investigating, and resolving production batch processing failures.
- Providing operational system updates, upgrades, and patches, and addressing issues encountered during these tasks.
- Implementing batch and online system change requests.
- Ensuring service level commitments to ITS clients, stakeholders, and broader public sector agencies are met.
- Participating in the development and delivery of related training, communications, and procedural documents.
- Planning and participating in semi-annual disaster/contingency recovery exercises, including running and testing recovery procedures and plans for all application systems.
- Coordinating the collection of data for statistical analysis of production performance results, used to assess workload.
Key Responsibilities:
- End-to-end monitoring of OPS services and underlying infrastructure such as mainframe, UNIX, Windows servers, storage, and network devices 24/7/365 to ensure availability as per defined SLAs, by taking prompt, necessary actions or escalating to Tier 2/3 or vendors as needed. These services include critical, essential, time-sensitive, public-facing applications, websites, systems, and subsystems hosted in the Guelph Data Centre (GDC).
- Responding to high-priority requests and incidents. Participating in system recovery and service restoration efforts and meetings.
- Providing operational system updates, upgrades, and patches; coordinating activities to address issues encountered during these tasks.
- Initiating established recovery and/or escalation procedures.
- Implementing change requests for mainframe, midrange, and network platforms.
- Ensuring service level commitments to ITS clients, stakeholders, and broader public sector agencies are met.
- Participating in the development and delivery of related training, communications, and procedural documentation.
- Participating in departmental initiatives such as Shift Left and Agile Monitoring.
- Participating in the yearly SysTrust audit.
- Participating in the semi-annual disaster recovery exercises.
Requirements
Experience and Skill Set Requirements:
Must Have:
- 5-8+ years of demonstrated experience supporting large z/OS mainframe systems.
- Knowledge of service management tools such as eSMT, CIT, and Remedy, and of the ITIL framework.
- Knowledge of the OS365 suite of products and Power BI.
Nice to Have:
- ITIL certifications.
Skill Set Requirements:
z/OS mainframe systems:
- 5-8+ years of experience supporting large z/OS mainframe systems.
Knowledge of server OS hardware components and end-to-end system management:
- Demonstrated knowledge of Unix and Wintel servers and network peripherals.
Knowledge and Experience:
- Demonstrated knowledge of incident and change management activities.
- Demonstrated strong documentation and writing skills.
- Demonstrated presentation, verbal, and written communication skills.
Knowledge of Service Management Tools such as eSMT, CIT, Remedy and ITIL:
- Demonstrated knowledge of service management tools such as eSMT, CIT, and Remedy, and of the ITIL framework.
Knowledge of infrastructure monitoring and performance tools:
- Demonstrated knowledge of system and application monitoring tools.
Knowledge of OS365 Suite of products and Power BI:
- Demonstrated knowledge of communication, collaboration, and analytics tools.