The ICT Service Senior Associate – Observability & Monitoring ensures effective alert monitoring, event triage, and coordinated incident response across the Bank’s ICT environment. The role supports service visibility, early detection of issues, and structured escalation in line with IT Service Management (ITSM) processes. It provides day-to-day functional input to iNOC operations and works under the direction of the Senior Manager, ICT Service and Change.
Key Roles and Responsibilities
Observability & Monitoring Oversight
- Provide day-to-day functional input on iNOC monitoring activities.
- Ensure alerts, dashboards, and monitoring tools provide accurate, real-time visibility of service health.
- Validate alert thresholds and escalation paths to minimize false positives and missed events.
- Ensure new systems, applications, and infrastructure components are onboarded into monitoring tools prior to go-live.
Event & Incident Coordination
- Participate in event triage, prioritization, and escalation activities in line with Incident Management procedures.
- Work with iNOC analysts during high-priority or major incidents to ensure coordinated response and timely restoration.
- Ensure incidents generated from monitoring tools are properly logged, categorized, and updated in the ITSM system.
- Review monitoring data and incident timelines to support Root Cause Analysis (RCA) processes.
- Escalate systemic service risks or recurring issues to Management.
Monitoring Configuration & Service Instrumentation
- Support configuration and optimization of monitoring tools across infrastructure and applications.
- Work with technology teams to ensure services are properly instrumented and observable.
- Validate that monitoring requirements are included as part of Change Enablement processes.
Automation & Operational Efficiency
- Identify repetitive operational activities within monitoring processes and recommend automation opportunities.
- Support development and testing of automation scripts to improve alert handling and reporting efficiency.
- Maintain documentation of monitoring configurations and automation routines.
Reporting, Risk & Service Assurance
- Prepare service health dashboards and monitoring performance reports.
- Analyze alert trends, incident patterns, and monitoring gaps to support Problem Management and continual improvement.
- Ensure monitoring records and documentation are audit-ready and aligned with regulatory expectations.
- Highlight emerging service risks and performance concerns to ICT Service Management.
Technical Competencies
- Working knowledge of IT Service Management practices, particularly Monitoring, Event Management, and Incident Management.
- Hands-on experience with enterprise monitoring or observability tools.
- Basic to intermediate understanding of infrastructure, network, and application monitoring concepts.
- Basic to intermediate scripting capability (e.g., PowerShell, Python, Bash) is an advantage.
- Ability to analyze monitoring data and generate structured operational reports.
Requirements
- Bachelor’s degree in Information Technology, Computer Science, Engineering, or a related field.
- ITIL 4 Foundation certification is preferred.
- Relevant technical certifications in monitoring or infrastructure are an advantage.
- 2–5 years’ experience in IT operations, NOC/iNOC operations, service assurance, or monitoring roles.
- Experience in a structured or regulated environment (e.g., banking, financial services, telecommunications) is an advantage.
- Prior experience coordinating or guiding operational teams is preferred.
Skills and Attributes
- Strong analytical and troubleshooting capability.
- Ability to coordinate operational teams during high-pressure incidents.
- Structured and process-driven approach to service management.
- Good communication skills with both technical and management audiences.
- High attention to detail and operational discipline.
- Proactive mindset with a focus on service reliability and continual improvement.