Job Information
Ensono Senior Applications Support Specialist in United Kingdom
Senior Applications Support Specialist
Remote - United Kingdom
JR013381
Key Responsibilities
Incident & Problem Management
Lead major incident (MI) bridges and restore service with minimum business impact.
Handle all L3 escalations; perform deep diagnostics across Java, JVM, middleware, OS, and infrastructure.
Own technical RCAs; drive long‑term and systemic remediation.
Identify recurring failure patterns and risks.
Reliability Engineering
Apply SRE principles: SLIs/SLOs, error budgets, resilience patterns.
Tune JVM parameters, analyze thread/heap dumps, and improve performance.
Influence application architecture for fault tolerance, scalability, and recoverability.
Validate DR readiness, failover behavior, and resilience testing outcomes.
Change, Release & Risk
Provide technical approval and risk assessment for high-risk changes.
Enforce operational readiness for new apps and major releases.
Ensure changes meet audit, compliance, and regulatory expectations.
Automation, Monitoring & Observability
Build advanced automation using Shell/Python/PowerShell.
Develop frameworks for health validation, automated recovery, and compliance checks.
Define observability standards; optimize alerts and reduce MTTR.
Leadership & Mentorship
Mentor L1/L2 teams; review and approve runbooks, SOPs, and KB articles.
Act as a trusted technical advisor to stakeholders and leadership.
Skills & Qualifications
Technical (Mandatory)
Strong knowledge of application architecture, distributed systems, and middleware.
Java expertise: JVM internals, GC, memory management, thread/heap dump analysis, performance tuning.
.NET expertise: CLR internals, garbage collection, memory management, thread/dump analysis, and application performance tuning.
Strong Unix/Linux skills, networking basics, and advanced scripting (Shell/Python/PowerShell/VBS).
Advanced SQL and understanding of databases; Autosys (or equivalent scheduler).
Hands-on experience with observability tools: Splunk, AppDynamics/Dynatrace, ELK, Grafana, Prometheus.
Reliability & Operations
Major incident leadership, deep RCA, change/release readiness, DR & resilience engineering.
Experience in regulated production environments.
Soft Skills
Strong technical leadership and decision‑making.
Clear communication during high‑pressure incidents.
Ownership mindset and business awareness.
Experience & Education
7–12+ years in Application Reliability, Production Support, SRE, or platform operations.
Bachelor’s degree in Computer Science/Engineering or equivalent.
ITIL, cloud, or industry certifications (preferred).
Banking/financial domain experience (preferred).
Working Conditions
On‑call and after‑hours support as required.
Fast‑paced environment with multiple priorities.
Hybrid working model.