Job Information
FlightSafety International Inc Director, Software Engineering in Seattle, Washington
Company Overview
Docusign brings agreements to life. Over 1.5 million customers and more than a billion people in over 180 countries use Docusign solutions to accelerate the process of doing business and simplify people's lives. With intelligent agreement management, Docusign unleashes business-critical data that is trapped inside of documents. Until now, these were disconnected from business systems of record, costing businesses time, money, and opportunity. Using Docusign's Intelligent Agreement Management platform, companies can create, commit, and manage agreements with solutions created by the #1 company in e-signature and contract lifecycle management (CLM).
What you'll do
Docusign operates global, always-on services. To safeguard customer trust and empower our engineering teams, we are building the next generation of monitoring and troubleshooting capabilities. You will lead the engineering organization behind our AI-powered observability platform. Built to complement existing tools like Grafana, Prometheus, and ClickHouse/Azure Data Explorer, this platform addresses the challenges of information overload by making observability accessible to everyone. You will manage a multidisciplinary team of backend engineers, machine learning engineers, and data scientists to build predictive, automated insights while running a highly critical service with strict SLAs. This position is a people manager role reporting to the Senior Director, Software Engineering.
Responsibility
Manage high-throughput, real-time observability pipelines processing massive volumes of telemetry data across multi-region, multi-cloud environments
Operate a Tier-0 data plane with strict SLOs, disciplined change management, and high availability requirements
Serve as the observability backbone for every engineering team's reliability and velocity, dramatically reducing the time from "something's wrong" to "here's the problem"
Set a clear 12-24 month vision for AIOps capabilities, focusing on automated troubleshooting workflows, proactive anomaly detection, and smart alert aggregation
Bridge the gap between robust backend infrastructure and applied machine learning, ensuring models for auto-threshold estimation and pattern recognition are effectively trained and reliably deployed at scale
Drive availability, durability, incident readiness, and disaster recovery for the observability plane; run regular resilience drills
Lead, hire, and grow a senior-heavy team of backend engineers, ML engineers, and applied scientists. Build an architecture culture, clear career paths, and a high-judgment, high-ownership operating model
Own the annual operating plan for the AIOps platform capacity, availability, and budget, meeting or beating committed SLOs/SLAs
Drive the development of intelligent capabilities like automated impact analysis, ML-driven threshold estimation, and natural language interfaces to reduce alert noise and accelerate debugging
Continually improve ingestion latency, query performance, storage efficiency, and cost per unit while maintaining reliability through traffic spikes and deploys
Partner with SRE, Telemetry Platform, Security, Finance, and Product; make pragmatic build-vs-buy decisions; manage vendors and capacity commitments
Lead on-call and incident command for the observability platform
Job Designation
Hybrid: Employee divides their time between in-office and remote work. Access to an office location is required.
(Frequency: Minimum 2 days per week; may vary by team, but there will be a weekly in-office expectation.) Positions at Docusign are assigned a job designation of either In Office, Hybrid, or Remote and are specific to the role/job. Preferred job designations are not guaranteed when changing positions within Docusign. Docusign reserves the right to change a position's job designation depending on bus