Cloud Fleet Health Management Engineer

Hillsboro, OR 97123
Job Description

The Cloud Engineering team in the Datacenter Platform Engineering Architecture Group (DPEA) is looking for engineering talent to work with cloud service providers (CSPs) on challenges in server platform datacenter deployment, integration, fleet health management, and systems engineering infrastructure, including high-level systems, hardware, and software support.

In this position, your responsibilities will include but not be limited to:

  • Work with system failure data and partner with internal and external technical leadership teams to build infrastructure and define processes that enable at-scale data analysis and response.

  • Provide problem determination and resolution, working with the CSP to install and update newly developed software or firmware. Identify alternatives for optimizing the CSP's system resources.

  • Based on information from CSPs, analyze and evaluate existing or proposed systems. Work across functions and/or product teams, such as networking, to ensure connectivity and compatibility between systems for deployment.

  • Take prototype tools to at-scale deployment in CSP environments.

  • Possess extensive experience with operating systems, network protocols, systems programming and configurations, and/or hubs, switches, and routers.

  • Understand the CSP's capacity planning, disaster recovery, and security plans, where necessary.

  • Work closely with the CSP to implement best practices for securing applications and infrastructure in both public and private cloud systems.

Behavioral Traits:

  • Problem-solving skills.

  • Written and verbal communication skills for working directly with customers.

  • The ideal candidate will have a self-directed work ethic and a can-do attitude.

  • Ability to work in a team environment with fast-changing requirements and complicated stakeholders.


You must possess the minimum qualifications below to be initially considered for this position. Preferred qualifications are in addition to the minimum requirements and are considered a positive factor in identifying top candidates. Requirements listed may be obtained through a combination of industry-relevant job experience, internship experience, and/or schoolwork, classes, or research.

Minimum Requirements:

The candidate must have a Master's degree with 3+ years of applicable industry experience or a Bachelor's degree with 6+ years of applicable industry experience in Electrical Engineering, Computer Science, or a related field.

Minimum Qualifications:

  • Experience with datacenter product development - must have a fundamental understanding of software stacks and/or how software and hardware work together.

  • Intel Architecture design experience or working knowledge of CPU, memory, chipset, and/or platform.

  • Low-level debugging skills that enable root-causing issues across hardware, firmware, and/or operating system levels.

Preferred qualifications:

  • Experience with assembly code development in microcode and pcode development flows is a plus.

  • Knowledge of CPU flows and experience with silicon-level debug, for example AFD (Array Freeze and Dump) and its analysis, are preferred.

  • Operating System (OS), Driver, BIOS and/or firmware fundamentals.

  • Experience with Debug tools such as ITP (In-Target Probe) and C-script development.

  • Experience with Python SV and/or programming with Python.

  • Prior experience with cloud deployment strategies and cloud developer environments (AWS, Google Cloud Platform (GCP), Azure, containers), as well as any cloud certification.

  • Experience developing custom data models and algorithms to apply to available data sets, testing them, and then deploying them on big data.

  • For the BIOS domain - an x86 server BIOS development background and debug experience with Intel XDP would be a must. Experience with ACPI, PCIe, RAS (Reliability, Availability and Serviceability), security NVRAM, etc. is a plus.

  • For the OS domain - experience with Linux kernel debug would be a must. Experience debugging and fixing Linux system power and performance issues is a plus. Experience with Linux kernel upstream development is a plus.

  • For the Hardware and I/O domain - 3+ years of enabling experience with hardware I/O or devices, including but not limited to Ethernet, would be a must; server baseboard design experience is a plus.

  • For the Server Management domain - development experience with IPMI, Redfish, NC-SI, and Node Manager, along with an understanding of data center management philosophy, is preferred.

Inside this Business Group

The Data Center Group (DCG) is at the heart of Intel's transformation from a PC company to a company that runs the cloud and billions of smart, connected computing devices. The data center is the underpinning for every data-driven service, from artificial intelligence to 5G to high-performance computing, and DCG delivers the products and technologies (spanning software, processors, storage, I/O, and networking solutions) that fuel cloud, communications, enterprise, and government data centers around the world.

Other Locations

US, California, Folsom; US, California, Santa Clara; US, Texas, Austin

Posting Statement

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
