Software Engineer - Hadoop, Big Data Tools and Automation

Cupertino, CA 95014
  • Job Code
    200166631
Summary

Posted: May 21, 2020

Weekly Hours: 40

Role Number: 200166631

This position can be located in Santa Clara Valley (CA) or Austin (TX).

Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish.

Apple's Applied Machine Learning team has built systems for a number of large-scale data science applications. We work on many high-impact projects that serve various Apple lines of business. We use the latest in open source technology and, as committers on some of these projects, we are pushing the envelope. Working with multiple lines of business, we manage many streams of Apple-scale data. We bring it all together and extract the value. We do all this with an exceptional group of software engineers, data scientists, SRE/DevOps engineers, and managers.

Key Qualifications

  • Experience building tools for the administration and management of large Hadoop clusters.
  • Experience building tools that show resource consumption by users and jobs for capacity planning, showback, and chargeback.
  • Experience with Kafka and streaming technologies.
  • Experience building large data pipelines for ingressing/egressing data with minimum resource requirements.
  • Expert-level understanding of Hadoop-based technologies: HDFS/YARN cluster administration, Hive, Spark.
  • Expertise in Python and Java.
  • Expert understanding of Unix/Linux-based operating systems.
  • Excellent problem solving, critical thinking, and communication skills.
  • Experience deploying and managing CI/CD pipelines.
  • Expertise in configuration management (such as Ansible, Salt) for deploying, configuring, and managing servers and systems.
  • The candidate should be adept at prioritizing multiple issues in a high-pressure environment.
  • Should be able to understand complex architectures and be comfortable working with multiple teams.
  • Should be highly proactive with a keen focus on improving the uptime and availability of our mission-critical services with automation and tooling.
  • Comfortable working in a fast-paced environment while continuously evaluating emerging technologies.
  • The position requires solid knowledge of secure coding practices and experience with open source technologies.

Description

We manage several large Hadoop/YARN clusters running tens of thousands of jobs.

This role requires an expert-level understanding of Hadoop/Spark-based technologies so that you can build the right automation and tooling for the administration, capacity management, and showback/chargeback/resource visibility of the platform.

It also requires an understanding of the complete ecosystem of Kafka, Spark Streaming, other streaming technologies, and Airflow in order to build comprehensive end-to-end management and monitoring systems.

You are an independent, self-directed problem-solver who can deftly handle multiple competing priorities and deliver solutions in a timely manner.

Provide incident resolution for all technical production issues.

Create and maintain accurate, up-to-date documentation reflecting configuration. You are also responsible for writing justifications, training users in complex topics, writing status reports, documenting procedures, and interacting with other Apple staff and management.

Provide guidance to improve the stability, security, efficiency and scalability of systems.

Determine future needs for the platform and investigate new products and/or features.

Strong troubleshooting ability will be used daily; the candidate will take steps on their own to isolate issues and resolve root causes through investigative analysis, even in environments where they have little knowledge, experience, or documentation.

Education & Experience

BS in computer science with 7+ years of experience, MS with 4+ years of experience, or related experience.

Additional Requirements

  • Experience with Kubernetes, Docker Swarm, or another container orchestration framework.
  • Experience building and operating large-scale Hadoop/Spark data pipelines used for machine learning in a production environment.
  • Experience in tuning complex Hive and Spark queries.
  • Expertise in debugging Hadoop/Spark/Hive issues using NameNode, DataNode, NodeManager, and Spark executor logs.
  • Experience in workflow and data pipeline orchestration (Airflow, Oozie, Jenkins, etc.).
  • Experience in Jupyter-based notebook infrastructure.

