Big Data Engineer

Pittsburgh, PA, USA

Ranked #12 on Forbes’ List of 25 Fastest Growing Public Tech Companies for 2017, EPAM is committed to providing our global team of over 24,000 people with inspiring careers from day one. EPAMers lead with passion and honesty and think creatively. Our people are the source of our success: we value collaboration, work to understand our customers’ business, and strive for the highest standards of excellence. No matter where you are located, you’ll join a dedicated, diverse community that will help you discover your fullest potential.

DESCRIPTION

You are curious, persistent, logical and clever – a true techie at heart. You enjoy living by the code of your craft and developing elegant solutions for complex problems. If this sounds like you, this could be the perfect opportunity to join EPAM as a Big Data Engineer. Scroll down to learn more about the position’s responsibilities and requirements.

EPAM’s Financial Services Practice is looking for exceptionally talented people to join our team of world-class engineers. Our clients are some of the world’s largest and most innovative banks, investment banks and wealth management institutions.

We currently have 7 openings in Pittsburgh, PA for Big Data Engineers who will work with our team in India to introduce new practices and approaches, design and develop new test frameworks, set up CI/CD pipelines, and integrate them with test automation.

As a Big Data Engineer, you will collect, store, process, and analyze large data sets. Your primary focus will be choosing the optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across the company.

Requirements

  • Proficient understanding of distributed computing principles;
  • Experience managing a Hadoop cluster (Cloudera preferred), with all included services;
  • Ability to resolve ongoing operational issues with the cluster;
  • Proficiency with Hadoop v2, MapReduce, HDFS, Sqoop;
  • Experience in building stream-processing systems, using solutions such as Storm or Spark Streaming;
  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala;
  • Experience in Spark;
  • Experience integrating data from multiple sources, such as Microsoft SQL Server and Oracle;
  • Good understanding of SQL queries, joins, stored procedures, relational schemas;
  • Experience in NoSQL databases, such as HBase, Cassandra, or MongoDB (preferred);
  • Knowledge of various ETL techniques and frameworks, such as Flume;
  • Experience in various messaging systems, such as Kafka or RabbitMQ;
  • Financial services (FS) domain knowledge is a big plus but not required.