We are currently hiring at our Gurgaon, India office.
We will consider highly qualified candidates in most geographies.
Job Summary:
The Hadoop Architect works closely with the data architecture team and developers to extend the Elevondata data lake framework in support of multiple clients. The variety of data across our clients' enterprises is extensive and includes substantial amounts of unstructured and semi-structured data.
This individual will support our Senior Data Management Advisors in a technical capacity, owning the details of the solution architecture for the data layer with a focus on scalability, manageability, performance, security, data lineage, and metadata.
The position requires a thorough understanding of Hadoop ecosystem tools for extending and building data ingestion/storage, data transformation, and data load routines. It also requires an understanding of data provisioning to downstream consumers and processes, and of eventual consumption by analysts for exploratory querying (Spark SQL, Impala, Python, etc.).
Role Description:
The Hadoop Architect helps design and architect the data ingestion, ELT, and data quality platform.
Responsibilities:
- Build, Deploy and Support custom data management applications on Hadoop
- Build, Deploy and Support ETL/ELT processes to load data into Hadoop/NoSQL, using scheduling concepts and tools in the Hadoop ecosystem
- Design, Build, Deploy and Support schemas for data acquisition, transformations, and data integration
- Build, Deploy and Support solutions for metadata, data quality, and security management
- Build, Deploy and Support the Hadoop data ecosystem, in-situ query, and web services
- Tune the performance of Hadoop/NoSQL environments
Qualifications:
- Bachelor’s degree in Computer Science or equivalent experience
- Specific experience required with Hadoop (HDFS) and the associated Apache open source ecosystem (Hive, Pig, MapReduce, HBase, Sqoop, et al.)
- ELT experience (e.g., Pentaho Kettle or Talend) and Sqoop
- Exposure to NoSQL/columnar data stores such as Cassandra, InfiniDB, and ParAccel (Amazon Redshift), and document stores such as MongoDB, et al.
- Deep proficiency with data management – traditional RDBMS (Oracle, MS SQL Server, et al.), MPP appliances (Netezza, Teradata, et al.), and open source DBMS (MySQL, PostgreSQL, et al.)
- Working knowledge of the services layer – Java, HiveQL, RESTful services, JSON, Maven, Subversion, JIRA, Eclipse, et al.
- Ability to wear multiple hats across the software development life cycle: requirements, design, code development, QA/testing, and deployment
- Experience with Object-Oriented Programming Languages (Java, C++, Python)
- Excellent problem-solving skills
- Excellent oral and written communication skills
- Working knowledge of disaster recovery (DR) in the Hadoop ecosystem
- Working knowledge of handling fine-grained entitlements in the Hadoop ecosystem
Highly Desired Qualifications:
- A minimum of 8-10 years developing data warehouses or digital analytics solutions, including 2-4 years in the Hadoop ecosystem
- Exposure to AWS Big Data Architecture and Services
- Experience with the MRv2/YARN cluster manager and implementation experience with at least one commercial Hadoop distribution (Cloudera, Hortonworks, MapR, IBM, etc.)
- Experience working in an agile team environment.