The Data Engineer will be responsible for expanding and optimizing the data and data pipeline architecture, data flow and collection for the Data Science team, and for creating APIs to integrate the models with production systems. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support the data analysts and data scientists on data initiatives and will ensure that an optimal data delivery architecture is consistent throughout ongoing projects.
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets from multiple data sources that meet functional / nonfunctional business requirements.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and NoSQL technologies.
- Develop data features that will serve as inputs to AI/Machine Learning/OR techniques.
- Build analytics tools that utilize the data pipeline to provide actionable insights into key business performance metrics.
- Develop data designs based on exploratory data analysis to meet stated business needs.
- Develop procedures to monitor model and production system performance/integrity.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Review and create repeatable solutions through written project documentation, process flowcharts, logs, and clean, commented code to produce datasets that can be used in analytics and/or predictive modeling.
- Act as a subject matter expert in investigating and evaluating emerging technologies. Articulate potential competitive market benefits of new technologies to senior management. Maintain a broad understanding of implementation, integration, and interconnectivity issues with emerging technologies.
Skills And Qualifications
- Possess a Bachelor’s Degree or Master’s Degree in Computer Science, Information Systems or related discipline.
- Minimum 2 years of relevant experience in a similar capacity using software/tools for big data, SQL and NoSQL databases, and object-oriented/object function scripting languages.
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with structured and unstructured datasets.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Possess solid project management and organizational skills.
- Prior experience in supporting and working with cross-functional teams in a dynamic environment.