Description:
Position Overview: We are seeking a skilled and motivated Data Engineer to join our team. The ideal candidate will play a key role in designing, building, and maintaining the data infrastructure and pipelines necessary to collect, process, and analyze large datasets. This role requires expertise in data architecture, ETL processes, and big data technologies to support advanced analytics and business decision-making.
Key Responsibilities:
- Data Pipeline Development:
  - Design, build, and maintain scalable data pipelines for efficient extraction, transformation, and loading (ETL) of data.
  - Optimize data flow and collection from various sources to centralized data systems.
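As a rough illustration of the extract/transform/load pattern this role centers on, here is a minimal sketch; toy CSV data and an in-memory SQLite database stand in for real sources and a real warehouse, and the column names (`name`, `amount`) are invented for the example:

```python
import csv
import sqlite3
from io import StringIO

def extract(raw_csv):
    """Extract: parse raw CSV text into dict rows."""
    return list(csv.DictReader(StringIO(raw_csv)))

def transform(rows):
    """Transform: normalize names and cast dollar amounts to integer cents."""
    return [
        (row["name"].strip().title(), int(round(float(row["amount"]) * 100)))
        for row in rows
    ]

def load(records, conn):
    """Load: write transformed records into a target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()

# Toy input: messy casing and whitespace that the transform step cleans up.
raw = "name,amount\n alice ,10.50\nBOB,3.25\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(raw)), conn)
print(conn.execute("SELECT name, amount_cents FROM sales").fetchall())
```

In a production pipeline, each stage would typically be a separate, independently testable and retryable step orchestrated by a scheduler rather than three inline function calls.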
- Data Infrastructure Management:
  - Develop and manage data warehouses, databases, and other storage solutions.
  - Ensure the data architecture supports business intelligence and machine learning initiatives.
- Collaboration with Teams:
  - Work closely with data analysts, data scientists, and business stakeholders to understand data requirements.
  - Translate business needs into technical specifications for data workflows.
- Data Quality and Governance:
  - Implement processes for data quality assurance, validation, and error handling.
  - Establish and maintain data governance standards to ensure data integrity and security.
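One common shape for the validation work described above is a set of per-record checks that quarantine bad rows rather than failing the whole pipeline. The sketch below is illustrative only; the fields (`id`, `amount`, `country`) and rules are hypothetical:

```python
def validate(record, errors):
    """Run basic quality checks on one record; collect failures instead of raising."""
    checks = [
        ("missing id", record.get("id") is not None),
        ("non-positive amount",
         isinstance(record.get("amount"), (int, float)) and record["amount"] > 0),
        ("bad country code",
         isinstance(record.get("country"), str) and len(record["country"]) == 2),
    ]
    failed = [name for name, ok in checks if not ok]
    if failed:
        errors.append((record.get("id"), failed))  # quarantine for later review
        return False
    return True

records = [
    {"id": 1, "amount": 9.99, "country": "US"},
    {"id": 2, "amount": -5.00, "country": "US"},    # fails the amount check
    {"id": None, "amount": 3.00, "country": "USA"}, # fails id and country checks
]
errors = []
clean = [r for r in records if validate(r, errors)]
print(len(clean), errors)
```

Collecting failures into an error log (instead of raising on the first bad row) is what makes the "error handling" half of the responsibility practical at scale.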
- Big Data and Cloud Technologies:
  - Utilize big data technologies (e.g., Hadoop, Spark) and cloud platforms (e.g., AWS, Azure, Google Cloud) to manage and process large datasets.
  - Optimize data storage and processing costs on cloud infrastructure.
- Performance Optimization:
  - Monitor and improve the performance of data pipelines and infrastructure.
  - Resolve data-related issues and implement solutions for scalability and efficiency.
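Monitoring pipeline performance often starts with simply instrumenting each stage. A minimal sketch of that idea, using a decorator to record wall-clock duration per run (the stage itself is a placeholder):

```python
import time
from functools import wraps

def monitored(fn):
    """Record the wall-clock duration of each call to a pipeline stage."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.timings.append(time.perf_counter() - start)
        return result
    wrapper.timings = []
    return wrapper

@monitored
def stage(rows):
    # Placeholder transform standing in for a real pipeline step.
    return [r * 2 for r in rows]

stage(range(1000))
print(f"stage ran {len(stage.timings)} time(s), last took {stage.timings[-1]:.6f}s")
```

In practice these timings would be shipped to a metrics system so that regressions in pipeline runtime surface before they become outages.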
- Documentation and Reporting:
  - Document data workflows, processes, and architecture designs.
  - Prepare reports and dashboards to provide insights into data operations and performance.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- Proven experience as a Data Engineer or in a similar role.
- Proficiency in programming languages such as Python, Java, or Scala.
- Strong knowledge of SQL and relational database systems (e.g., PostgreSQL, MySQL).
- Experience with big data tools and frameworks (e.g., Hadoop, Spark, Kafka).
- Familiarity with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery).
- Hands-on experience with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
- Knowledge of data modeling, schema design, and ETL/ELT processes.
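To illustrate the data-modeling and schema-design knowledge listed above, here is a toy star schema: one fact table referencing two dimension tables, queried with a typical join-and-aggregate. SQLite stands in for a real warehouse, and the table and column names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A minimal star schema: a fact table keyed to two dimension tables.
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_date VALUES (20240101, '2024-01-01');
INSERT INTO dim_product VALUES (1, 'Widget');
INSERT INTO fact_sales VALUES (20240101, 1, 3, 29.97);
""")

# Typical analytical query: join the fact table to a dimension and aggregate.
row = conn.execute("""
    SELECT p.name, SUM(f.quantity), SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.name
""").fetchone()
print(row)
```

The same dimensional pattern carries over directly to warehouse platforms such as Snowflake, Redshift, or BigQuery, where fact and dimension tables are the usual substrate for BI dashboards.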