Data Architect

Remote: Full Remote
Experience: Senior (5-10 years)
Turtle Trax S.A. (Startup, 2 - 10 Employees)
https://www.turtle-trax.com/

Job description

SUMMARY

As a hands-on Data Architect, you will play a crucial role in designing, building, and optimizing the data architecture for our next-generation SaaS platform. This position requires expertise in event-driven data architectures (e.g., Apache Kafka) and emerging technologies such as vector databases to support Generative AI applications. You will be deeply involved in implementing scalable, high-performance data systems that drive real-time analytics, AI applications, and dynamic data processing. Experience in the Utility industry and knowledge of AWS are preferred, as we seek to optimize data systems that cater to the unique demands of this sector.

The development organization leverages Java, Spring Boot, AWS RDS (Postgres, SQL Server), Oracle, AWS Serverless technologies (Lambda, SQS), REST, JavaScript, and Mobile development with React Native, hosted in AWS, using Atlassian tools (Jira, Bitbucket, and Confluence).
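
For illustration only, the sketch below shows what the serverless side of such a stack can look like: a Java AWS Lambda handler consuming SQS messages, written against the public aws-lambda-java-core and aws-lambda-java-events libraries. This is not code from our codebase; the class name and payload handling are assumptions.

    // Minimal sketch of an AWS Lambda handler that consumes SQS messages.
    // Class name and payload handling are illustrative assumptions.
    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.RequestHandler;
    import com.amazonaws.services.lambda.runtime.events.SQSEvent;

    public class IngestHandler implements RequestHandler<SQSEvent, Void> {
        @Override
        public Void handleRequest(SQSEvent event, Context context) {
            for (SQSEvent.SQSMessage message : event.getRecords()) {
                // A real handler would parse the payload and write to RDS or a downstream topic.
                context.getLogger().log("received: " + message.getBody());
            }
            return null;
        }
    }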

Dealbreakers:

Hands-on experience with Python and/or Java is required. Must have practical experience with data streaming use cases. Strong verbal and written communication skills are essential for engaging with stakeholders in technical roles.

Highlighted Responsibilities:

This person is directly responsible for maintaining a consistent focus on all aspects of data and for collaborating with business and technical stakeholders to implement a robust set of data capabilities consistent with our functional and non-functional requirements.

JOB FUNCTIONS

Duties and Responsibilities

  • Design & Build Data Infrastructure: Architect and implement scalable, high-performance data infrastructure focusing on event-driven architectures, real-time data streaming, and advanced AI-driven applications.
  • Event-Driven Data Solutions: Develop event-driven systems leveraging tools like Apache Kafka or similar technologies to support real-time data processing and low-latency pipelines (a minimal consumer sketch follows this list).
  • Hands-on Development: Actively develop and maintain data pipelines, ETL/ELT processes, and event-streaming solutions using Apache Kafka, Apache Flink, Apache Spark, or similar tools, as well as AI-specific data systems.
  • Database Management: Manage and optimize SQL, NoSQL, OLAP, and vector databases to ensure high availability, scalability, and performance. Apply deep knowledge of database internals and concepts such as partitioning, sharding, embeddings, distributed database systems, and change data capture (CDC) to drive efficiency and reliability across complex, large-scale environments.
  • Data Integration: Build real-time and batch data pipelines that integrate structured and unstructured data from various sources, including AI models and third-party data sources.
  • Performance Tuning: Continuously monitor and optimize data systems for performance, ensuring that AI workloads are supported by highly efficient data pipelines and storage solutions.
  • Collaboration: Work closely with product managers, software engineers, and data scientists to align event-driven architectures, vector databases, and data pipelines with the needs of AI and machine learning models.
  • Cloud Architecture: Architect and manage cloud-based data solutions (AWS preferred) that support distributed data processing, AI workloads, and real-time data streaming.
  • Vector Databases: Design and implement vector-based databases (e.g., Pinecone, pg_vector, Milvus) to support machine learning models, including Generative AI applications, efficiently handling high-dimensional data such as embeddings and unstructured data (a query sketch also follows this list).
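
As referenced in the Event-Driven Data Solutions item above, the following is a minimal sketch of a Kafka consumer in Java using the public kafka-clients library. The broker address, consumer group, and the meter-readings topic are placeholder assumptions, not specifics of this role.

    // Minimal event-consumer sketch using the Apache Kafka client library (kafka-clients).
    // Broker address, consumer group, and topic name are placeholder assumptions.
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class MeterReadingConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder broker
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "meter-reading-pipeline");   // placeholder group
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("meter-readings"));  // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Downstream processing (enrichment, aggregation, loading) would go here.
                        System.out.printf("key=%s value=%s offset=%d%n",
                                record.key(), record.value(), record.offset());
                    }
                }
            }
        }
    }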
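As referenced in the Vector Databases item above, here is a minimal sketch of a similarity query against Postgres with the pgvector extension (pg_vector is one of the options named), using plain JDBC with the org.postgresql driver. The documents table, embedding column, vector dimension, and connection details are illustrative assumptions.

    // Minimal nearest-neighbour lookup against Postgres with the pgvector extension,
    // via plain JDBC (org.postgresql driver assumed). Table, column, dimension, and
    // connection details are illustrative assumptions.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class EmbeddingSearch {
        public static void main(String[] args) throws Exception {
            float[] queryEmbedding = new float[1536];  // would come from an embedding model
            String vectorLiteral = "[" + IntStream.range(0, queryEmbedding.length)
                    .mapToObj(i -> Float.toString(queryEmbedding[i]))
                    .collect(Collectors.joining(",")) + "]";

            String url = "jdbc:postgresql://localhost:5432/appdb";  // placeholder connection
            try (Connection conn = DriverManager.getConnection(url, "app", "secret");
                 PreparedStatement stmt = conn.prepareStatement(
                         // "<->" is pgvector's L2 distance operator; <=> (cosine) and <#> (inner product) also exist.
                         "SELECT id, content FROM documents ORDER BY embedding <-> ?::vector LIMIT 5")) {
                stmt.setString(1, vectorLiteral);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + ": " + rs.getString("content"));
                    }
                }
            }
        }
    }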

Requirements:

  • Bachelor’s degree in Computer Science, Mathematics, Electrical Engineering, or equivalent knowledge and experience.
  • 7+ years’ experience as a Data Architect or in a similar data engineering role, with direct involvement in designing and implementing event-driven architectures.
  • Expertise in vector databases (e.g., Pinecone, Weaviate, Milvus) and their application in Generative AI and other machine learning models, including managing high-dimensional data and embeddings.
  • Strong understanding of Generative AI applications and how to build data pipelines and infrastructure to support them.
  • Proficiency in programming (Python, Java, or similar languages) with the ability to write clean, efficient code for event-driven data pipelines and AI-driven data architectures.
  • Experience with real-time data streaming, ETL/ELT processes, and tools like Apache Kafka, Apache Flink, Kinesis, etc.
  • Extensive experience with cloud-based data architectures and distributed systems.
  • Deep understanding of database technologies (SQL, NoSQL, OLAP, vector) and performance optimization for AI workloads.
  • Strong problem-solving skills and a hands-on approach to addressing technical challenges.
  • Experience in the SaaS industry or building scalable data systems for AI-powered products.

Preferred Qualifications:

  • Experience in the Utility industry is a plus.
  • Familiarity with modern data visualization tools (e.g., Tableau, Looker) and BI platforms.

Production Support/On-Call Duties:
As a key member of our engineering team, you will address escalated production issues from customer support. Your responsibilities will include:

  • Participating in a rotational on-call schedule to handle significant production issues.
  • Rapidly diagnosing and resolving technical challenges that arise in production.
  • Collaborating with customer support and engineering teams for seamless issue resolution.
  • Maintaining clear communication and documentation during and after incidents.
  • Leveraging these experiences to contribute to continuous process improvement.

Compensatory Time for On-Call Work:
We value work-life balance and recognize the extra effort required during on-call rotations. For hours spent actively working while on call, compensatory time off is provided, unless the law requires otherwise. This ensures your commitment is appropriately acknowledged. Please coordinate with your manager regarding the approval and scheduling of compensatory time to align with team needs and workload.

Your contribution is essential in maintaining the smooth operation of our systems and in upholding high standards of customer satisfaction.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Communication
  • Problem Solving
