Hybrid Details: On-site 3+ days per week in Reading, PA (local only).
Duration: 28 weeks to start
Job Description:
Team Leadership & Strategy
- Lead and mentor a team of data architects and engineers.
- Define and drive the data architecture strategy aligned with business goals.
- Collaborate with stakeholders across engineering, product, and business teams.
- Design scalable, secure, and high-performance data architectures.
- Develop data models, data flow diagrams, and system integration strategies.
- Ensure architectural consistency and best practices across projects.
- Oversee the end-to-end implementation of data solutions using Spark, Flink, and AWS.
- Manage project timelines, deliverables, and quality assurance.
- Optimize data processing workflows for performance and cost-efficiency.
- Stay current with emerging technologies and industry trends.
- Evaluate and recommend tools and frameworks to enhance data capabilities.
- Promote automation and CI/CD practices in data engineering workflows.
Role Overview:
We are looking for a seasoned Lead Data Architect with deep hands-on expertise in designing and delivering event-driven architectures and real-time streaming systems. The ideal candidate will have extensive experience with Apache Kafka, Apache Spark Structured Streaming, Apache Flink, and messaging queues, and a strong background in building highly resilient IoT data platforms on AWS.
Key Responsibilities:
Architecture & Design
- Architect and implement scalable, fault-tolerant, and low-latency data pipelines for real-time and batch processing.
- Design event-driven systems using Kafka, Flink, and Spark Structured Streaming.
- Define data models, schemas, and integration patterns for IoT and telemetry data.
- Lead the technical direction of the data engineering team, ensuring best practices in streaming architecture and cloud-native design.
- Provide hands-on guidance in coding, debugging, and performance tuning of streaming applications.
- Collaborate with product, engineering, and DevOps teams to align data architecture with business needs.
- Build and deploy real-time data processing solutions using Apache Flink and Spark Structured Streaming.
- Integrate messaging systems (Kafka, Kinesis, RabbitMQ, etc.) with cloud-native services on AWS.
- Ensure high availability, scalability, and resilience of data platforms supporting IoT and telemetry use cases.
- Continuously evaluate and improve system performance, latency, and throughput.
- Explore emerging technologies in stream processing, edge computing, and cloud-native data platforms.
- Promote DevOps, CI/CD, and infrastructure-as-code practices.
Mandatory Expertise:
- Apache Flink (real-time stream processing)
- Apache Spark Structured Streaming
- Apache Kafka or equivalent messaging queues (e.g., RabbitMQ, AWS Kinesis)
- Event-driven architecture design
- AWS services: S3, Lambda, Kinesis, EMR, Glue, Redshift
Additional Skills:
- Strong programming skills in PySpark, Java, or Python
- Experience with containerization (OpenShift)
- Familiarity with IoT protocols and resilient data ingestion patterns
- Knowledge of data lake and lakehouse architectures (Iceberg) on S3 storage
- Experience in building large-scale IoT platforms or telemetry systems
- AWS Certified Data Analytics or Solutions Architect certification
- Process Flows
- Mentor client project team members and provide knowledge transfer
- Participate as primary author, co-author, or contributing author on all project deliverables in assigned areas of responsibility
- Participate in data conversion and data maintenance
- Provide best-practice and industry-specific solutions
- Advise on and provide alternative (out-of-the-box) solutions
- Provide thought leadership as well as hands-on technical configuration and development as needed
- Participate as a member of the team
- Perform other duties as assigned.