Information Technology | Contractor | Mid-level (3-5 yrs)
Job Description
The Research Officer - Data Management will lead the data engineering efforts for a multi-country study (covering nine to ten countries). This role is critical in managing the extraction, transformation, validation, and security of large health datasets to prepare them for advanced analysis by researchers and statisticians.
Role Overview
KEMRI is seeking a professional to manage the full data engineering pipeline. The successful candidate will ensure that data flows seamlessly from health facilities across multiple jurisdictions while maintaining strict compliance with national data laws and security protocols.
Key Responsibilities
Pipeline Management: Oversee the full data engineering pipeline (extraction, transformation, validation, documentation, and version control) for all study datasets across multiple countries.
ETL & Workflow Design: Design and maintain extract-transform-load (ETL) workflows, data dictionaries, and validation rules.
Data Quality Assurance: Implement and maintain data quality frameworks covering completeness, internal consistency, timeliness, and audit trails.
Monitoring & Reporting: Run routine data quality (DQ) checks and produce facility-level DQ scorecards that give facilities feedback on data gaps.
Security & Compliance: Maintain secure country-specific servers, encryption, and anonymization processes. Manage access control registries and audit logs in compliance with national data laws.
Federated Analysis: Coordinate and support in-country teams in running script-based federated analysis workflows, using centrally developed code so that raw data is never exported.
Dataset Preparation: Produce quarterly clean datasets and documentation packs (including README, data schemas, and provenance) with harmonized structures ready for analysis.
Collaboration: Partner with analysts and statisticians to support data linkage, model preparation, and metadata curation.
Qualifications and Requirements
Education: Master’s degree in Biostatistics, Health Informatics, or Data Management (Mandatory).
Experience: At least 3 years of work experience managing large routine health datasets and health facility assessments (Mandatory).
Technical Proficiency: Advanced skills in SQL, Python, or R (Mandatory).
Knowledge: Expertise in data modelling, data quality frameworks, and security best practices (Mandatory).
How to Apply
Interested and qualified candidates should apply online through the KEMRI e-recruitment portal at erecruitment.kemri.go.ke. You can also view the application link via the MyJobMag portal.