Our mission at Progenity is simple: to help healthcare providers and patients prepare for life. We provide the most advanced molecular technology and the highest levels of service to guide patient care at critical life stages. We continually seek people with the motivation and skills to advance our mission.
Reporting to the Sr. Director of Scientific Computing, the Director of Data Systems and Science focuses on the management and integration of scientific data sets and on the design, implementation, and management of a data science platform to support the overall organization. This role will collaborate closely with our customers to design and implement a life sciences data lifecycle management process that enables the organization to effectively catalog and manage the lifecycle of a multi-petabyte working set of life sciences data. This role will also partner closely with our science teams to develop a scalable self-service data science platform to support our scientific customers as well as the overall organization. Working across the organization, this role will identify and act on opportunities to continuously improve our processes toward operational efficiency, with an emphasis on life sciences and related data. Leveraging experience in clinical life sciences, this role will partner across IT to support initiatives to incorporate and gain value from life sciences data within enterprise systems.
RESPONSIBILITIES
- Develops, implements, and maintains a data lifecycle and archiving process for a multi-petabyte working set of life sciences data stored across multiple platforms.
- Leverages knowledge of genomics, life sciences data, and metadata to integrate vast data sets into the larger IT enterprise architecture and systems.
- Designs, implements, and operates a self-service-focused data science platform supporting the organization and enabling machine learning and in-depth analytics of scientific and other data sets.
- Leads and builds a team of engineers developing new capabilities for life sciences data and data science within the organization.
- Supports the overall IT enterprise architecture.
- Leverages experience to implement best practices for automation and service delivery and to drive continuous improvement of existing life sciences data systems.
- Contributes to a DevOps environment through tools, methodology, and systems experience.
- Identifies opportunities for continuous improvement within the company, with an emphasis on data structures, platforms, integration, and science, and works to drive those forward and achieve strong business value through those efforts.
This list of duties and responsibilities is not all-inclusive and may be expanded to include other duties and responsibilities, as deemed necessary.
REQUIREMENTS
- Production experience managing genomic data at scale within a regulated clinical environment.
- Extensive working knowledge of AWS and GCP platforms.
- At least 3 years supporting genomic or other life sciences customers in production.
- Fluent in Python.
- Familiar with machine learning concepts and algorithms such as linear and logistic regression, recommendation systems, gradient tree boosting, and anomaly detection, and with TensorFlow, Keras, or other machine learning frameworks.
- Working knowledge of JupyterHub/Jupyter Notebooks, Zeppelin Notebooks, or other data science and analytics platforms.
- Experience operating in a regulated, clinical environment and implementing systems that follow GxP practices.
- Self-motivated and able to work autonomously to seek out process improvements and efficiency gains and to identify opportunities to deliver improved customer satisfaction and value to the organization.