Data Science Intern

Description

At BioAgilytix, we are passionate about premier science and the impact it has on our world. Our team of highly experienced scientists and professionals delivers tailored, best-in-class bioanalytical services that support new medicine breakthroughs. We are tirelessly committed to our customers by being solution-oriented and deadline-driven, and we are growing. Our culture is fast-paced, fun, and never boring. Because we work across numerous clients and drug modalities, your career can develop rapidly. You’ll gain experience with a variety of challenges, all while you help bring life-changing, life-saving therapeutics to the patients who need them.

The Data Science Intern will support initiatives to modernize how bioanalytical and statistical data are organized, standardized, and analyzed across the organization. This role will focus on aggregating historical statistical inputs and outputs (e.g., Excel, SAS JMP, and LIMS datasets used in immunogenicity cut point determination and other statistical analyses), transforming them into standardized data structures, and helping establish a centralized data framework that enables future analytics, visualization, and machine learning applications. Working closely with statisticians, scientists, and operational leaders, the intern will contribute to building foundational data infrastructure that improves data accessibility, consistency, and long-term analytical capability.

This is a Summer Internship with an expectation of no more than 30-40 hours worked per week. Applicants must currently be enrolled in a college or university degree program, majoring in a relevant quantitative or technical field (e.g., Data Science, Bioinformatics). The internship is expected to run from the end of May through August 2026.

Essential Responsibilities:

Aggregate historical statistical datasets generated by in-house biostatisticians, including Excel, SAS JMP, and other structured data sources used in bioanalytical workflows

Design and implement processes to standardize data structures, variable definitions, and metadata across disparate datasets

Develop scripts or pipelines (e.g., Python or R) to clean, transform, and consolidate data into a centralized dataset or database

Work with subject matter experts in immunogenicity bioanalysis and statistics to interpret existing data structures and ensure accurate representation of scientific context

Build reusable code or workflows to support ongoing ingestion of new statistical outputs as projects progress

Assist with development of data dictionaries, schema definitions, and documentation describing standardized datasets

Explore potential analytical or visualization use cases for the centralized dataset, such as dashboards, exploratory analyses, statistical summaries, and machine learning prototypes

Present findings and progress updates to stakeholders and leadership

Minimum Preferred Qualifications: Education & Experience

Rising junior or senior at a 4-year college or university who is pursuing, or anticipates graduating with, a Bachelor’s or Master’s degree in a relevant quantitative or technical field (e.g., Data Science, Bioinformatics)

Minimum Preferred Qualifications: Skills

Programming experience in Python or R, including use of data analysis libraries (e.g., pandas, tidyverse)

Experience working with structured datasets (CSV, Excel, relational databases)

Familiarity with data cleaning, transformation, and feature engineering

Basic knowledge of SQL or relational database concepts

Experience with data visualization tools (e.g., Plotly, Tableau, Power BI, matplotlib, seaborn) is a plus

Exposure to machine learning workflows is helpful, but not required

Ability to identify inconsistencies in data and design solutions to standardize them

Strong attention to detail when working with complex datasets

Ability to translate loosely structured datasets into clean analytical formats

Strong communication skills

Ability to work with scientists, statisticians, and operational stakeholders to understand data requirements

Comfort working with partially structured or legacy datasets

Supervision Received:

Frequent supervision and instructions

Infrequently exercises discretionary authority

Working Environment:

Primarily office-based, with a blend of office and work-from-home

Routinely uses standard office equipment such as computers, phones, photocopiers, and filing cabinets

Physical Demands:

Ability to work in an upright and/or stationary position for up to eight (8) hours per day

Repetitive movement of both hands, with the ability to make fast, simple, repeated movements of the fingers, hands, and wrists to operate lab equipment

Frequent mobility needed

Frequent crouching and stooping, with frequent bending and twisting of the upper body and neck

Light to moderate lifting and carrying (or otherwise moving) of objects, including laboratory equipment, laboratory supplies, and a laptop computer, with a maximum lift of 20 pounds

Ability to access and use a variety of computer software

Ability to communicate information and ideas so others will understand, with the ability to listen to and understand information and ideas presented through spoken words and sentences

Frequently interacts with others to obtain or relate information to diverse groups

Requires multiple periods of intense concentration

Performs a wide range of tasks as dictated by variable demands and changing conditions, with little predictability as to their occurrence

Ability to perform under stress and multi-task

Regular and consistent attendance

Position Type and Expected Hours of Work:

This position is temporary, with an expectation of no more than 30-40 hours worked per week for 12 weeks.

This position is not benefit eligible.

Some flexibility in hours is allowed, but the employee must be available during the “core” work hours as published in the BioAgilytix Employee Handbook.

Details

Location: Durham, NC
Term: Summer 2026
Posted: 3/14/2026