Data Science Intern
Description
At BioAgilytix, we are passionate about premier science and the impact it has on our world. Our team of highly experienced scientists and professionals delivers tailored services for supporting new medicine breakthroughs with best-in-class bioanalytical services. We are tirelessly committed to our customers by being solution-oriented and deadline-driven, and we are growing. Our culture is fast-paced, fun and never boring. Because we work across numerous clients and drug modalities, your career can develop rapidly. You’ll gain experience with a variety of challenges while helping bring life-changing, life-saving therapeutics to the patients who need them.
The Data Science Intern will support initiatives to modernize how bioanalytical and statistical data are organized, standardized, and analyzed across the organization. This role will focus on aggregating historical statistical inputs and outputs (e.g., Excel, SAS JMP, and LIMS datasets used in immunogenicity cut point determination and other statistical analyses), transforming them into standardized data structures, and helping establish a centralized data framework that enables future analytics, visualization, and machine learning applications. Working closely with statisticians, scientists, and operational leaders, the intern will contribute to building foundational data infrastructure that improves data accessibility, consistency, and long-term analytical capability.
This is a summer internship with expectations of no more than 30-40 hours worked per week. Applicants must currently be enrolled in a college or university degree program, majoring in a relevant quantitative or technical field (e.g., Data Science, Bioinformatics). The internship is expected to run from the end of May through August 2026.
Essential Responsibilities:
Aggregate historical statistical datasets generated by the in-house biostatisticians, including Excel, SAS JMP, and other structured data sources used in bioanalytical workflows
Design and implement processes to standardize data structures, variable definitions, and metadata across disparate datasets
Develop scripts or pipelines (e.g., Python or R) to clean, transform, and consolidate data into a centralized dataset or database
Work with subject matter experts in immunogenicity bioanalysis and statistics to interpret existing data structures and ensure accurate representation of scientific context
Build reusable code or workflows to support ongoing ingestion of new statistical outputs as projects progress
Assist with development of data dictionaries, schema definitions, and documentation describing standardized datasets
Explore potential analytical or visualization use cases for the centralized dataset, such as dashboards, exploratory analyses, statistical summaries, and machine learning prototypes
Present findings and progress updates to stakeholders and leadership
Minimum Preferred Qualifications: Education & Experience
Rising junior or senior at a 4-year college or university, majoring in a relevant quantitative or technical field (e.g., Data Science, Bioinformatics), or a student pursuing or anticipating graduation from a Bachelor’s or Master’s degree program in such a field
Minimum Preferred Qualifications: Skills
Programming experience in Python or R, including use of data analysis libraries (e.g., pandas, tidyverse)
Experience working with structured datasets (CSV, Excel, relational databases)
Familiarity with data cleaning, transformation, and feature engineering
Basic knowledge of SQL or relational database concepts
Experience with data visualization tools (e.g., Plotly, Tableau, Power BI, matplotlib, seaborn) is a plus
Exposure to machine learning workflows is helpful, but not required
Ability to identify inconsistencies in data and design solutions to standardize them
Strong attention to detail when working with complex datasets
Ability to translate loosely structured datasets into clean analytical formats
Strong communication skills
Ability to work with scientists, statisticians, and operational stakeholders to understand data requirements
Comfort working with partially structured or legacy datasets
Supervision Received:
Frequent supervision and instructions
Infrequently exercises discretionary authority
Working Environment:
Primarily office (i.e., blend of office and work-from-home)
Routinely uses standard office equipment such as computers, phones, photocopiers, and filing cabinets
Physical Demands:
Ability to work in an upright and/or stationary position for up to eight (8) hours per day
Repetitive movement of both hands, with the ability to make fast, simple, repeated movements of the fingers, hands, and wrists to operate lab equipment
Frequent mobility needed
Frequent crouching and stooping, with frequent bending and twisting of upper body and neck
Light to moderate lifting and carrying (or otherwise moving) of objects, including laboratory equipment, laboratory supplies, and laptop computer, with a maximum lift of 20 pounds
Ability to access and use a variety of computer software
Ability to communicate information and ideas so others will understand, with the ability to listen to and understand information and ideas presented through spoken words and sentences
Frequently interacts with others to obtain or relate information to diverse groups
Requires multiple periods of intense concentration
Performs a wide range of variable tasks as dictated by variable demands and changing conditions with little predictability as to the occurrence
Ability to perform under stress and multi-task
Regular and consistent attendance
Position Type and Expected Hours of Work:
This position is temporary with expectations of no more than 30-40 hours worked per week for 12 weeks.
This position is not benefit eligible.
Some flexibility in hours is allowed, but the employee must be available during the “core” work hours as published in the BioAgilytix Employee Handbook.
Details
- Location: Durham, NC
- Term: Summer 2026
- Posted: 3/14/2026