Unlocking Genomes

How High School Students Are Doing Real Scientific Research

High school students in Western New York are using the same tools as scientists to annotate genes and contribute to genomic research.

Introduction: The Classroom Meets Cutting-Edge Science

In a quiet laboratory, a scientist analyzes genetic sequences using sophisticated bioinformatics tools. Meanwhile, in classrooms across Western New York, high school students are doing the exact same work — annotating genes and making discoveries about bacterial DNA using professional-grade research tools.

Thanks to an innovative educational program, hundreds of high school students have transitioned from passive science learners to active genomic researchers. They're part of the Western New York Genetics in Research Partnership (WNYGRP), a program that demonstrates age is no barrier to contributing to meaningful scientific research.

Program Impact
Students reached: ~2,000
Teachers trained: 74
Research participants: 343

What Is Gene Annotation? Decoding Life's Blueprint

Before understanding how high school students are doing genomic research, we need to answer a fundamental question: what is gene annotation?

Think of a genome as an encyclopedia without any punctuation or chapter headings, just endless strings of letters representing DNA sequences. Gene annotation is the process of identifying where genes start and stop, what functions they perform, and how they're regulated. It's like adding the punctuation, headings, and explanatory notes that transform meaningless strings of letters into understandable information.

One comment from the program captures the appeal: "The idea of exposing students to real science was very enticing to me" [2]. Gene annotation is science at its most real and impactful.

Gene Annotation Process

1. Identify Gene Locations: Find where genes start and stop in the DNA sequence.

2. Determine Gene Function: Predict what proteins the genes encode and their roles.

3. Analyze Regulation: Understand how gene expression is controlled.

4. Database Submission: Contribute findings to genomic databases.
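
To make the first step concrete, here is a minimal Python sketch of how software might scan a DNA string for open reading frames, the stretches between a start codon and a stop codon that mark candidate genes. It is an illustration only, not the pipeline GENI-ACT uses. The function name find_orfs, the min_codons threshold, and the choice to treat ATG as the only start codon and to scan just the forward strand are all simplifying assumptions (bacteria also use alternative start codons, and real gene finders examine both strands).

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs of forward-strand open reading frames.

    Hypothetical helper for illustration. `end` is exclusive and includes
    the stop codon. `min_codons` counts the start and stop codons too.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None                            # index of the current ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a candidate gene
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # close it and keep scanning
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))  # → [(2, 14)]
```

In the example, the reported span covers ATGAAATTTTAG: a start codon, two ordinary codons, and a stop codon. Deciding whether such a span is a real gene, and what it does, is the work the later steps address.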

The GENI-ACT Program: Bringing Research to the Classroom

The Genomics Education National Initiative-Annotation Collaboration Toolkit (GENI-ACT) program has created a three-part educational pipeline that introduces both teachers and students to authentic genomic research:

1. Teacher Professional Development

During summer workshops, teachers receive intensive training in genome annotation through nine customized modules covering everything from basic DNA sequence analysis to predicting protein functions and cellular localization [2].

These workshops equip educators with both the content knowledge and practical skills needed to guide their students.

2. Student Research Projects

During the school year, students work in groups to annotate specific genes from the bacterium Kytococcus sedentarius, using the same GENI-ACT platform originally developed for undergraduate education [1, 2].

Under their teachers' guidance, they apply the nine modules to understand their assigned gene's structure and function.

3. Capstone Symposium

The program culminates in a poster symposium where students present their findings to peers and scientists, transforming them from passive learners into confident young researchers capable of discussing their work with the scientific community [4].

Inside the Student Scientist Experience: A Step-by-Step Research Journey

So what does a student researcher actually do in the GENI-ACT program? Let's follow the research process step by step:

The Methodology: Nine Modules to Discovery

Students work through a series of bioinformatics analyses, each revealing different aspects of their gene's function and characteristics [2]. Working through all nine modules gives students a complete picture of their gene's role.

| Module Category | Specific Tools Used | Research Question Addressed |
|---|---|---|
| Basic Information | DNA Coordinates, Protein Sequence | What is the gene's sequence and location? |
| Sequence Similarity | BLAST, COG, T-Coffee, WebLogo | How similar is it to other known proteins? |
| Structural Similarity | TIGRFAM, Pfam, PDB | What functional domains does it contain? |
| Cellular Localization | TMHMM, SignalP, PSORTb | Where is the protein located in the cell? |
| Enzymatic Function | KEGG, MetaCyc, E.C. Numbers | What biological processes is it involved in? |
| Gene Duplication | Paralog, Pseudogene analysis | Are there related genes in the same genome? |
| Evolutionary History | Phylogenetic Trees, GC Content | How did the gene evolve? |
| RNA Analysis | Rfam | Does it encode functional RNA? |
| Final Annotation | Data synthesis from all modules | Has the gene been correctly identified? |

Forming and Testing Hypotheses

A crucial part of the student experience involves evaluating the quality of automated gene predictions. As one program description explains: "During the school year, students were asked to evaluate the data they had collected, formulate a hypothesis about the correctness of the computer pipeline annotation, and present the data to support their conclusions" [1].

This process teaches critical scientific thinking: rather than simply accepting computational results, students question, verify, and refine existing annotations, as working scientists do.
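
One of the simpler pieces of evidence a student might weigh is GC content, listed under the Evolutionary History module. The Python sketch below is only an illustration of that calculation. The helper name gc_content and the example sequence are hypothetical and are not taken from the program's materials.

```python
def gc_content(seq):
    """Return the fraction of bases in `seq` that are G or C (0.0 to 1.0).

    Hypothetical helper for illustration; it does no validation beyond
    rejecting an empty sequence.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


print(gc_content("GGCCAATT"))  # → 0.5
```

A gene whose GC fraction differs sharply from the rest of its genome can hint that it was acquired from another organism, which is the kind of observation a student could cite when arguing for or against an automated annotation.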

Presenting Research Findings

The experience culminates with students creating scientific posters and presenting their findings at a capstone symposium. One participating student captured the program's impact:

"The program taught me a lot about genes and has opened my mind to consider a career/major in genetic engineering" [4].

The Scientist's Toolkit: Essential Resources for Genomic Research

What tools do these student researchers use to conduct their work? The GENI-ACT platform provides an integrated suite of bioinformatics resources that mirror those used in professional laboratories worldwide.

| Tool Name | Type of Analysis | Function in Research |
|---|---|---|
| BLAST | Sequence similarity | Finds regions of similarity between biological sequences |
| Pfam | Protein families | Identifies functional domains and protein families |
| SignalP | Protein localization | Predicts presence of signal peptide sequences |
| TMHMM | Membrane topology | Predicts transmembrane helices in proteins |
| T-Coffee | Sequence alignment | Creates multiple sequence alignments |
| KEGG | Metabolic pathways | Maps genes to biological pathways and systems |
| PDB (Protein Data Bank) | 3D structure | Provides information about protein three-dimensional structures |
| Rfam | RNA families | Identifies non-coding RNA genes and families |

These tools are freely available online, making sophisticated genomic research accessible to educational institutions with limited budgets [2]. As the program developers note, this means "only computer and internet access were needed to take part in the project" [2].

BLAST

The Basic Local Alignment Search Tool (BLAST) finds regions of similarity between biological sequences. It compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. Students use it to find similar sequences in public databases.
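
The idea of scoring a region of similarity can be illustrated with a small Python sketch. It uses the classic Smith-Waterman dynamic-programming recurrence, which BLAST approximates with much faster heuristics. It is not BLAST's actual algorithm, and it leaves out the statistical significance step entirely. The function name and the scoring values (match +2, mismatch -1, gap -1) are illustrative assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best Smith-Waterman local alignment score of `a` and `b`.

    Hypothetical helper for illustration. Uses a linear gap penalty and
    keeps only two rows of the scoring matrix in memory.
    """
    prev = [0] * (len(b) + 1)   # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                    # start a fresh local alignment here
                prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                prev[j] + gap,        # gap in b
                curr[j - 1] + gap,    # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(local_alignment_score("ACACACTA", "AGCACACA"))  # → 12
```

A higher score means a longer or cleaner stretch of shared sequence. A database search tool repeats this kind of comparison against many sequences and then asks how likely each score would be to arise by chance.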

Pfam

Pfam is a database of protein families and domains. Students use it to identify functional domains in protein sequences, classify proteins into families, and predict domain organization.

Measurable Success: Program Impact on Students and Teachers

The GENI-ACT program isn't just an interesting activity — it's a rigorously evaluated educational intervention with demonstrated results.

In one study, 667 students were randomized into either an intervention group (that received GENI-ACT training) or a comparison group (that participated in alternative activities) [2]. This experimental design allowed researchers to measure the program's specific impact.

| Outcome Category | Specific Achievements | Evidence |
|---|---|---|
| Student Learning | Increased content knowledge in genomics and bioinformatics | Pre/post surveys showed significant gains [1] |
| Student Engagement | Increased confidence in using scientific tools and processes | Students reported greater comfort with scientific research [1] |
| Career Development | Increased interest in STEM and bioscience careers | Student comments indicated new career considerations [4] |
| Teacher Capacity | 74 teachers trained over 3 years | Expanded reach across multiple schools [2] |
| Research Output | Thousands of student gene annotations | Contributions to genomic database [2] |

[Charts: Student Reach Over Three Years; Program Participation]

Perhaps most importantly, the program has successfully reached nearly 2,000 high school students over three years through various activities, with 343 students participating in the intensive GENI-ACT annotation research [2].

Conclusion: The Future of Science Education Is Here

The success of the GENI-ACT program in high schools demonstrates that young students are fully capable of participating in authentic scientific research when given proper tools and guidance. As the program evaluation concluded, "high school students are capable of using the same tools as scientists to conduct a real-world research task" [1].

This initiative is more than an engaging classroom activity: it is a pipeline for developing the next generation of scientists, doctors, and informed citizens who will need genomic literacy to navigate the future of healthcare and biotechnology.

"The Western New York Genetics was a blast, and I would gladly recommend this program to any high school students partaking in STEM careers" [4].

In classrooms where DNA sequences become puzzles to solve rather than facts to memorize, science education shifts from rote learning to authentic discovery.

References