DAVID (the Database for Annotation, Visualization and Integrated Discovery) was first released in 2003 and is maintained by the Laboratory of Human Retrovirology and Immunoinformatics (LHRI) at the Frederick National Laboratory for Cancer Research. Its primary goal was to solve a common bottleneck: functional annotation was scattered across many resources. Traditionally, a researcher had to visit a series of separate databases by hand (e.g., GO, KEGG, InterPro) to make sense of a gene list. DAVID aggregated these resources into a single platform.
The most significant milestones were the release of DAVID v6.8 (now the legacy version) and the subsequent upgrade to DAVID 2021, followed by further knowledgebase updates. The newer versions introduced a modernized interface and updated backend databases. The core enrichment statistic has not changed: DAVID still reports the EASE score, a deliberately conservative variant of the one-tailed Fisher's Exact test.
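To make the relationship between the two statistics concrete, here is a minimal sketch in pure Python. It assumes the EASE-style penalty can be modeled as removing one gene from the list hits before taking the one-tailed hypergeometric (Fisher) tail; the function names and the example counts are illustrative, and DAVID's own implementation may differ in detail.

```python
from math import comb

def upper_tail(k, list_total, pop_hits, pop_total):
    """P(X >= k) when list_total genes are drawn from pop_total genes,
    pop_hits of which carry the annotation term (hypergeometric tail)."""
    k = max(k, 0)
    upper = min(list_total, pop_hits)
    favorable = sum(
        comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(pop_total, list_total)

def fisher_and_ease(list_hits, list_total, pop_hits, pop_total):
    """Return (one-tailed Fisher p-value, EASE-style p-value)."""
    fisher = upper_tail(list_hits, list_total, pop_hits, pop_total)
    # Assumption: the EASE penalty is modeled as one fewer list hit.
    ease = upper_tail(list_hits - 1, list_total, pop_hits, pop_total)
    return fisher, ease

# Illustrative counts: 3 of 300 list genes carry a term that
# 40 of 30,000 background genes carry.
fisher_p, ease_p = fisher_and_ease(3, 300, 40, 30000)
print(f"Fisher: {fisher_p:.4f}  EASE-style: {ease_p:.4f}")
```

The penalized value is always at least as large as the plain Fisher value, and the gap is widest for terms supported by only a handful of genes, which is exactly where the extra caution is wanted.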
For the uninitiated, here is a standard workflow for analyzing a list of differentially expressed genes (DEGs) from an RNA-seq experiment.
Step 1: Upload
Navigate to david.ncifcrf.gov. Paste your gene list (e.g., a column of 200 gene symbols) into the upload window. Select the correct identifier type (e.g., "OFFICIAL_GENE_SYMBOL"). Choose the list type ("Gene List").
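Before pasting, it can help to tidy the list so that blank lines, stray whitespace, and repeated symbols do not end up in the upload box. The following is a small sketch of that cleanup; the function name and the sample symbols are illustrative and not part of DAVID.

```python
def clean_gene_list(raw_text):
    """Return unique, non-empty identifiers in first-seen order."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        symbol = line.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            cleaned.append(symbol)
    return cleaned

# Example column copied from a spreadsheet, with a blank line,
# padding, and one duplicate.
raw = "TP53\nBRCA1\n\n  EGFR \nTP53\n"
genes = clean_gene_list(raw)
print("\n".join(genes))  # one identifier per line, ready to paste
```

Case is left untouched on purpose, since symbol capitalization conventions differ between species.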
Step 2: Define Background
You must specify the "background" (or "universe") against which enrichment is measured. For most experiments, the default is the whole genome of your selected species (e.g., Homo sapiens). For custom arrays or targeted sequencing, however, upload a custom background list containing only the genes the platform could have detected. Otherwise, annotation terms that are over-represented on the platform itself can appear enriched in your list, producing false positives.
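A quick sanity check before uploading a custom background is to confirm that every gene in your list is actually part of it. The sketch below does that with plain Python; the panel contents and the function name are hypothetical.

```python
def split_by_background(gene_list, background):
    """Split gene_list into genes present in and absent from the background."""
    background_set = set(background)
    covered = [g for g in gene_list if g in background_set]
    missing = [g for g in gene_list if g not in background_set]
    return covered, missing

# Hypothetical targeted panel used as the custom background.
panel = ["TP53", "BRCA1", "EGFR", "KRAS"]
hits = ["TP53", "KRAS", "MYC"]

covered, missing = split_by_background(hits, panel)
print("in background:", covered)
print("not in background:", missing)
```

Any gene reported as missing points to a mismatch between the list and the chosen universe, often an identifier-type problem, and is worth resolving before running the analysis.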
Step 3: Select Species
Choose your organism (Human, Mouse, Rat, Fly, Yeast, etc.). DAVID supports a wide range of model organisms.
Step 4: Run Functional Annotation Tool
Click "Functional Annotation Tool." A results dashboard will appear. The most important section is Functional Annotation Clustering. Click "Functional Annotation Clustering Report."
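Step 5: Interpret Results
Examine the clusters. The Cluster Enrichment Score is the negative log10 of the geometric mean of the p-values of the terms in the cluster, so a score above 1.3 corresponds to a geometric-mean p-value below 0.05 and is typically considered significant. Scores above 2.0 or 3.0 (geometric-mean p-values below 0.01 or 0.001) indicate progressively stronger enrichment. Click on each cluster to expand it and see the individual annotation terms (GO terms, KEGG pathways, etc.) along with their raw p-values, Bonferroni-corrected p-values, and Benjamini-Hochberg FDR values.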
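The numbers in the clustering report can be reproduced by hand, which is a useful way to build intuition for them. The sketch below assumes the enrichment score is the negative log10 of the geometric mean of a cluster's term p-values, and uses the textbook Bonferroni and Benjamini-Hochberg procedures; the p-values are made up for illustration, and DAVID applies its corrections across all tested terms rather than within a single cluster.

```python
from math import log10

def enrichment_score(p_values):
    """-log10 of the geometric mean of the p-values."""
    return -sum(log10(p) for p in p_values) / len(p_values)

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for the terms of one cluster.
cluster_p = [0.01, 0.05, 0.001]
print(f"enrichment score: {enrichment_score(cluster_p):.2f}")

# Made-up p-values for all tested terms.
all_p = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(all_p))
print("Benjamini-Hochberg:", benjamini_hochberg(all_p))
```

Because the score averages in log space, one very small p-value can lift a cluster above 1.3 even when its other terms are weak, so it is worth expanding a cluster rather than reading the score alone.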
DAVID is not just a single tool; it is an integrated ecosystem of resources. Its power lies in its ability to aggregate over 90 different annotation databases into a single, user-friendly platform. Here are its critical components.