Frequently Asked Questions (FAQ)
Find answers to common questions about PIASOmarkerDB and how to use it effectively.
What is PIASOmarkerDB?
PIASOmarkerDB is a comprehensive database of cell type marker genes with their specificity scores across various tissues, species, studies and conditions. It is powered by PIASO (Precise Integrative Analysis of Single-cell Omics) methodologies, providing standardized measures for comparing marker gene expression specificity.
What is a specificity score?
The specificity score in PIASOmarkerDB measures how specific a gene's expression is to a particular cell type compared to other cell types collected in the same study. Higher scores indicate greater specificity to that cell type. The score is calculated using PIASO methods, which are designed to identify cell type-specific markers accurately and efficiently.
What is PIASO?
PIASO (Precise Integrative Analysis of Single-cell Omics) is a Python toolkit for single-cell and spatial omics data analysis. Learn more at https://piaso.org.
How is the data in PIASOmarkerDB curated?
The database aggregates marker gene data from multiple published studies and datasets. Each entry includes the cell type, gene, specificity score, study/publication information, and contextual metadata such as species, tissue, and experimental condition.
How do I browse cell type contexts?
Click on "Browse Cell Types" in the navigation menu. You'll see a table listing all unique cell type contexts in the database, including cell type name, study, species, tissue, condition, and average specificity score.
What is a "cell type context"?
A cell type context is a unique combination of cell type, study, species, tissue, and condition. The same cell type (e.g., "Microglia") may appear multiple times if it's from different studies, species, or tissues.
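One way to picture a context is as a five-part key. The sketch below is illustrative only: the class name, field names, and example values are assumptions, not the database's actual schema.

```python
from typing import NamedTuple


class CellTypeContext(NamedTuple):
    """Hypothetical key type with one field per component of a context."""
    cell_type: str
    study: str
    species: str
    tissue: str
    condition: str


# Illustrative values only; these are not real database entries.
a = CellTypeContext("Microglia", "StudyA", "Mouse", "Cortex", "Control")
b = CellTypeContext("Microglia", "StudyB", "Human", "Cortex", "Control")

# Same cell type name, but two distinct contexts because study and species differ.
contexts = {a, b}
print(len(contexts))  # 2
```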
What is the Average Specificity score?
The Average Specificity is calculated from the top 30 highest-scoring marker genes for that cell type context. It provides a summary measure of how well-defined the markers are for that specific context.
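As a rough sketch of that calculation, assuming the summary is a plain arithmetic mean and that contexts with fewer than 30 markers simply average what they have (the FAQ specifies neither detail), with a hypothetical function name:

```python
def average_specificity(scores, top_n=30):
    """Mean of the top_n highest scores, or of all scores if fewer exist.

    Assumption: a plain arithmetic mean; the database may compute it differently.
    """
    top = sorted(scores, reverse=True)[:top_n]
    if not top:
        raise ValueError("at least one score is required")
    return sum(top) / len(top)


# Small demonstration with top_n=2 so the arithmetic is easy to check by hand.
example = average_specificity([5.0, 3.0, 1.0], top_n=2)
print(example)  # 4.0
```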
How do I view all markers for a cell type?
Click the "View Markers" button next to any cell type context in the Browse Cell Types page. This will redirect you to the Search page with filters pre-filled to show all markers for that specific context.
What is the Cell Type Map view?
The Cell Type Map is an interactive 2D visualization showing all cell types positioned based on their marker gene similarities. Cell types with similar expression profiles appear closer together. You can:
- Switch views: Toggle between Table View and Map View using the buttons at the top
- Color by: Change dot colors based on Specificity, Species, Tissue, Study, Condition, Cell Type Group, or Leiden Cluster
- Size by: Set dot sizes based on Specificity score or use uniform sizing
- Show Labels: When viewing by Cell Type Group or Leiden Cluster, enable labels to see cluster names directly on the map
- Filter: Use the dropdown filters to focus on specific subsets of cell types (non-matching cells appear in light purple)
- Select: Click on any cell type dot to select it (up to 3 selections). Selected cell types show their top 3 marker genes
- Interact: Hover over dots for details, use scroll/drag to zoom and pan
- Export: Download the visualization as PNG or PDF
What are Cell Type Groups and Leiden Clusters?
Cell Type Groups are curated annotations that categorize cell types into broader functional groups (e.g., "Astrocytes", "Hindbrain Neurons", "Striatal & Limbic Neurons"). Leiden Clusters are automatically computed clusters based on marker gene similarity using the Leiden community detection algorithm. Both have approximately 80 categories. When you color by these options, labels appear directly on the map at each cluster's center.
How are cell types positioned in the Map view?
Cell type positions are computed using UMAP (Uniform Manifold Approximation and Projection) on their marker gene specificity profiles. This technique preserves local relationships, so cell types with similar marker gene signatures cluster together. The coordinates are pre-computed and embedded in the data file.
What is the Cell Type Group column in the table?
The Cell Type Group column shows the curated functional group annotation for each cell type context. This helps you quickly identify related cell types and understand their biological categorization.
How do I search for specific markers?
Navigate to the "Search" page and use the filter dropdowns to narrow down results by:
- Gene: Search for a specific gene symbol
- Cell Type: Filter by cell type name
- Study/Publication: Filter by study identifier
- Species: Filter by species (e.g., Human, Mouse)
- Tissue: Filter by tissue type
- Condition: Filter by experimental condition
- Min. Specificity Score: Set a minimum threshold for specificity
You can combine multiple filters to narrow your search.
Why don't I see any results?
If no results appear, try:
- Using fewer filters to broaden your search
- Checking your gene symbol spelling (use uppercase)
- Lowering the minimum specificity score threshold
- Verifying that the combination of filters exists in the database
Can I download search results?
Yes! After performing a search, click the "Download Results" button at the top-right of the page to export your results as a CSV file.
How do I copy the top genes from search results?
Click the "Copy Top 50 Genes" button next to "Download Results". This copies the top 50 gene names from your current search results to your clipboard in a comma-separated format (e.g., "Gene1, Gene2, Gene3, ..."). This is useful for quickly transferring gene lists to other tools or the Analyze Genes feature.
What does the Analyze Genes tool do?
The Analyze Genes tool helps you infer potential cell types from a list of genes. Enter a comma-separated list of gene symbols, and the tool will search for these genes in the database and summarize potential cell type contexts based on matching genes and their specificity scores.
How do I use the Analyze Genes feature?
- Navigate to the "Analyze Genes" page
- Enter a comma-separated list of gene symbols in the text box
- Optionally, add exclude filters to remove unwanted results
- Click "Analyze Genes"
- Review the ranked results showing potential cell type contexts
Tip: Try the example gene lists (Example 1, 2, or 3) to see how the tool works!
How are results ranked?
Results are ranked by:
- Gene Count: Number of your input genes that match markers for that cell type context
- Average Specificity Score: Average specificity of the matched genes
Cell type contexts with more matched genes and higher specificity scores appear at the top.
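The ranking can be sketched as a two-key sort. This is a minimal illustration, assuming gene count is the primary key and average specificity breaks ties (the FAQ names both criteria but not how they combine); the function name and the marker tuples are made up for the example.

```python
from collections import defaultdict


def rank_contexts(markers, query_genes):
    """Rank contexts by matched gene count, then by mean score of matched genes.

    markers: iterable of (context, gene, score) tuples (hypothetical layout).
    """
    query = {g.upper() for g in query_genes}
    matched = defaultdict(dict)
    for context, gene, score in markers:
        if gene.upper() in query:
            matched[context][gene.upper()] = score
    summary = [
        (context, len(genes), sum(genes.values()) / len(genes))
        for context, genes in matched.items()
    ]
    # Assumption: sort by count first, average score second, both descending.
    return sorted(summary, key=lambda row: (row[1], row[2]), reverse=True)


# Made-up scores for illustration; not real database values.
markers = [
    ("L5 IT", "Fezf2", 3.0),
    ("L5 IT", "Bcl11b", 2.0),
    ("L6 CT", "Tbr1", 4.0),
    ("L6 CT", "Foxp2", 1.0),
    ("L2/3 IT", "Satb2", 5.0),
]
ranking = rank_contexts(markers, ["Fezf2", "Bcl11b", "Tbr1", "Satb2"])
for context, gene_count, avg_score in ranking:
    print(context, gene_count, avg_score)
```

Here "L5 IT" ranks first because two input genes match it, even though the single matches for the other two contexts have higher scores.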
What are Exclude Filters?
Exclude filters let you remove unwanted results based on cell type name, study, species, or tissue. For example, if you only want human results, you can exclude "Mouse" from the species field.
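Conceptually, an exclude filter drops every result whose field matches an excluded value. The sketch below assumes exact, case-insensitive matching; the website may match differently (for example, by substring), and the function and field names are hypothetical.

```python
def apply_exclude_filters(rows, exclude):
    """Return rows that match none of the excluded values.

    rows: list of dicts; exclude: dict mapping field name to excluded values.
    Assumption: exact, case-insensitive comparison.
    """
    lowered = {
        field: {value.lower() for value in values}
        for field, values in exclude.items()
    }

    def is_excluded(row):
        return any(
            row.get(field, "").lower() in values
            for field, values in lowered.items()
        )

    return [row for row in rows if not is_excluded(row)]


# Illustrative rows only.
rows = [
    {"cell_type": "Microglia", "species": "Mouse"},
    {"cell_type": "Microglia", "species": "Human"},
]
kept = apply_exclude_filters(rows, {"species": ["mouse"]})
print(kept)  # only the Human row remains
```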
Can I click on cell types in the results?
Yes! Cell type names in the results table are clickable. Clicking a cell type will redirect you to the Search page with filters pre-filled to show all markers for that specific cell type context.
What gene symbols should I use?
Use standard gene symbols (e.g., HUGO symbols for human genes). Gene symbols are case-insensitive but will be converted to uppercase for searching. Separate multiple genes with commas, spaces, semicolons, or line breaks.
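The input handling described above can be approximated as follows. This is a sketch, not the site's actual parser: the function name is hypothetical, and removing duplicates is an added assumption.

```python
import re


def parse_gene_symbols(text):
    """Split on commas, semicolons, or whitespace and uppercase each symbol.

    Assumption: duplicates are dropped, keeping first-seen order.
    """
    symbols = []
    for token in re.split(r"[,;\s]+", text.strip()):
        symbol = token.upper()
        if symbol and symbol not in symbols:
            symbols.append(symbol)
    return symbols


parsed = parse_gene_symbols("Fezf2, satb2;Tbr1\nBcl11b  fezf2")
print(parsed)  # ['FEZF2', 'SATB2', 'TBR1', 'BCL11B']
```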
Can I download analysis results?
Yes! Click the "Download Results" button after analyzing your gene list to export the results as a CSV file.
What does the Cell Type Map show in the Analyze page?
When you analyze genes, the Cell Type Map shows all cell types, with those matching your input genes highlighted. Non-matching cell types appear in light purple. You can:
- Color By: Specificity of matched genes, Species, Tissue, Study, Cell Type Group, or Leiden Cluster
- Size By: Gene Count (number of matched genes), Specificity, or Uniform
- Filter: Set minimum matched gene count and specificity thresholds
- Show Labels: Toggle cluster labels on the map when viewing by Cell Type Group or Leiden Cluster
- Select: Click on cell type dots to see detailed information and all matched marker genes
What are the matched genes shown in selection cards?
When you select a cell type on the Analyze map, the selection card shows "All Matched Genes" - these are all genes from your input list that are markers for that cell type, along with their specificity scores. This helps you understand which of your input genes contributed to the match.
Is there a Python API available?
Yes! PIASOmarkerDB is integrated into the PIASO toolkit. You can query markers, analyze gene lists, and annotate cell types programmatically.
Installation
pip install piaso-tools
Basic Usage - Query by Gene
import piaso
# Query marker genes for a single gene
marker_df = piaso.tl.queryPIASOmarkerDB(gene="Fezf2")
marker_df.head(10)
Query Multiple Genes
# Query multiple genes at once
marker_df = piaso.tl.queryPIASOmarkerDB(gene=["Fezf2", "Satb2", "Tbr1"])
marker_df.head(10)
Query with Filters
# Query with species and score filters
marker_df = piaso.tl.queryPIASOmarkerDB(
    gene="Fezf2",
    species="Mouse",
    min_score=2.0
)
# Get both DataFrame and marker dictionary
marker_df, marker_dict = piaso.tl.queryPIASOmarkerDB(
    study="AllenWholeMouseBrain_isocortex",
    species="Mouse",
    as_dict=True
)
# marker_dict = {'L5 IT': ['Fezf2', 'Bcl11b', ...], 'L2/3 IT': ['Satb2', ...], ...}
List Available Studies
# List all available studies
studies = piaso.tl.queryPIASOmarkerDB(list_studies=True)
print(f"Total studies: {len(studies)}")
# List studies for a specific species
mouse_studies = piaso.tl.queryPIASOmarkerDB(list_studies=True, species="Mouse")
Analyze Gene Lists
Analyze a list of genes to infer potential cell types:
import piaso
# Single gene list - cortical neuron markers
query_genes = ["Fezf2", "Satb2", "Tbr1", "Bcl11b", "Cux2", "Foxp2"]
marker_df = piaso.tl.analyzeMarkers(query_genes)
print(marker_df.head())
# Multiple gene sets (e.g., from clustering)
gene_sets = {
    'L5_IT': ['Fezf2', 'Bcl11b', 'Crym', 'Pcp4'],
    'L2/3_IT': ['Satb2', 'Cux2', 'Mdga1', 'Rasgrf2'],
    'L6_CT': ['Tbr1', 'Foxp2', 'Syt6', 'Tle4'],
}
results, top_hits = piaso.tl.analyzeMarkers(gene_sets)
print(top_hits)
# {'L5_IT': 'L5 IT CTX', 'L2/3_IT': 'L2/3 IT CTX', 'L6_CT': 'L6 CT CTX'}
Filter by Specific Studies
You can filter analysis results by specific studies:
import piaso
# Analyze using only specific mouse brain study
results, top_hits = piaso.tl.analyzeMarkers(
    gene_sets,
    species="Mouse",
    studies="AllenWholeMouseBrain_isocortex"
)
# Or multiple studies
results, top_hits = piaso.tl.analyzeMarkers(
    gene_sets,
    species="Mouse",
    studies=["AllenWholeMouseBrain_isocortex", "DiBellaArlotta2021"]
)
# Invalid study names raise an error with instructions
try:
    results, top_hits = piaso.tl.analyzeMarkers(gene_sets, studies="InvalidStudy")
except Exception as e:
    print(e)
    # Study 'InvalidStudy' not found in PIASOmarkerDB...
COSG Integration
Integrate with COSG for automatic cluster annotation:
import cosg
import pandas as pd
import piaso
# Run COSG on your clustered data
cosg.cosg(adata, key_added='cosg', groupby='CellTypes')
# Get COSG results as DataFrame
cosg_df = pd.DataFrame(adata.uns['cosg']['names'])
# Analyze with PIASOmarkerDB
results, top_hits = piaso.tl.analyzeMarkers(
    cosg_df,
    n_top_genes=50,  # Use top 50 markers per cluster
    species="Mouse",
    studies="AllenWholeMouseBrain_isocortex"
)
# Print cell type predictions
for cluster, cell_type in top_hits.items():
    print(f"{cluster}: {cell_type}")
# Add annotations to AnnData
adata.obs['Tophits_piasomarkerdb'] = adata.obs['CellTypes'].map(top_hits)
# Note: "Unassigned" is returned when no matches are found
Available Functions
| Function | Description |
|---|---|
| piaso.tl.queryPIASOmarkerDB() | Main query function for markers, studies, cell types, genes |
| piaso.tl.getMarkers() | Alias for queryPIASOmarkerDB() |
| piaso.tl.analyzeMarkers() | Analyze gene lists for cell type inference (supports studies filter) |
| piaso.tl.PIASOmarkerDB() | Client class for advanced usage |
REST API Endpoints
For direct HTTP access, the following endpoints are available:
- GET /api/v1/markers - Query marker genes with filters
- GET /api/v1/cell-types - Get all cell type contexts
- GET /api/v1/genes - Get all unique genes
- GET /api/v1/studies - Get all studies with metadata
- GET /api/v1/study-sources - Get study DOI/publication links
API Query Parameters
The /api/v1/markers endpoint supports these parameters:
- gene - Filter by gene symbol (comma-separated for multiple)
- cell_type - Filter by cell type (comma-separated for multiple)
- study - Filter by study/publication
- species - Filter by species
- tissue - Filter by tissue
- condition - Filter by condition
- min_score - Minimum specificity score
- max_score - Maximum specificity score
- limit - Maximum results (default: no limit, max: 100000 when specified)
- offset - Pagination offset
- format - Response format ('json' or 'csv')
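To illustrate how these parameters combine, the sketch below builds a query URL for the markers endpoint using only the standard library. The base URL is a placeholder (the FAQ does not state the API host), the helper name is hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

# Placeholder host: replace with the actual PIASOmarkerDB server address.
BASE_URL = "https://example.org"


def build_markers_url(base_url=BASE_URL, **params):
    """Build a GET URL for /api/v1/markers from keyword parameters.

    Lists are joined with commas, matching the comma-separated form
    described for the gene and cell_type parameters.
    """
    clean = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        clean[key] = value
    query = urlencode(clean, safe=",")  # keep commas readable in the URL
    return f"{base_url}/api/v1/markers" + (f"?{query}" if query else "")


url = build_markers_url(
    gene=["Fezf2", "Satb2"],
    species="Mouse",
    min_score=2.0,
    limit=100,
    offset=0,
    format="csv",
)
print(url)
```

Increasing offset by the limit value on each call would page through a large result set, assuming the server applies limit and offset in the usual way.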
Can I download the entire database?
Yes! Navigate to the "Download" page and click "Download All Data (CSV)" to export the complete PIASOmarkerDB dataset as a CSV file.
What format is the download?
All downloads are in CSV (Comma Separated Values) format, which can be opened in Excel, R, Python, or any spreadsheet/data analysis software.
What columns are included in the download?
Downloaded data includes:
- Cell Type
- Gene Symbol
- Specificity Score
- Study/Publication
- Species
- Tissue
- Condition
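A downloaded file can be read with Python's standard csv module. In this sketch the header names and sample rows are assumptions made for illustration; check the first line of your actual download for the real column labels.

```python
import csv
import io

# Hypothetical header names: the FAQ lists the columns, not their exact labels.
SAMPLE_CSV = """cell_type,gene,specificity_score,study,species,tissue,condition
ExampleCellA,GeneA,3.5,ExampleStudy,Mouse,Cortex,Control
ExampleCellA,GeneB,1.2,ExampleStudy,Mouse,Cortex,Control
ExampleCellB,GeneC,2.8,ExampleStudy,Mouse,Cortex,Control
"""


def load_markers(handle, min_score=0.0):
    """Read marker rows from a CSV handle, keeping those at or above min_score."""
    rows = []
    for row in csv.DictReader(handle):
        row["specificity_score"] = float(row["specificity_score"])
        if row["specificity_score"] >= min_score:
            rows.append(row)
    return rows


# For a real file, pass open("your_download.csv", newline="") instead.
strong = load_markers(io.StringIO(SAMPLE_CSV), min_score=2.0)
print([row["gene"] for row in strong])  # ['GeneA', 'GeneC']
```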
Where does the data come from?
PIASOmarkerDB aggregates cell type marker gene data from published single-cell and spatial transcriptomics studies. Each study is linked to its original publication for reference and citation.
Study Publications
The following table lists all studies currently included in PIASOmarkerDB with their publication links:
How to cite data sources?
When using marker data from PIASOmarkerDB, please cite both PIASOmarkerDB and the original study publication. The study name is displayed alongside each marker in the database, and clicking on it will take you to the original publication.
Request to add a study
If you would like to contribute marker gene data from a published study, or request that a specific study be added to PIASOmarkerDB, please contact dai@broadinstitute.org.
What browsers are supported?
PIASOmarkerDB works best on modern browsers including Chrome, Firefox, Safari, and Edge (latest versions).
What technology stack is used?
PIASOmarkerDB is built with:
- Backend: Python/Flask with PostgreSQL database
- Frontend: Bootstrap 5, jQuery, DataTables
- API: RESTful JSON API
How often is the database updated?
The database is updated periodically as new studies and datasets become available. Check the home page for the most recent statistics.
Can I contribute data to PIASOmarkerDB?
If you have marker gene data from published studies that you'd like to contribute, please contact Min Dai at dai@broadinstitute.org.
Who maintains PIASOmarkerDB?
PIASOmarkerDB is developed and maintained by the Gord Fishell Laboratory at Harvard Medical School and Broad Institute of MIT and Harvard.
How do I cite PIASOmarkerDB?
If you use PIASOmarkerDB in your research, please cite:
- PIASOmarkerDB website: https://piaso.org/piasomarkerdb
- PIASO toolkit: https://piaso.org
- Reference: Wu, S.J., Dai, M. et al. Pyramidal neurons proportionately alter cortical interneuron subtypes. Nature (2026). https://doi.org/10.1038/s41586-025-09996-8
Formal PIASOmarkerDB publication and citation information will be added when available.
How can I get help or report issues?
For questions, issues, or feedback, please contact:
Min Dai - dai@broadinstitute.org
Still have questions?
If you can't find the answer you're looking for, please contact us at dai@broadinstitute.org.