Frequently Asked Questions / Contacts

1. Database query: Gene IDs used in this database (UniProt IDs are allowed for all organisms)

Organism           Allowed entries                Examples
H. sapiens         Entrez gene IDs/Gene names     10092/ARPC5
S. cerevisiae      ORF names/Gene names           YAL041W/CDC24
S. pombe           ORF names/Gene names           SPAC821.12/ORB6
M. musculus        Entrez gene IDs/Gene names     11695/ALX4
D. melanogaster    FlyBase IDs/Gene names         FBGN0011573/11573/CDC37
C. elegans         WormBase IDs/Gene names        WBGENE00000001/1/AAP-1
A. thaliana        Locus Tag/Gene names           AT1G01650/ASK20
B. subtilis        ORF names/Gene names           IPA-13R/SACY
B. taurus          Gene names                     RGS9
E. coli            ORF names/Gene names           B3681/GLVG
R. norvegicus      ORF names/Gene names           RCG_22973/TBP
O. sativa          ORF names/Gene names           OSJ_12311/GF14F

2. Quality control

  • The repository of high-quality interactions contains only interactions from manually validated high-throughput (HT) experiments and interactions from small-scale studies that have been reported at least twice in the literature.
  • Because HT publications are few compared with the vast number of small-scale studies, we manually inspect each HT study. We ensure that the high-quality HT experiments included in HINT have been verified by orthogonal assays (e.g., co-immunoprecipitation). Studies that do not validate their screens are not included.
  • In contrast, because it is impossible to manually check all small-scale studies, we require two independent publications to report the same interaction before it is included in our dataset. Some interactions from dedicated small-scale studies have been validated multiple times within the same publication and are of high quality, but a significant fraction of interactions from small-scale experiments are not easily reproducible. One main reason is that many small-scale publications begin with a large-scale screen (e.g., pull-down mass spectrometry) for their proteins of interest, which often finds dozens to hundreds of interactors. The authors then focus on only one or two of these interactions for detailed study, so the remaining interactions contain many false positives. For interactions reported by only one publication, it is impossible to separate the high-quality ones from the rest. Therefore, to ensure high quality, we include only interactions reported by two independent publications.
  • Protein post-translational modifications (e.g., ubiquitination, sumoylation) are not considered protein-protein interactions.
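
The inclusion rules above can be summarized as a small decision function. The following is only an illustrative sketch and not the code used to build HINT: the `Evidence` record, its field names, and the `validated_ht` flag are hypothetical, and "two independent publications" is approximated here as two distinct PubMed IDs from small-scale studies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Evidence:
    """One publication supporting an interaction (hypothetical record)."""
    pmid: str                    # PubMed ID of the publication
    high_throughput: bool        # True for an HT screen, False for small-scale
    validated_ht: bool = False   # assumed flag: HT study passed manual inspection


def is_high_quality(evidence: Iterable[Evidence]) -> bool:
    """Apply the two inclusion rules described in the bullets above."""
    evidence = list(evidence)

    # Rule 1: supported by at least one manually validated HT experiment.
    if any(e.high_throughput and e.validated_ht for e in evidence):
        return True

    # Rule 2: reported by at least two different small-scale publications.
    # Distinct PMIDs stand in for "independent publications" (an assumption).
    small_scale_pmids = {e.pmid for e in evidence if not e.high_throughput}
    return len(small_scale_pmids) >= 2
```

Under these assumptions, an interaction seen only in an unvalidated HT screen, or in a single small-scale paper, is rejected.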

3. Batch download file format

Each row represents one interaction and contains two UniProt IDs, two gene names, two ORF names (if available), two aliases (if available), and a publication list. Each publication entry consists of a PubMed ID, an evidence code, and a flag indicating whether the experiment is high-throughput. Multiple entries for the same item are separated by the pipe symbol (“|”).
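
As a rough illustration of reading such a file, the sketch below parses one row into a dictionary. Several details are assumptions not stated above and should be checked against a downloaded file: that columns are tab-separated, that they appear in the order listed in the description, and that the three parts of a publication entry are joined by a colon. Only the pipe separator comes from the description. The function and field names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Publication(NamedTuple):
    pmid: str            # PubMed ID
    evidence_code: str   # evidence (method) code
    throughput: str      # high-throughput flag, kept as the raw string


def split_multi(field: str) -> List[str]:
    """Split a pipe-separated field, dropping empty items."""
    return [item for item in field.split("|") if item]


def parse_row(line: str, column_sep: str = "\t", pub_field_sep: str = ":") -> Dict:
    """Parse one interaction row (separators and column order are assumed)."""
    cols = line.rstrip("\n").split(column_sep)
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")

    publications = []
    for entry in split_multi(cols[8]):
        parts = entry.split(pub_field_sep)
        if len(parts) != 3:
            raise ValueError(f"unexpected publication entry: {entry!r}")
        publications.append(Publication(*parts))

    return {
        "uniprot": (cols[0], cols[1]),
        "gene": (cols[2], cols[3]),
        "orf": (split_multi(cols[4]), split_multi(cols[5])),      # may be empty
        "alias": (split_multi(cols[6]), split_multi(cols[7])),    # may be empty
        "publications": publications,
    }


# Made-up row for illustration only; these are not real HINT records.
example = "P00001\tP00002\tGENEA\tGENEB\t\t\tA1|A2\t\t111:0018:HT|222:0004:LC"
row = parse_row(example)
```

If the real file uses different separators, only the two keyword arguments of `parse_row` need to change.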

4. Citation

Das J and Yu H. HINT: High-quality protein interactomes and their applications in understanding human disease. BMC Systems Biology, 2012 Jul 30;6(1):92.