BioDArt - Catalogue of biological data artifact examples

Anitha Veeramani, Kavitha Gopalakrishnan, Vladimir Brusic, Judice L.Y. Koh

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Information in biological data repositories continues to grow exponentially due to the increasing number of genomic and proteomic sequencing projects. As with any database, these data repositories are subject to data quality issues related to correctness, uniformity, completeness, and redundancy, among others. Data cleaning is a prerequisite to prevent low-quality data from interfering with the accuracy of data mining and analysis. This in turn involves the detection and resolution of data artifacts (errors, discrepancies, redundancies, ambiguities, and incompleteness). Understanding the causes of data artifacts and systematically classifying them are critical to their elimination from molecular sequence databases. This paper highlights eight data artifacts found among public molecular databases. Examples of major molecular sequence database records containing these artifacts are collected into the BioDArt catalogue (http://antigen.i2r.a-star.edu.sg/BioDArt).

Original language: English
Title of host publication: ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering
Pages: 324-329
Number of pages: 6
DOI: https://doi.org/10.1109/ICBPE.2006.348608
Publication status: Published - Dec 1 2006
Event: ICBPE 2006 - 2006 International Conference on Biomedical and Pharmaceutical Engineering - Singapore, Singapore
Duration: Dec 11 2006 to Dec 14 2006

Publication series

Name: ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering

Other

Other: ICBPE 2006 - 2006 International Conference on Biomedical and Pharmaceutical Engineering
Country: Singapore
City: Singapore
Period: 12/11/06 to 12/14/06

Keywords

  • Data artifacts
  • Data cleaning
  • Data quality

ASJC Scopus subject areas

  • Biomedical Engineering
  • Pharmacology (medical)
  • Pharmacology, Toxicology and Pharmaceutics (all)

Cite this

Veeramani, A., Gopalakrishnan, K., Brusic, V., & Koh, J. L. Y. (2006). BioDArt - Catalogue of biological data artifact examples. In ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering (pp. 324-329). [4155917] (ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering). https://doi.org/10.1109/ICBPE.2006.348608