A pipeline design for downloading and analysing promoter sequences in Solanum lycopersicum
Abstract
A pipeline architecture is implemented to automate the download of gene promoter sequences from the tomato (Solanum lycopersicum) genome annotated in the Sol Genomics Network. The output promoter sequences can then be analyzed with the MEME and TOMTOM programs. The code is available at www.github.com/lalebot/pip-prom-tom, and Git is used for version control. Python threads, regular expressions, and SQLite databases are combined to reduce sequence download time and make efficient use of computing resources. The methodology presented in this work is potentially applicable to other biological fields.
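As an illustration of how threads, regular expressions, and SQLite can be combined in this kind of pipeline, the following is a minimal sketch and not the code from the repository. The gene ID pattern, the table layout, and the names `extract_gene_ids`, `download_promoters`, and `fetch` are all assumptions made for the example; `fetch` stands in for whatever function actually retrieves a promoter sequence for a given gene ID.

```python
import re
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Assumed ID shape, e.g. "Solyc01g005000"; adjust to the annotation release in use.
GENE_ID = re.compile(r"Solyc\d{2}g\d{6}")


def extract_gene_ids(text):
    """Return the unique gene IDs found in free text, in order of appearance."""
    return list(dict.fromkeys(GENE_ID.findall(text)))


def download_promoters(gene_ids, fetch, db_path=":memory:", workers=4):
    """Fetch promoters concurrently and cache them in SQLite.

    `fetch` is a caller-supplied, hypothetical function mapping a gene ID
    to its promoter sequence, e.g. a wrapper around an HTTP request.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS promoter (gene_id TEXT PRIMARY KEY, seq TEXT)"
    )
    # Skip genes already stored, so a rerun only downloads what is missing.
    cached = {row[0] for row in conn.execute("SELECT gene_id FROM promoter")}
    pending = [g for g in gene_ids if g not in cached]

    # Threads overlap the network waits; all database writes stay in this thread.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(zip(pending, pool.map(fetch, pending)))

    conn.executemany("INSERT OR REPLACE INTO promoter VALUES (?, ?)", results)
    conn.commit()
    return conn
```

In this sketch the worker threads only perform the downloads, while the SQLite connection is used from a single thread, which avoids sharing one connection across threads. Passing a file path as `db_path` instead of `":memory:"` would let a later run reuse sequences that were already downloaded.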