Need help parsing a text file

Status
Not open for further replies.
I have very little coding skill, so I can't write a program for this. I have looked for software, but couldn't find any (for OS X).

Anyway, here's my problem. I have a bunch (~1000) of DNA sequences that look similar to this:

DNA Sequence said:
GTTAAAGACGTGATCAAGGGTCTGGCCTGCGCAGAGCCCGAACTCGTTGAGATTGATACCGACATTCTTATGGTCGGTGGTGGTATGGGTAACTGCGGTACTGCTTTTGAAGCAGTGCGCTGGGCCGACAAAGTTGGCGGCGACATCAAGATCCTCTTGGTTGACAAGGCTGCTGTCGATCGCGGCGGCGCGGTTGCTCAGGGTCTTTCCGCCATCAACACCTATATCGGCGAGAACGATGTTGACGACTATGTCCGCATGGTCCGCACCGACCTCATGGGTATTGTTCGCGAAGACTTGATCTACGACCTTGGCCGCCACGTGGATGACTCCGTTCATCTGTTCGAAGAATGGGGCCTGCCTTGCTGGGTCAAGAAAGACGGCAAGAACCTGGACGGCGCTCA
I also have a list of primers (short sequences of DNA that bind to the sequences above) that I need to search for in the DNA sequences.

Right now, I am just using the search option in a text editor (TextWrangler) to search for each primer one at a time. It's tedious, to say the least.

I need to know what percentage of the sequences in the DNA sequence file each primer matches.

If anyone knows a way to batch search or if there is a program that can do this sort of thing, I would appreciate it!

Ex.

PrimerA
ACTGACTGACTGACTGAC - matched 343/1062 sequences

PrimerB
GTCAGTCAGTCAGTCAC - matched 10/1062 sequences
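
For anyone comfortable running a small script, here is a minimal Python sketch of the kind of batch search described above. It assumes the sequences are available one per line and the primers as name/sequence pairs; the function names and the sample data are made up for illustration. It only does exact matching on the strand as written (no reverse complements, no mismatches), so treat it as a starting point rather than a finished tool.

```python
def count_primer_matches(sequences, primers):
    """Return {primer_name: (matched, total)} using exact substring matching.

    sequences: iterable of DNA strings (blank entries are skipped)
    primers:   dict mapping a primer name to its sequence
    """
    # Normalise case and drop blank lines so they don't inflate the total.
    cleaned = [s.strip().upper() for s in sequences if s.strip()]
    total = len(cleaned)
    results = {}
    for name, primer in primers.items():
        primer = primer.strip().upper()
        # Count sequences containing the primer at least once.
        matched = sum(1 for seq in cleaned if primer in seq)
        results[name] = (matched, total)
    return results


def format_report(results, primers):
    """Format results in the 'matched X/Y sequences' style used above."""
    blocks = []
    for name, (matched, total) in results.items():
        pct = 100.0 * matched / total if total else 0.0
        blocks.append(
            f"{name}\n{primers[name]} - matched {matched}/{total} "
            f"sequences ({pct:.1f}%)"
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical sample data; in practice the sequences could be read
    # from a file with open("sequences.txt").read().splitlines().
    sample_sequences = ["GTTAAAGACGTGATCAAGG", "ACTGACTGACTGACTGACTT"]
    sample_primers = {"PrimerA": "ACTGACTGACTGACTGAC", "PrimerB": "GTCAGTCAGTCAGTCAC"}
    print(format_report(count_primer_matches(sample_sequences, sample_primers),
                        sample_primers))
```

A primer that occurs several times in one sequence is still counted once for that sequence, which matches the "matched X/Y sequences" format.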
 
Can't BBEdit do this? BBEdit 7 (the last version I purchased) can do it using the 'Find All...' command (instead of just 'Find...'), and I would assume the most recent version (9) has similar functionality. It's even possible that TextWrangler has this ability (I've never used TW, since I upgraded from BBEdit Lite before [STRIKE]it was discontinued[/STRIKE] TW was even introduced).

On my computer, I could basically have each sequence saved as its own file in a folder (known number of sequences) and then tell BBEdit to batch search that [STRIKE]file[/STRIKE] folder for any particular sequence of characters. It would then search the folder and return the total number of files containing that sequence as well as a short representation of the source file with that instance highlighted. I use it for this function quite often while looking for expressions in a folder full of code snippets.

--Patrick
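
The folder-based search described above can also be approximated outside BBEdit. This is a rough Python sketch, assuming each sequence is saved as its own `.txt` file in one folder; the function name `files_containing` and the `*.txt` pattern are illustrative choices, not anything BBEdit provides.

```python
from pathlib import Path


def files_containing(folder, pattern, glob="*.txt"):
    """Return (names of files containing pattern, total files searched)."""
    pattern = pattern.upper()
    files = sorted(Path(folder).glob(glob))
    hits = []
    for path in files:
        # Remove all whitespace so a match is not missed where a long
        # sequence has been wrapped across several lines.
        text = "".join(path.read_text().split()).upper()
        if pattern in text:
            hits.append(path.name)
    return hits, len(files)
```

Calling `files_containing("sequences", "ACTGACTGACTGACTGAC")` on a hypothetical `sequences` folder returns the list of matching file names and the number of files searched, which gives the same "N of M files" count as the batch search.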
 