Identifying piRNA targets on mRNAs in C. elegans using a deep multi-head attention network
Related paper:
Tzu-Hsien Yang, Sheng-Cian Shiue, Kuan-Yu Chen, Yan-Yuan Tseng and Wei-Sheng Wu*, "Identifying piRNA targets on mRNAs in C. elegans using a deep multi-head attention network". (Submitting)
Available Datasets:
- The positive and the negative sets from the wild-type CLASH data
- The positive and the negative sets from CSR-1 depleted CLASH data
Model Codes for the Proposed Method
Suggested running environments: Linux Ubuntu 14.04, Python 3.8.0
Steps to use the codes: (Also found in the README.txt file)
- Download the codes from the following link: Download.
- Unzip the file:
  unzip predict_code.zip
- Change the working directory into the model codes:
  cd predict_code
- Install the necessary packages:
  pip install -r requirements.txt
- Prepare the piRNA-mRNA sequence pairs in the folder named "Input".
  Multiple piRNA-mRNA pairs are allowed in the same input file, one pair per line. The input format should follow this pattern:
  <piRNA sequence>,<mRNA sequence>
  Note: The input piRNA sequences and mRNA sequences should all start from the 5' end.
  Example:
  TGGTGGATCGTCATTTGGTGG,CGAACACCCACCACCTTGTGAACCGCCTTGT
  TTCTCATCCGGTCCAAGAGGT,TAGACCTCGACGGACTTACTGGCGACCTTCC
- Predict the binding probability of the given piRNA sequences and mRNA sequence segments:
  python main.py --input_file_name <Input_file_name>
- Output: The binding probability for each pair will be written to 'Output_<Input_file_name>.csv'.
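For users who build the input file from a script, the input format described in the steps above can be sketched in Python as follows. This is only an illustrative helper and is not part of the released codes: the function name write_pairs, the restriction to the letters A, C, G and T (taken from the example sequences), and the conversion to upper case are all assumptions.

```python
import csv
import os

# Assumption: sequences use only the letters seen in the example pairs.
VALID_BASES = set("ACGT")


def write_pairs(pairs, path):
    """Write (piRNA, mRNA) pairs to `path`, one comma-separated pair per line.

    Hypothetical helper: both sequences are expected to be given 5' to 3'.
    Returns the number of pairs written.
    """
    rows = []
    for pirna, mrna in pairs:
        # Assumption: normalize to upper case, as in the example input.
        pirna, mrna = pirna.strip().upper(), mrna.strip().upper()
        for name, seq in (("piRNA", pirna), ("mRNA", mrna)):
            # Reject empty sequences and any character outside VALID_BASES.
            if not seq or set(seq) - VALID_BASES:
                raise ValueError(f"invalid {name} sequence: {seq!r}")
        rows.append((pirna, mrna))

    # Create the target folder (for example "Input") if it does not exist.
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)

    with open(path, "w", newline="") as handle:
        csv.writer(handle, lineterminator="\n").writerows(rows)
    return len(rows)
```

For instance, write_pairs([("TGGTGGATCGTCATTTGGTGG", "CGAACACCCACCACCTTGTGAACCGCCTTGT")], "Input/my_pairs.csv") would create a one-line input file; the file name my_pairs.csv is a placeholder. Validating before writing is a design choice made here so that malformed sequences are caught before main.py is run.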
Demonstrating Example:
python main.py --input_file_name Demo.csv
Results: (written to 'Output_Demo.csv')
If the given piRNA binds to the given mRNA segment:
1
0
1
0