Identifying piRNA targets on mRNAs in C. elegans using a deep multi-head attention network


Related paper:

Tzu-Hsien Yang, Sheng-Cian Shiue, Kuan-Yu Chen, Yan-Yuan Tseng and Wei-Sheng Wu*, "Identifying piRNA targets on mRNAs in C. elegans using a deep multi-head attention network". (Submitting)


Available Datasets:

  1. The positive and the negative sets from the wild-type CLASH data
  2. The positive and the negative sets from CSR-1 depleted CLASH data

Model Code for the Proposed Method

Suggested running environments: Linux Ubuntu 14.04, Python 3.8.0

Steps to use the code: (Also found in the README.txt file)
  1. Download the code from the following link: Download.
  2. Unzip the file:
    unzip predict_code.zip
  3. Change the working directory to the model code folder:
    cd predict_code
  4. Install the necessary packages:
    pip install -r requirements.txt
  5. Prepare the piRNA-mRNA sequence pairs in the folder named "Input".
    Multiple piRNA-mRNA pairs are allowed in the same input file, one pair per line. Each line should follow this format:
    <piRNA sequence>,<mRNA sequence>
    Note: The input piRNA sequences and mRNA sequences should all start from the 5' end.
    Examples:
    TGGTGGATCGTCATTTGGTGG,CGAACACCCACCACCTTGTGAACCGCCTTGT
    TTCTCATCCGGTCCAAGAGGT,TAGACCTCGACGGACTTACTGGCGACCTTCC
  6. Predict the binding probability of the given piRNA sequences and mRNA sequence segments:
    python main.py --input_file_name <Input_file_name>
  7. Output: The probability of a binding event for each pair will be written to 'Output_<Input_file_name>.csv'.
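
Step 5 above can be scripted. The following is a minimal sketch, not part of the distributed code, that checks each pair and writes it in the <piRNA sequence>,<mRNA sequence> format. It assumes that sequences use only the letters A, C, G and T (as in the examples above) and that the input file has no header row; neither is stated explicitly here, so adjust if main.py expects otherwise. The helper names validate_pair and write_pairs and the file name my_pairs.csv are hypothetical.

```python
import csv
import os

# Assumption: sequences use the DNA alphabet, as in the example pairs.
VALID_BASES = set("ACGT")


def validate_pair(pirna, mrna):
    """Return the upper-cased (piRNA, mRNA) pair, or raise ValueError."""
    cleaned = []
    for label, seq in (("piRNA", pirna), ("mRNA", mrna)):
        seq = seq.strip().upper()
        if not seq or set(seq) - VALID_BASES:
            raise ValueError(f"{label} sequence is empty or has non-ACGT characters: {seq!r}")
        cleaned.append(seq)
    return tuple(cleaned)


def write_pairs(pairs, path):
    """Write one '<piRNA sequence>,<mRNA sequence>' line per pair (no header row)."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for pirna, mrna in pairs:
            writer.writerow(validate_pair(pirna, mrna))


if __name__ == "__main__":
    # Both sequences are given 5' to 3', as required by the note in step 5.
    example_pairs = [
        ("TGGTGGATCGTCATTTGGTGG", "CGAACACCCACCACCTTGTGAACCGCCTTGT"),
        ("TTCTCATCCGGTCCAAGAGGT", "TAGACCTCGACGGACTTACTGGCGACCTTCC"),
    ]
    write_pairs(example_pairs, os.path.join("Input", "my_pairs.csv"))
```

Run from inside predict_code, this creates Input/my_pairs.csv, which would then be passed to main.py as in step 6.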
Demonstration Example:
python main.py --input_file_name Demo.csv
Results: (written to 'Output_Demo.csv')
If the given piRNA binds to the given mRNA segment:
1
0