Table 3 Random forest analysis of plasticity in seminal fluid composition

From: Sperm competition risk drives plasticity in seminal fluid composition

Protein Name Importance Importance 95% CI t-test p t-test q
P18242 Cathepsin D 46.398 45.14 - 47.63 0.000 0.002
Q64356 SVS 6 30.231 28.99 - 31.76 0.003 0.038
P48036 Annexin A5 28.445 26.60 - 29.61 0.004 0.038
Q61400 CEACAM 10 12.745 10.64 - 14.67 0.019 0.088
P09411 Phosphoglycerate kinase 12.138 10.16 - 13.86 0.025 0.096
P30933 SVS 5 10.058 7.95 - 12.80 0.014 0.084
O08709 Peroxiredoxin-6 9.683 7.75 - 12.02 0.011 0.082
Q09098 PATE 4 / SVS 7 6.589 4.52 - 8.51 0.033 0.114
Q8CEK3 Spikl 6.043 3.56 - 8.09 0.020 0.088
P07759 Serine protease inhibitor A3K 5.887 3.21 - 7.92 0.062 0.138
P21460 Cystatin-C 4.772 2.65 - 7.28 0.061 0.138
P45376 Aldose reductase 4.673 2.19 - 6.88 0.717 0.793
P12032 Metalloproteinase inhibitor 1 2.288 −0.17 - 4.98 0.160 0.261
Q8BZH1 Transglutaminase 4 2.171 −0.16 - 4.01 0.992 0.992
P01887 Beta-2-microglobulin 2.140 −0.15 - 4.67 0.040 0.123
Q6WIZ7 SVS 1 1.240 −1.16 - 3.92 0.052 0.135
Q3SXH3 SVA 0.291 −1.48 - 2.25 0.111 0.214
P35700 Peroxiredoxin-1 0.199 −2.11 - 2.15 0.649 0.775
P18419 SVS 4 −0.169 −2.44 - 2.50 0.050 0.135
Q62216 SVS 2 −0.467 −2.70 - 2.07 0.070 0.144
Q9QY48 Deoxyribonuclease-2-beta −0.644 −2.24 - 0.99 0.125 0.228
Q8VI13 SVS 3 −1.856 −3.77 - 0.16 0.135 0.232
Q01768 Nucleoside diphosphate kinase B −1.949 −3.71 - 0.30 0.314 0.482
P81117 Nucleobindin-2 −2.434 −4.80 - 0.14 0.419 0.564
P20029 78 kDa glucose-regulated protein −2.990 −5.36 - −1.08 0.446 0.576
P14152 Malate dehydrogenase −3.549 −5.83 - −1.85 0.822 0.879
Q8BND Sulfhydryl oxidase −4.700 −6.64 - −2.85 0.327 0.482
P08228 Superoxide dismutase −5.600 −7.48 - −3.23 0.370 0.522
Q07235 Glia-derived nexin −5.685 −7.88 - −3.58 0.650 0.775
P07724 Serum albumin −5.803 −8.10 - −3.49 0.913 0.943
P09036 Spink 3 −6.798 −8.7 - −5.00 0.703 0.793
  1. The table lists variable importance scores for each secreted seminal vesicle protein from the random forest (RF) model used to classify samples according to the sperm competition risk experienced by subject males. Proteins that differ between treatments contribute more to the accurate classification of samples in RF models and therefore have higher variable importance scores. Proteins are arranged in descending order of variable importance. The first seven proteins listed are defined as important for classifying samples according to the sperm competition risk treatment: their scores exceed 6.798, the absolute value of the lowest variable importance score among all proteins. The results of multiple t-tests are provided for comparison with the RF results. P-values from the t-tests were corrected for multiple comparisons by controlling the false discovery rate (FDR) with the Benjamini-Hochberg method, yielding q-values.
  2. Abbreviations: SVS, seminal vesicle secretory protein; CEACAM, carcinoembryonic antigen-related cell adhesion molecule; Spikl, serine protease inhibitor Kazal-like; PATE4, prostate and testes expressed protein; Spink, serine protease inhibitor Kazal-type; RF, random forest; FDR, false discovery rate.
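
The two rules described in the table notes, the importance cutoff and the Benjamini-Hochberg correction, can be illustrated with a minimal Python sketch. This is not the authors' code: the names `importance`, `select_important` and `benjamini_hochberg` are hypothetical, and the scores are simply copied from the table (accessions as printed). Because the table reports p-values rounded to three decimals, applying the correction to those rounded values would not be expected to reproduce the published q-values exactly.

```python
# Variable importance scores copied from the table (accession -> score).
importance = {
    "P18242": 46.398, "Q64356": 30.231, "P48036": 28.445, "Q61400": 12.745,
    "P09411": 12.138, "P30933": 10.058, "O08709": 9.683, "Q09098": 6.589,
    "Q8CEK3": 6.043, "P07759": 5.887, "P21460": 4.772, "P45376": 4.673,
    "P12032": 2.288, "Q8BZH1": 2.171, "P01887": 2.140, "Q6WIZ7": 1.240,
    "Q3SXH3": 0.291, "P35700": 0.199, "P18419": -0.169, "Q62216": -0.467,
    "Q9QY48": -0.644, "Q8VI13": -1.856, "Q01768": -1.949, "P81117": -2.434,
    "P20029": -2.990, "P14152": -3.549, "Q8BND": -4.700, "P08228": -5.600,
    "Q07235": -5.685, "P07724": -5.803, "P09036": -6.798,
}


def select_important(scores):
    """Return (threshold, names) where names score above |lowest score|."""
    threshold = abs(min(scores.values()))
    names = [name for name, score in scores.items() if score > threshold]
    return threshold, names


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


threshold, important = select_important(importance)
# threshold is 6.798 and `important` holds the first seven accessions,
# P18242 (Cathepsin D) through O08709 (Peroxiredoxin-6).

# Toy p-values (not from the table) to show the correction.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Using the absolute value of the most negative score as the cutoff treats negative importance as an estimate of noise: a protein counts as important only if its score is larger than the largest apparent "anti-importance" seen in the same model.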