Example queries

Interested in integrating FuncNet queries into your own automated workflows? Read on. If you just want to try a sample session yourself, open the web interface and click Test.

This page describes the queries you can submit to the FuncNet web services and the results you should expect from them. More information on the message formats can be found in the documentation sections within the WSDLs themselves. If you use the Perl or Java clients we supply, you won’t have to mess around with SOAP messages yourself, as all the complexity is hidden from you. Indeed, most SOAP toolkits are designed to let you invoke web services without needing to build or parse any XML. However, if you’re writing your own client, or debugging a problem, it can be handy to know what’s going on under the covers.

For brevity, the descriptions below only show the SOAP payload of each message. Each request must be wrapped in a SOAP envelope as follows:

<soapenv:Envelope
  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
   <soapenv:Header/>
   <soapenv:Body>
   <!-- SOAP payload goes in here -->
   </soapenv:Body>
</soapenv:Envelope>
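
If you are assembling messages by hand, the wrapping step can be scripted. The following is a minimal Python sketch, not part of the supplied clients: the function name wrap_in_envelope is hypothetical, and the only things taken from this page are the two namespace URIs and the envelope layout shown above. It also parses the result once, so a malformed payload is caught before anything is sent.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the envelope shown above.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
FUNCNET_NS = "http://funcnet.eu/FuncNet_1_3/"

ENVELOPE_TEMPLATE = (
    '<soapenv:Envelope xmlns:soapenv="{soap}" xmlns:fun="{fun}">'
    "<soapenv:Header/>"
    "<soapenv:Body>{payload}</soapenv:Body>"
    "</soapenv:Envelope>"
)


def wrap_in_envelope(payload):
    """Wrap a SOAP payload string (hypothetical helper) in the envelope."""
    envelope = ENVELOPE_TEMPLATE.format(
        soap=SOAP_ENV_NS, fun=FUNCNET_NS, payload=payload
    )
    # Parsing raises xml.etree.ElementTree.ParseError if the payload
    # is not well-formed XML.
    ET.fromstring(envelope)
    return envelope


message = wrap_in_envelope("<fun:GetRegisteredPredictors/>")
```

The payload can use the fun: prefix freely, because the envelope declares it.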

Submitting a job to the pipeline

Use the front-end service’s SubmitTwoProteinSets operation to submit a job. You must supply two lists of human proteins, given as UniProt primary accessions: a query set (proteins of unknown or poorly understood function) and a reference set (well-characterized proteins):

<fun:SubmitTwoProteinSets>
   <queryProteins>
      <p>A3EXL0</p>
      <p>Q8NFN7</p>
      <p>O75865</p>
   </queryProteins>
   <refProteins>
      <p>Q5SR05</p>
      <p>Q9H8H3</p>
      <p>P22676</p>
   </refProteins>
   <emailAddress>nobody@example.org</emailAddress>
   <enableEmailNotify>false</enableEmailNotify>
</fun:SubmitTwoProteinSets>

You need to supply an email address, as this is used to identify you later when monitoring/retrieving a job, but you won’t receive any emails from the service unless you set enableEmailNotify to true.

Optionally, you can supply a list of predictors to use, if you don’t want to send the job through the whole pipeline. For example, this message selects the broad-coverage high-accuracy subset we recommend for general use:

<fun:SubmitTwoProteinSets>
   <predictors>
      <predictor>engineDBService</predictor>
      <predictor>iHOPFuncNetConnService</predictor>
      <predictor>iHOPFuncNetService</predictor>
      <predictor>JacopService</predictor>
   </predictors>
   <queryProteins>
      <p>A3EXL0</p>
      <p>Q8NFN7</p>
      <p>O75865</p>
   </queryProteins>
   <refProteins>
      <p>Q5SR05</p>
      <p>Q9H8H3</p>
      <p>P22676</p>
   </refProteins>
   <emailAddress>nobody@example.org</emailAddress>
   <enableEmailNotify>false</enableEmailNotify>
</fun:SubmitTwoProteinSets>
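
If your protein lists come from another program, you will probably want to generate this payload rather than type it. Here is a rough Python sketch using the standard library's ElementTree. The function name build_submit_payload and its parameters are made up for illustration; the element names and their order are simply copied from the two examples above.

```python
import xml.etree.ElementTree as ET

FUNCNET_NS = "http://funcnet.eu/FuncNet_1_3/"


def build_submit_payload(query, ref, email, predictors=None, notify=False):
    """Build a SubmitTwoProteinSets payload (hypothetical helper).

    query and ref are lists of UniProt primary accessions; predictors is
    an optional list of predictor names to restrict the pipeline to.
    """
    # Serialize the FuncNet namespace with the 'fun' prefix used on this page.
    ET.register_namespace("fun", FUNCNET_NS)
    root = ET.Element("{%s}SubmitTwoProteinSets" % FUNCNET_NS)
    if predictors:
        group = ET.SubElement(root, "predictors")
        for name in predictors:
            ET.SubElement(group, "predictor").text = name
    for tag, accessions in (("queryProteins", query), ("refProteins", ref)):
        group = ET.SubElement(root, tag)
        for accession in accessions:
            ET.SubElement(group, "p").text = accession
    ET.SubElement(root, "emailAddress").text = email
    ET.SubElement(root, "enableEmailNotify").text = "true" if notify else "false"
    return ET.tostring(root, encoding="unicode")


payload = build_submit_payload(
    ["A3EXL0", "Q8NFN7", "O75865"],
    ["Q5SR05", "Q9H8H3", "P22676"],
    "nobody@example.org",
    predictors=["JacopService"],
)
```

The serialized root element carries its own xmlns:fun declaration, which is harmless when the payload is placed inside the envelope.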

If you need a list of the available predictors, just send a simple message like this (no params required):

<fun:GetRegisteredPredictors/>

After submission, the service will respond with a unique job ID, something like this:

<fun:SubmitTwoProteinSetsResponse
   xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
   <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
</fun:SubmitTwoProteinSetsResponse>
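
You will need that job ID for every later call, so it is worth pulling it out of the response straight away. A small Python sketch, assuming the response looks like the one above (the helper name extract_job_id is hypothetical). It searches the whole tree, so it should work whether or not the SOAP envelope is still attached.

```python
import xml.etree.ElementTree as ET


def extract_job_id(response_xml):
    """Return the text of the first non-empty jobID element (hypothetical helper)."""
    root = ET.fromstring(response_xml)
    # iter() walks every descendant, so an enclosing envelope does not matter.
    for element in root.iter("jobID"):
        if element.text and element.text.strip():
            return element.text.strip()
    raise ValueError("no jobID element found in response")


response = (
    '<fun:SubmitTwoProteinSetsResponse xmlns:fun="http://funcnet.eu/FuncNet_1_3/">'
    "<jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>"
    "</fun:SubmitTwoProteinSetsResponse>"
)
job_id = extract_job_id(response)
```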

Checking a job’s status

To check the status of a job, invoke the MonitorJob operation, using your email address and job ID to identify your job:

<fun:MonitorJob>
   <jobLocator>
      <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
      <emailAddress>nobody@example.org</emailAddress>
   </jobLocator>
</fun:MonitorJob>

This will return a status message like this:

<fun:MonitorJobResponse
    xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
   <status>WORKING</status>
   <predictorStatus i="0">
      <predictor>CodaCathService</predictor>
      <status>WORKING</status>
   </predictorStatus>
   <predictorStatus i="1">
      <predictor>CodaPfamService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="2">
      <predictor>GecoService</predictor>
      <status>WORKING</status>
   </predictorStatus>
   <predictorStatus i="3">
      <predictor>HippiService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
</fun:MonitorJobResponse>

Predictors that are not currently online do not appear in this report.
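
When polling from a script, you usually only care about the overall status and which predictors are still running. The Python sketch below reads both from a MonitorJob response shaped like the example above. The function name parse_monitor_response is hypothetical, and the sketch only knows about the status values that appear on this page.

```python
import xml.etree.ElementTree as ET

FUNCNET_NS = "http://funcnet.eu/FuncNet_1_3/"


def parse_monitor_response(response_xml):
    """Return (overall_status, {predictor_name: status}) (hypothetical helper)."""
    root = ET.fromstring(response_xml)
    tag = "{%s}MonitorJobResponse" % FUNCNET_NS
    # Accept either the bare payload or a full SOAP envelope.
    response = root if root.tag == tag else root.find(".//" + tag)
    if response is None:
        raise ValueError("no MonitorJobResponse element found")
    overall = response.findtext("status")
    per_predictor = {
        entry.findtext("predictor"): entry.findtext("status")
        for entry in response.findall("predictorStatus")
    }
    return overall, per_predictor


sample = """<fun:MonitorJobResponse xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
  <status>WORKING</status>
  <predictorStatus i="0"><predictor>CodaCathService</predictor><status>WORKING</status></predictorStatus>
  <predictorStatus i="1"><predictor>CodaPfamService</predictor><status>COMPLETE</status></predictorStatus>
</fun:MonitorJobResponse>"""

overall, per_predictor = parse_monitor_response(sample)
still_running = sorted(name for name, status in per_predictor.items() if status == "WORKING")
```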

Cancelling a job

The message format for cancelling a job is almost identical to the previous example:

<fun:CancelJob>
   <jobLocator>
      <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
      <emailAddress>nobody@example.org</emailAddress>
   </jobLocator>
</fun:CancelJob>

If you try to cancel a job that’s already finished, you’ll get an error; otherwise you’ll get confirmation:

<fun:CancelJobResponse
    xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
  <status>CANCELLED</status>
</fun:CancelJobResponse>

Retrieving relevance scores for your proteins

Once your job is COMPLETE (i.e. all the predictors have either completed successfully, or failed for some reason) you can retrieve the results in various formats. For normal usage scenarios, e.g. ranking query proteins according to their relevance to the whole reference set (like in the web interface), use RetrieveSetwiseScores:

<fun:RetrieveSetwiseScores>
   <jobLocator>
      <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
      <emailAddress>nobody@example.org</emailAddress>
   </jobLocator>
</fun:RetrieveSetwiseScores>

This returns a score element for each protein in the query set that at least one of the predictors made a prediction about (however weak):

<fun:RetrieveSetwiseScoresResponse
      xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
   <predictorStatus i="0">
      <predictor>CodaCathService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="1">
      <predictor>CodaPfamService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="2">
      <predictor>engineDBService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="3">
      <predictor>GecoService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="4">
      <predictor>HippiService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="5">
      <predictor>iHOPFuncNetConnService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="6">
      <predictor>iHOPFuncNetService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="7">
      <predictor>JacopService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <predictorStatus i="8">
      <predictor>PIPsPredictorService</predictor>
      <status>COMPLETE</status>
   </predictorStatus>
   <s>
      <p1>A3EXL0</p1>
      <rs>1.593198817913301</rs>
      <pv>0.8100139995728369</pv>
      <pr i="0">
         <rs>0.0</rs>
         <pv>0.45086</pv>
      </pr>
      <pr i="7">
         <rs>0.0</rs>
         <pv>0.999999</pv>
      </pr>
   </s>
   <s>
      <p1>O75865</p1>
      <rs>1.615708908613666</rs>
      <pv>0.8059660198021762</pv>
      <pr i="3">
         <rs>0.0</rs>
         <pv>0.445814</pv>
      </pr>
      <pr i="7">
         <rs>0.0</rs>
         <pv>0.999999</pv>
      </pr>
   </s>
   <s>
      <p1>Q8NFN7</p1>
      <rs>2.000001000058178E-6</rs>
      <pv>0.999999</pv>
      <pr i="7">
         <rs>0.0</rs>
         <pv>0.999999</pv>
      </pr>
   </s>
</fun:RetrieveSetwiseScoresResponse>

Each score profile element (s) contains a p1 element, which is the ID of a protein from the query set. It also contains an rs (raw score) element, which is the FuncNet score for the query protein. This is a comparative measure of relevance to the reference set, by which the query proteins can be ranked, as in the web interface. The higher this score is, the more likely the protein is to share the function embodied by the reference set. It is calculated by taking the p-value of the best prediction from each predictor between the query protein and any reference protein (see below), and combining them using Fisher’s method to provide a single measure.

pv is an estimated p-value for this prediction, i.e., the chance that a randomly-chosen query protein would have scored at least as high as this one. This is only an estimate since Fisher’s method assumes independence between the predictors, which isn’t necessarily true. Therefore we recommend you use this as a rough guide only, and don’t immediately discount hits that are slightly beyond what would be considered a normal threshold for significance.
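
To make the relationship between rs and pv concrete, here is a short Python sketch of the textbook form of Fisher’s method: the statistic is minus twice the sum of the log p-values, compared against a chi-square distribution with two degrees of freedom per predictor. This page does not spell out the service's exact formula, so treat the sketch as an assumption; it does, however, reproduce the rs and pv values in the sample response above. The function name fisher_combine is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (textbook form).

    Returns (raw_score, combined_p_value).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    raw_score = -2.0 * sum(math.log(p) for p in p_values)
    half = raw_score / 2.0
    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has a closed form:
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return raw_score, math.exp(-half) * total


# Predictor p-values for A3EXL0 from the sample response (pr i="0" and i="7").
rs, pv = fisher_combine([0.45086, 0.999999])
```

With these inputs rs comes out at about 1.5932 and pv at about 0.8100, matching the A3EXL0 profile.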

As well as the overall rs and pv, we also return the individual rs and pv from each predictor for each query protein. More specifically: for each predictor, this is the best pairwise prediction between the query protein and any reference protein. These are the scores which are used as the input to Fisher’s method to calculate the overall score, and are shown in the pr elements. The i attribute for each predictor in the status section at the top corresponds to the i of each predictor in each score profile, so you can tell which predictor each score came from. As shown above, many predictions will result from the input of only a few predictors, or even just one. See below for notes about interpreting the raw scores and p-values from the predictors.
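
Matching up the i attributes by eye gets tedious, so here is a Python sketch that does the lookup and ranks the query proteins by raw score. It assumes a response shaped like the example above; the function name parse_score_profiles and the dictionary keys it returns are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_score_profiles(response_xml):
    """Parse score profiles and label each pr entry with its predictor name.

    Returns a list of dicts sorted by overall raw score, highest first.
    """
    root = ET.fromstring(response_xml)
    # Map the i attribute of each predictorStatus to the predictor's name.
    names = {
        status.get("i"): status.findtext("predictor")
        for status in root.iter("predictorStatus")
    }
    profiles = []
    for s in root.iter("s"):
        profiles.append({
            "p1": s.findtext("p1"),
            "p2": s.findtext("p2"),  # None in setwise responses
            "rs": float(s.findtext("rs")),
            "pv": float(s.findtext("pv")),
            # Only pv is kept per predictor; fall back to the raw index
            # if a predictor is missing from the status section.
            "by_predictor": {
                names.get(pr.get("i"), pr.get("i")): float(pr.findtext("pv"))
                for pr in s.findall("pr")
            },
        })
    return sorted(profiles, key=lambda profile: profile["rs"], reverse=True)


sample = """<fun:RetrieveSetwiseScoresResponse xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
  <predictorStatus i="0"><predictor>CodaCathService</predictor><status>COMPLETE</status></predictorStatus>
  <predictorStatus i="7"><predictor>JacopService</predictor><status>COMPLETE</status></predictorStatus>
  <s><p1>Q8NFN7</p1><rs>2.000001000058178E-6</rs><pv>0.999999</pv>
    <pr i="7"><rs>0.0</rs><pv>0.999999</pv></pr></s>
  <s><p1>A3EXL0</p1><rs>1.593198817913301</rs><pv>0.8100139995728369</pv>
    <pr i="0"><rs>0.0</rs><pv>0.45086</pv></pr>
    <pr i="7"><rs>0.0</rs><pv>0.999999</pv></pr></s>
</fun:RetrieveSetwiseScoresResponse>"""

ranked = parse_score_profiles(sample)
```

The same function should also cope with the pairwise formats, where each profile has a p2 element as well.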

Total query time depends heavily on the size of the query, ranging from a minute or two for small queries like this one to several hours for larger ones. This is because some of the predictors have to perform complex classification tasks on the fly.

Retrieving protein-protein functional associations

You can also request the pairwise scores (query protein–reference protein) on which the setwise scores are based. These are available in two formats. The simplest is RetrievePairwiseScores:

<fun:RetrievePairwiseScores>
   <jobLocator>
      <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
      <emailAddress>nobody@example.org</emailAddress>
   </jobLocator>
</fun:RetrievePairwiseScores>

This query returns the scores in the form of overall pairwise functional associations between the query and reference proteins:

<fun:RetrievePairwiseScoresResponse
   xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
  <predictorStatus i="0">
    <predictor>CodaCathService</predictor>
    <status>COMPLETE</status>
  </predictorStatus>
  <predictorStatus i="1">
    <predictor>CodaPfamService</predictor>
    <status>COMPLETE</status>
  </predictorStatus>
  <predictorStatus i="2">
    <predictor>GecoService</predictor>
    <status>COMPLETE</status>
  </predictorStatus>
  <predictorStatus i="3">
    <predictor>HippiService</predictor>
    <status>COMPLETE</status>
  </predictorStatus>
  <s>
    <p1>A3EXL0</p1>
    <p2>Q9H8H3</p2>
    <rs>1.5931968179123008</rs>
    <pv>0.45086000006545024</pv>
  </s>
  <s>
    <p1>A3EXL0</p1>
    <p2>P22676</p2>
    <rs>0.8992697754253187</rs>
    <pv>0.6378610000618903</pv>
  </s>
  <s>
    <p1>A3EXL0</p1>
    <p2>Q5SR05</p2>
    <rs>0.4949339641211705</rs>
    <pv>0.7807760000076786</pv>
  </s>
  <s>
    <p1>O75865</p1>
    <p2>Q9H8H3</p2>
    <rs>1.6157069086126659</rs>
    <pv>0.4458140000766545</pv>
  </s>
  <s>
    <p1>O75865</p1>
    <p2>P22676</p2>
    <rs>0.22563651119682823</rs>
    <pv>0.8933130000005889</pv>
  </s>
</fun:RetrievePairwiseScoresResponse>

Each p1 is a protein from the query set, and each p2 is a protein from the reference set. The rs element contains a raw score indicating the strength of association for each pair, aggregated over all the predictions from all the predictors for that pair, using Fisher’s method. pv is an estimated p-value for each prediction, i.e., the chance that a randomly-chosen pair of proteins would have scored at least as high as this pair. Once again, this is only an estimate, since Fisher’s method assumes independence between the predictors.

There will not always be a result for every protein pair, since the association matrices for the predictors can be sparse. If none of the predictors predicts even a weak association between two proteins, that pair will not appear in the output at all. For example, the protein Q8NFN7 from the original query is missing from the results above, which means that none of the predictors believes it to be related in any way to any of the reference proteins.
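
Because absent pairs are silently omitted, a client that expects a full matrix has to work out for itself which pairs are missing. A minimal Python sketch, assuming you still have the query and reference lists you submitted (the function name missing_pairs is hypothetical):

```python
import xml.etree.ElementTree as ET
from itertools import product


def missing_pairs(response_xml, query, ref):
    """List (query, reference) pairs that have no score profile in the response."""
    root = ET.fromstring(response_xml)
    seen = {(s.findtext("p1"), s.findtext("p2")) for s in root.iter("s")}
    return [pair for pair in product(query, ref) if pair not in seen]


sample = """<fun:RetrievePairwiseScoresResponse xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
  <s><p1>A3EXL0</p1><p2>Q9H8H3</p2><rs>1.5931968179123008</rs><pv>0.45086000006545024</pv></s>
  <s><p1>A3EXL0</p1><p2>P22676</p2><rs>0.8992697754253187</rs><pv>0.6378610000618903</pv></s>
  <s><p1>A3EXL0</p1><p2>Q5SR05</p2><rs>0.4949339641211705</rs><pv>0.7807760000076786</pv></s>
  <s><p1>O75865</p1><p2>Q9H8H3</p2><rs>1.6157069086126659</rs><pv>0.4458140000766545</pv></s>
  <s><p1>O75865</p1><p2>P22676</p2><rs>0.22563651119682823</rs><pv>0.8933130000005889</pv></s>
</fun:RetrievePairwiseScoresResponse>"""

absent = missing_pairs(
    sample,
    ["A3EXL0", "Q8NFN7", "O75865"],
    ["Q5SR05", "Q9H8H3", "P22676"],
)
```

For the sample response, this reports all three Q8NFN7 pairs plus the pair O75865 and Q5SR05.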

If you want to see the actual results from the individual predictors which contributed to the overall predictions, you can use RetrieveCompleteScores:

<fun:RetrieveCompleteScores>
 <jobLocator>
    <jobID>caffdubya/-3b3ce265:11fb7bc2e61:-7fff</jobID>
    <emailAddress>nobody@example.org</emailAddress>
 </jobLocator>
</fun:RetrieveCompleteScores>

This returns a score profile for each protein pair, showing the raw score and p-value from each of the predictors, along with the overall raw score and p-value from Fisher’s method:

<fun:RetrieveCompleteScoresResponse xmlns:fun="http://funcnet.eu/FuncNet_1_3/">
 <predictorStatus i="0">
    <predictor>CodaCathService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="1">
    <predictor>CodaPfamService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="2">
    <predictor>GecoService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="3">
    <predictor>HippiService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="4">
    <predictor>JacopService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="5">
    <predictor>engineDBService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <predictorStatus i="6">
    <predictor>iHOPFuncNetService</predictor>
    <status>COMPLETE</status>
 </predictorStatus>
 <s>
    <p1>A3EXL0</p1>
    <p2>P22676</p2>
    <rs>0.8992717754263188</rs>
    <pv>0.9246652723089276</pv>
    <pr i="0">
       <rs>0.0</rs>
       <pv>0.637861</pv>
    </pr>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>A3EXL0</p1>
    <p2>Q5SR05</p2>
    <rs>0.49493596412217056</rs>
    <pv>0.9739920871688543</pv>
    <pr i="0">
       <rs>0.0</rs>
       <pv>0.780776</pv>
    </pr>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>A3EXL0</p1>
    <p2>Q9H8H3</p2>
    <rs>1.593198817913301</rs>
    <pv>0.8100139995728369</pv>
    <pr i="0">
       <rs>0.0</rs>
       <pv>0.45086</pv>
    </pr>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>O75865</p1>
    <p2>P22676</p2>
    <rs>0.2256385111978283</rs>
    <pv>0.994094913581514</pv>
    <pr i="3">
       <rs>0.0</rs>
       <pv>0.893313</pv>
    </pr>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>O75865</p1>
    <p2>Q5SR05</p2>
    <rs>2.000001000058178E-6</rs>
    <pv>0.999999</pv>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>O75865</p1>
    <p2>Q9H8H3</p2>
    <rs>1.615708908613666</rs>
    <pv>0.8059660198021762</pv>
    <pr i="3">
       <rs>0.0</rs>
       <pv>0.445814</pv>
    </pr>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>Q8NFN7</p1>
    <p2>P22676</p2>
    <rs>2.000001000058178E-6</rs>
    <pv>0.999999</pv>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>Q8NFN7</p1>
    <p2>Q5SR05</p2>
    <rs>2.000001000058178E-6</rs>
    <pv>0.999999</pv>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
 <s>
    <p1>Q8NFN7</p1>
    <p2>Q9H8H3</p2>
    <rs>2.000001000058178E-6</rs>
    <pv>0.999999</pv>
    <pr i="6">
       <rs>0.0</rs>
       <pv>0.999999</pv>
    </pr>
 </s>
</fun:RetrieveCompleteScoresResponse>

The p-values for each predictor are the basic building blocks from which all the higher-level pairwise and setwise scores are calculated.
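
As a worked illustration, the Python sketch below rebuilds the setwise raw scores from the per-predictor p-values in the complete response above. It assumes that the "best" prediction means the lowest p-value and that the raw score is minus twice the sum of the log p-values (the textbook Fisher statistic). Neither detail is stated explicitly on this page, but together they reproduce the setwise rs values shown earlier. The function name setwise_raw_scores and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def setwise_raw_scores(pair_pvalues):
    """Compute a setwise raw score per query protein.

    pair_pvalues maps (query, reference) to {predictor_index: p_value}.
    """
    best = defaultdict(dict)
    for (query, _reference), by_predictor in pair_pvalues.items():
        for predictor, pv in by_predictor.items():
            # Keep the lowest p-value each predictor gave for this query
            # protein against any reference protein.
            current = best[query].get(predictor, 1.0)
            best[query][predictor] = min(current, pv)
    return {
        query: -2.0 * sum(math.log(pv) for pv in per_predictor.values())
        for query, per_predictor in best.items()
    }


# Per-predictor p-values copied from the pr elements in the sample response.
complete = {
    ("A3EXL0", "P22676"): {0: 0.637861, 6: 0.999999},
    ("A3EXL0", "Q5SR05"): {0: 0.780776, 6: 0.999999},
    ("A3EXL0", "Q9H8H3"): {0: 0.45086, 6: 0.999999},
    ("O75865", "P22676"): {3: 0.893313, 6: 0.999999},
    ("O75865", "Q5SR05"): {6: 0.999999},
    ("O75865", "Q9H8H3"): {3: 0.445814, 6: 0.999999},
    ("Q8NFN7", "P22676"): {6: 0.999999},
    ("Q8NFN7", "Q5SR05"): {6: 0.999999},
    ("Q8NFN7", "Q9H8H3"): {6: 0.999999},
}
scores = setwise_raw_scores(complete)
```

The results come out at roughly 1.5932 for A3EXL0, 1.6157 for O75865 and 0.000002 for Q8NFN7, in line with the setwise response.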

For a more visual explanation of how the setwise scores are calculated from the predictor p-values, see this post.

Querying a predictor directly

As well as sending a request to the entire pipeline, you can query individual predictors directly. They each use the same message format:

<fun:ScorePairwiseRelations>
  <proteins1>
    <p>A3EXL0</p>
    <p>Q8NFN7</p>
    <p>O75865</p>
  </proteins1>
  <proteins2>
    <p>Q5SR05</p>
    <p>Q9H8H3</p>
    <p>P22676</p>
  </proteins2>
</fun:ScorePairwiseRelations>

The response message containing the predictions will be returned as soon as the predictor has finished processing:

<fun:ScorePairwiseRelationsResponse
   xmlns:fun="http://funcnet.eu/FuncNet_1_0/">
  <s>
    <p1>O75865</p1>
    <p2>Q9H8H3</p2>
    <rs>0</rs>
    <pv>.445814</pv>
  </s>
  <s>
    <p1>O75865</p1>
    <p2>P22676</p2>
    <rs>0</rs>
    <pv>.893313</pv>
  </s>
</fun:ScorePairwiseRelationsResponse>

Note that the rs (raw score) field for individual predictors is for informational purposes only, as the final score calculations are based only on the pv (p-value) of each prediction. Some predictors do not provide it, as in the example above, in which case it will be set to 0.
