Modules

Main Classes

class eutils.Client(cache=False, api_key=None)[source]

Bases: object

class-based access to NCBI E-Utilities, returning Python classes with rich data accessors

Parameters:
  • cache (str) – passed to QueryService; see that class for an explanation
  • api_key (str) – API key from NCBI
Raises:

EutilsError – if cache file couldn’t be created

databases

list of databases available from eutils (per einfo query)

efetch(db, id)[source]

query the efetch endpoint

einfo(db=None)[source]

query the einfo endpoint

Parameters:db – string (optional)
Return type:EInfo or EInfoDB object

If db is None, the reply is a list of databases, which is returned in an EInfo object (which has a databases() method).

If db is not None, the reply is information about the specified database, which is returned in an EInfoDB object. (Version 2.0 data is automatically requested.)

esearch(db, term)[source]

query the esearch endpoint
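The following is a minimal sketch of how the documented Client methods might be chained, using only the signatures listed above. The helper name search_then_fetch and its pick_id callback are hypothetical. Extracting an identifier from the esearch reply is left to the caller because the reply's accessors are not described in this section.

```python
def search_then_fetch(client, db, term, pick_id):
    """Run esearch, choose one id from the reply, then efetch that id.

    client  -- an eutils.Client-like object exposing esearch(db, term)
               and efetch(db, id), as documented above
    pick_id -- hypothetical callback that pulls a single id out of the
               esearch reply (the reply's accessors are not covered here)
    """
    search_reply = client.esearch(db=db, term=term)
    chosen_id = pick_id(search_reply)
    return client.efetch(db=db, id=chosen_id)
```

With a real client this could be called as search_then_fetch(eutils.Client(), "gene", "BRCA1", pick_id), where pick_id is whatever accessor logic suits the reply object.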

class eutils.QueryService(email='biocommons-dev@googlegroups.com', cache=False, default_args={'retmax': 250, 'retmode': 'xml', 'usehistory': 'y'}, request_interval=None, tool=None, api_key=None)[source]

Bases: object

provides throttled and cached querying of NCBI E-utilities services

QueryService has three functions:

  • construct URLs appropriate for eutils endpoints
  • throttle queries per NCBI guidelines
  • cache results in persistent cache (sqlite)

QueryService works with any valid query arguments, passed as dictionaries.
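As a rough illustration of the URL-construction step, the sketch below merges a dictionary of query arguments with the documented default_args and encodes the result. The function build_url is hypothetical and is not the library's implementation. The assumption that per-call arguments override the defaults is also mine. The base URL is NCBI's public E-utilities address.

```python
from urllib.parse import urlencode

BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
# Defaults copied from the QueryService signature above.
DEFAULT_ARGS = {"retmax": 250, "retmode": "xml", "usehistory": "y"}


def build_url(endpoint, args, default_args=DEFAULT_ARGS):
    """Return a full E-utilities URL for `endpoint` (hypothetical helper).

    Assumption: per-call args take precedence over default_args.
    """
    merged = dict(default_args)
    merged.update(args)
    # Sort the items so the same arguments always yield the same URL.
    return BASE_URL + endpoint + ".fcgi?" + urlencode(sorted(merged.items()))
```

Sorting the arguments is a design choice for this sketch: a stable URL makes a convenient cache key.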

Implemented interfaces:

  • esearch
  • efetch
  • elink
  • einfo
  • esummary

Implementing other query modes should be straightforward.

See also NCBI’s Entrez Programming Utilities Help:
http://www.ncbi.nlm.nih.gov/books/NBK25500/
Parameters:
  • email (str) – email of user (for abuse reports)
  • cache (str or bool) – if True, cache at ~/.cache/eutils-db.sqlite; if a string, cache at that path; if False, don’t cache
  • default_args (dict) – dictionary of query args that should accompany all requests
  • request_interval (int or a callable returning an int) – seconds between requests; default: auto-select based on API key
  • api_key (str) – api key assigned by NCBI
  • tool (str) – name of client
Return type:

None

Raises:

OSError – if sqlite file can’t be opened
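To make the request_interval parameter more concrete, here is a minimal throttling sketch. The names default_interval and Throttle are hypothetical, and this is not the library's own code. The default intervals assume NCBI's published limits of 3 requests per second without an API key and 10 per second with one, so the values are fractional seconds even though the parameter is documented as an int.

```python
import time


def default_interval(api_key=None):
    """Seconds between requests (assumption: 3/s without a key, 10/s with)."""
    return 0.1 if api_key else 1.0 / 3.0


class Throttle:
    """Block until at least `interval` seconds have passed since the last call."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self._interval = interval  # a number, or a callable returning one
        self._clock = clock        # injectable so the sketch can be tested
        self._sleep = sleep
        self._last = None          # time of the previous request, if any

    def wait(self):
        interval = self._interval() if callable(self._interval) else self._interval
        now = self._clock()
        if self._last is not None:
            remaining = self._last + interval - now
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() before each request keeps consecutive requests at least one interval apart. The first call never sleeps.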

efetch(args)[source]

execute a cached, throttled efetch query

Parameters:args (dict) – dict of query items
Returns:content of reply
Return type:str
Raises:EutilsRequestError – when NCBI replies, but the request failed (e.g., bogus database name)
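The "cached" part of a query can be pictured as a lookup in a small SQLite table before any request is sent. The sketch below is illustrative only: the class QueryCache, its table layout, and the fetch callback are hypothetical, and the real cache format is not described in this section.

```python
import json
import sqlite3


class QueryCache:
    """Tiny persistent cache keyed on endpoint plus query arguments."""

    def __init__(self, path=":memory:"):
        # Pass a file path for a persistent cache; ":memory:" is for demos.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    @staticmethod
    def _key(endpoint, args):
        # Sorting makes the key independent of dict insertion order.
        return json.dumps([endpoint, sorted(args.items())])

    def get_or_fetch(self, endpoint, args, fetch):
        """Return the cached reply, calling fetch(endpoint, args) on a miss."""
        key = self._key(endpoint, args)
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]
        value = fetch(endpoint, args)
        self._conn.execute(
            "INSERT INTO cache (key, value) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()
        return value
```

In this sketch, fetch stands in for whatever performs the throttled HTTP request. It runs only when the key is absent from the table.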
einfo(args=None)[source]

execute a NON-cached, throttled einfo query

einfo.fcgi?db=<database>

Input: Entrez database (&db) or None (returns info on all Entrez databases)

Output: XML containing database statistics

Example: Find database statistics for Entrez Protein.

QueryService.einfo({"db": "protein"})

Equivalent HTTP request:

Parameters:args (dict) – dict of query items (optional)
Returns:content of reply
Return type:str
Raises:EutilsRequestError – when NCBI replies, but the request failed (e.g., bogus database name)

elink(args)[source]

execute a cached, throttled elink query

Input: List of UIDs (&id); Source Entrez database (&dbfrom); Destination Entrez database (&db)

Output: XML containing linked UIDs from source and destination databases

Example: Find one set of Gene IDs linked to nuccore GIs 34577062 and 24475906

QueryService.elink({"dbfrom": "nuccore", "db": "gene", "id": "34577062,24475906"})

Equivalent HTTP request:

Parameters:args (dict) – dict of query items containing at least the “db”, “dbfrom”, and “id” keys.
Returns:content of reply
Return type:str
Raises:EutilsRequestError – when NCBI replies, but the request failed (e.g., bogus database name)
esearch(args)[source]

execute a cached, throttled esearch query

Parameters:args (dict) – dict of query items, containing at least “db” and “term” keys
Returns:content of reply
Return type:str
Raises:EutilsRequestError – when NCBI replies, but the request failed (e.g., bogus database name)
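Because QueryService.esearch returns the reply content as a string, callers that bypass Client need to parse the XML themselves. The sketch below shows one way to pull UIDs out of an esearch reply with the standard library. The helper extract_ids is hypothetical, and SAMPLE_REPLY is a hand-written, heavily abbreviated stand-in for a real reply, with placeholder ids.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated illustration of an esearch reply (not real data).
SAMPLE_REPLY = """<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>101</Id>
    <Id>202</Id>
  </IdList>
</eSearchResult>"""


def extract_ids(reply_xml):
    """Return the UIDs listed under IdList, as strings, in reply order."""
    root = ET.fromstring(reply_xml)
    return [elem.text for elem in root.findall("./IdList/Id")]
```

The returned ids could then be joined with commas and passed as the "id" argument of a later efetch or esummary call.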
esummary(args)[source]

execute a cached, throttled esummary query

Input: List of UIDs (&id); Entrez database (&db)

Output: XML document summary for requested ID(s) [comma-separated]

Example:

QueryService.esummary({"db": "medgen", "id": 134})

Equivalent HTTP request:

Parameters:args (dict) – dict of query items containing at least “db” and “id” keys.
Returns:content of reply
Return type:str
Raises:EutilsRequestError – when NCBI replies, but the request failed (e.g., bogus database name)

Exceptions

class eutils.EutilsError[source]

Bases: Exception

Base class for all Eutils exceptions; also raised directly for general errors.

class eutils.EutilsNCBIError[source]

Bases: eutils._internal.exceptions.EutilsError

Raised when NCBI returns data that appears to be incorrect or invalid.

class eutils.EutilsNotFoundError[source]

Bases: eutils._internal.exceptions.EutilsError

Raised when the requested data is not available. (Used only by the eutils.sketchy.clientx interface currently.)

class eutils.EutilsRequestError[source]

Bases: eutils._internal.exceptions.EutilsError

Raised when NCBI responds with an error, such as when a non-existent database is specified.
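Since every exception above derives from EutilsError, handlers can be ordered from most to least specific. The sketch below defines stand-in classes that mirror the documented hierarchy so that it runs without the package installed. In real code the classes would be imported from eutils instead. The helper classify_failure is hypothetical.

```python
# Stand-ins mirroring the documented hierarchy; in real code use
# `from eutils import EutilsError, EutilsRequestError, ...` instead.
class EutilsError(Exception):
    pass


class EutilsNCBIError(EutilsError):
    pass


class EutilsNotFoundError(EutilsError):
    pass


class EutilsRequestError(EutilsError):
    pass


def classify_failure(action):
    """Run action() and describe how it failed, if it did (hypothetical helper)."""
    try:
        action()
    except EutilsRequestError:
        return "request rejected by NCBI"
    except EutilsNotFoundError:
        return "requested data not available"
    except EutilsError:
        # Catches EutilsNCBIError and any other library error.
        return "other eutils error"
    return "ok"
```

The catch-all EutilsError clause has to come last. Placed first, it would shadow the more specific handlers.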

Experimental

class eutils.sketchy.clientx.ClientX(cache=False, api_key=None)[source]

Bases: eutils._internal.client.Client

Warning: This class is subject to rapid development and API changes.

A subclass of eutils.client.Client that provides specific lookup functions.

This functionality is in a separate class because the API is experimental.

Parameters:
  • cache (str) – passed to QueryService; see that class for an explanation
  • api_key (str) – API key from NCBI
Raises:

EutilsError – if cache file couldn’t be created

fetch_gbseq_by_ac(acv)
fetch_gene_by_hgnc(hgnc)[source]
fetch_nuccore_by_ac(acv)[source]
fetch_snps_for_gene(hgnc, organism='human')[source]