Seeq Knowledge Graph

The real power of the Seeq API lies in its knowledge graph, built from thousands of publications and public datasets. The knowledge graph consists of multiple distinct graphs, each representing a specific semantic relationship between two or more core entities.

The most important feature of the knowledge graph is its ability to slice and minify itself for a particular analysis. This allows us to ship a small version of the knowledge graph to the machine where the analysis happens, so you can use the Seeq API to analyze your data where it resides, without sending sensitive data to our servers.

Semantic atoms are n-ary relationships

In the Seeq API, n-ary relationships are first-class citizens. They are the foundation of how the data is modeled and how the search API works. Unlike a binary relationship, which links exactly two entities, an n-ary relationship can involve several entities of different types.

n-ary relationships

For example, one n-ary relationship could be concerned with evidence of pathogenicity of a certain variant in a certain disease. An instance of this relationship would correspond to a statement like:

Publication P supports pathogenicity of variant V of gene G in disease D.
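One way to picture such a statement is as a single record that holds all four entities at once, rather than as a chain of pairwise links. The sketch below is only an illustration of that idea: the class name, field names, and the `index_by` helper are hypothetical and are not part of the Seeq API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PathogenicityAssertion:
    """One instance of a 4-ary relationship (hypothetical field names)."""
    publication: str
    variant: str
    gene: str
    disease: str


def index_by(assertions, field):
    """Group assertions by one entity type, so any entity can start a search."""
    index = defaultdict(list)
    for assertion in assertions:
        index[getattr(assertion, field)].append(assertion)
    return dict(index)


# Placeholder identifiers, mirroring the P/V/G/D letters used in the text.
assertions = [
    PathogenicityAssertion("P1", "V1", "G1", "D1"),
    PathogenicityAssertion("P2", "V1", "G1", "D1"),
    PathogenicityAssertion("P3", "V2", "G1", "D2"),
]

by_variant = index_by(assertions, "variant")
by_disease = index_by(assertions, "disease")

print(len(by_variant["V1"]))                                # 2
print(sorted(a.publication for a in by_disease["D2"]))      # ['P3']
```

Because every record carries the full context, a search that starts from a variant, a gene, a disease, or a publication returns the same kind of object.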

Pathogenicity Graph

The pathogenicity graph is concerned with evidence of pathogenicity connecting variants and diseases backed by a number of publications.

The diagram below uses the pathogenicity relationship as an example of Seeq’s n-ary data model: it depicts the data model for the relationship and how its instances are involved in various search scenarios.

Seeq Pathogenicity Graph

Treatability Graph

The treatability graph is composed of multiple distinct graphs itself:

  1. Targeted treatments for a variant in a certain disease: the entities involved in this graph are genes, variants, diseases, drugs, and publications.

  2. Gene-drug interactions: the entities involved in this graph are genes, drugs, and publications. Assertions in this graph do not have a specific disease context nor do they involve a specific variant.

  3. Drug-disease indications: the entities involved in this graph are drugs, diseases, and publications. Assertions in this graph do not have a specific gene or variant context.
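The three sub-graphs listed above differ only in which entity types take part in an assertion. As a rough sketch (the class and field names are hypothetical, chosen to match the entity lists above, and do not come from the Seeq API), each can be modeled as its own record type:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TargetedTreatment:
    """Sub-graph 1: full variant and disease context."""
    gene: str
    variant: str
    disease: str
    drug: str
    publication: str


@dataclass(frozen=True)
class GeneDrugInteraction:
    """Sub-graph 2: no specific disease or variant."""
    gene: str
    drug: str
    publication: str


@dataclass(frozen=True)
class DrugDiseaseIndication:
    """Sub-graph 3: no specific gene or variant."""
    drug: str
    disease: str
    publication: str


def entity_types(record_type):
    """Return the set of entity types that take part in a relationship."""
    return {f.name for f in fields(record_type)}


# Entity types common to all three treatability sub-graphs.
shared = (
    entity_types(TargetedTreatment)
    & entity_types(GeneDrugInteraction)
    & entity_types(DrugDiseaseIndication)
)
print(sorted(shared))  # ['drug', 'publication']
```

Every treatability assertion therefore names at least a drug and a supporting publication; the gene, variant, and disease are added only where the sub-graph calls for them.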

MicroSeeq

Seeq’s knowledge graph is designed so that it can be sliced and minified, on demand, for each particular analysis context. This minified knowledge graph is called MicroSeeq.

MicroSeeq enables the Seeq client-side SDK to build complex variant interpretation pipelines like Seeq VCF that run entirely on the client side (e.g. your browser, or your virtual machine) without your raw variant PHI leaving your device.

To obtain a MicroSeeq, all you need to do is specify your genes of interest:

$ curl 'https://api.seeq.bio/micro-seeq?ids=3417' # genes of interest: IDH1
{
  "skeletons": [
    {
      "gene": {
        "entrez_id": "3417",
        "canonical_symbol": "IDH1",
        "..."
      },
      "exons": [
        {
          "ense": "ENSE00001801507",
          "enst": "ENST00000415913",
          "rank": 1,
          "chrom_start": 208254031,
          "chrom_end": 208254322,
          "fn_cdna_pos": 1,
          "ln_cdna_pos": 292,
          "..."
        },
        "..."
      ],
      "transcript": {
        "ensg": "ENSG00000138413",
        "enst": "ENST00000415913",
        "ensp": "ENSP00000390265",
        "nm": null,
        "chrom": "2",
        "n_exons": 10,
        "fcn_cdna_pos": 383,
        "lcn_cdna_pos": 1627,
        "cds_length": 1245,
        "...",
        "cdna": "AGGGGAG...",
        "peptide": "MSKKISA..."
      },
      "domains": [
        {
          "pfam_id": "PF00180",
          "pfam_name": "Iso_dh",
          "name": "Isocitrate/isopropylmalate dehydrogenase",
          "summary": null,
          "aa_start": 11,
          "aa_end": 399
        }
      ]
    }
  ],
  "evidence": [
    {
      "gene": {
        "entrez_id": "3417",
        "canonical_symbol": "IDH1",
        "..."
      },
      "pathogenicity": [
        {
          "eal_cds_pos": 394,
          "max_clinsig_rank": 10
        },
        {
          "eal_cds_pos": 395,
          "max_clinsig_rank": 10
        }
      ],
      "treatability": [
        {
          "eal_cds_pos": 394,
          "max_clinsig_rank": 80
        },
        {
          "eal_cds_pos": 395,
          "max_clinsig_rank": 50
        }
      ],
      "gene_treatability": 80
    }
  ],
  "..."
}
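Once a MicroSeeq has been downloaded, lookups can run locally. The sketch below is a minimal illustration, not the Seeq SDK: it builds the request URL from the curl example and searches a trimmed copy of the "evidence" section shown above. The helper names are hypothetical. It also assumes that `eal_cds_pos` is a position in the coding sequence and treats `max_clinsig_rank` as an opaque number, since neither field is defined here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.seeq.bio/micro-seeq"  # endpoint from the curl example


def micro_seeq_url(gene_id):
    """Build the request URL for a single Entrez gene ID."""
    return f"{BASE_URL}?{urlencode({'ids': gene_id})}"


# Trimmed copy of the "evidence" section of the response above.
SAMPLE = {
    "evidence": [
        {
            "gene": {"entrez_id": "3417", "canonical_symbol": "IDH1"},
            "pathogenicity": [
                {"eal_cds_pos": 394, "max_clinsig_rank": 10},
                {"eal_cds_pos": 395, "max_clinsig_rank": 10},
            ],
            "treatability": [
                {"eal_cds_pos": 394, "max_clinsig_rank": 80},
                {"eal_cds_pos": 395, "max_clinsig_rank": 50},
            ],
            "gene_treatability": 80,
        }
    ]
}


def evidence_at(micro_seeq, entrez_id, cds_pos):
    """Return {graph_name: max_clinsig_rank} for one gene and position.

    Matching on position alone is an assumption made for this sketch.
    """
    hits = {}
    for entry in micro_seeq["evidence"]:
        if entry["gene"]["entrez_id"] != entrez_id:
            continue
        for graph in ("pathogenicity", "treatability"):
            for item in entry.get(graph, []):
                if item["eal_cds_pos"] == cds_pos:
                    hits[graph] = item["max_clinsig_rank"]
    return hits


print(micro_seeq_url("3417"))
print(evidence_at(SAMPLE, "3417", 394))  # {'pathogenicity': 10, 'treatability': 80}
print(evidence_at(SAMPLE, "3417", 100))  # {}
```

The lookup needs nothing beyond the downloaded MicroSeeq, so the variant positions being checked never have to leave the machine.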