API Documentation
The Sindice API provides programmatic access to its search capabilities. Please refer to the API forum for support questions.
Query services (v2)
There are two types of search in the new API: term search and advanced search.
In general these APIs are based on the OpenSearch 1.1 specification.
- the q parameter specifies the query
- the page parameter (mandatory) specifies the result page. Pages are 1-indexed, so the first page is 1, the second is 2 and so on.
- the qt parameter must be either "term" or "advanced" to select between Term Search and Advanced Search (triple-level search).
Example:
http://api.sindice.com/v2/search?q=Rome&qt=term&page=1
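The three parameters above can be assembled programmatically. The following is a minimal sketch in Python; the function name build_search_url and the constant BASE_URL are illustrative and not part of the Sindice API, and the endpoint is simply the one shown in the example.

```python
from urllib.parse import urlencode

# Endpoint taken from the example above; the names below are illustrative only.
BASE_URL = "http://api.sindice.com/v2/search"


def build_search_url(query, query_type="term", page=1):
    """Return a v2 search URL for the given query, query type and page."""
    if query_type not in ("term", "advanced"):
        raise ValueError('qt must be either "term" or "advanced"')
    if page < 1:
        raise ValueError("pages are 1-indexed, so page must be >= 1")
    # A list of pairs keeps the parameters in a predictable order.
    params = [("q", query), ("qt", query_type), ("page", page)]
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("Rome"))
```

urlencode takes care of escaping, so queries that contain spaces or URIs can be passed in unescaped.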
Term Search
Term Search allows you to retrieve documents that are related to keywords and/or URIs.
To activate Term Search, use qt=term in the query parameters. Example:
http://api.sindice.com/v2/search?q=Rome&qt=term&page=1
Currently, term search enjoys better ranking and is in general more suitable when searching for user provided strings.
Term search automatically parses URIs and uses them to look at URIs inside the RDF. Example:
For the complete documentation of the Term Search query language see http://sindice.com/developers/querylanguages.
Advanced Search
Advanced Search allows the use of triple-level expressions in the query. Example:
will locate RDF documents that contain resources which have "foaf:name" "Renaud Delbru"
For the complete documentation of the Advanced Search query language see http://sindice.com/developers/querylanguages.
Result formats
You can use content negotiation to retrieve three different formats:
- json: curl -H "Accept: application/x-json" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
- rdf: curl -H "Accept: application/rdf+xml" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
- atom: curl -H "Accept: application/atom+xml" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
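The same content negotiation can be done from a script. This is a rough sketch using Python's standard urllib.request; the helper make_request and the ACCEPT_TYPES table are hypothetical names, while the media types are the ones listed in the curl commands above. The example only builds the request and does not send it.

```python
from urllib.request import Request

# Media types copied from the curl examples above.
ACCEPT_TYPES = {
    "json": "application/x-json",
    "rdf": "application/rdf+xml",
    "atom": "application/atom+xml",
}


def make_request(url, result_format="json"):
    """Build a request whose Accept header selects the result format."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[result_format]})


req = make_request(
    "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1", "atom"
)
print(req.get_header("Accept"))
# To actually send it: urllib.request.urlopen(req).read()
```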
The basic format has three "groups" of fields:
- generation time of this search
- base url, without the specific page
- number of total results
- url of this result page
- url of previous, next, first and last page of results
- link to the HTML alternate representation for this page, in the normal sindice website
- author field, Sindice.com
- number of items per page
- starting index in this page
- a Query object with fields that allow replaying of this query (search Term, page, role)
Then there is a list of entries; each one has:
- title, a list of the document labels in JSON and RDF, and a single field with comma separated strings for Atom (we can't change the spec)
- formats, a list, for example RDFa and Microformat
- content, a simple string such as: "13 triples in 1000 bytes"
- link, the document URI
- updated, the document modification date
Specifically, a JSON-encoded object looks like this:
{
"updated": "2008/06/03 18:27:29 +0100",
"base": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term",
"totalResults": 211,
"search": "http://www.sindice.com/opensearch.xml",
"self": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
"previous": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=",
"title": "Sindice search: gabriele",
"last": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=22",
"alternate": "http://sindice.com/v2/search?q=gabriele\u0026qt=term",
"author": "Sindice.com",
"first": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
"itemsPerPage": 10,
"startIndex": 1,
"next": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=2",
"query":
{
"role": "request",
"startPage": 1,
"searchTerms": "gabriele"
},
"link": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
"entries":
[
{
"title": ["Gabriele Albertini"],
"formats": ["RDF"],
"content": "183 triples in 32484 bytes",
"link": "http://dbpedia.org/resource/Gabriele_Albertini",
"updated": "2008/05/23"
},
{
"title": ["Gabriele Paonessa"],
"formats": ["RDF"],
"content": "111 triples in 16153 bytes",
"link": "http://dbpedia.org/resource/Gabriele_Paonessa",
"updated": "2008/05/23"
},
...
]
}
The format closely matches the OpenSearch format, so refer to that specification for further details. There are only two differences: the title field of each entry is a list (a document can have several labels), and the formats field is a list of the formats found in one page (for example, RDFa and microformats).
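To illustrate how the JSON structure described above might be consumed, here is a small sketch in Python. SAMPLE is a trimmed copy of the example response, and summarize is a hypothetical helper; it takes the first label of each entry because title is a list.

```python
import json

# Trimmed copy of the example response shown above.
SAMPLE = """
{
  "totalResults": 211,
  "itemsPerPage": 10,
  "startIndex": 1,
  "entries": [
    {"title": ["Gabriele Albertini"],
     "formats": ["RDF"],
     "content": "183 triples in 32484 bytes",
     "link": "http://dbpedia.org/resource/Gabriele_Albertini",
     "updated": "2008/05/23"}
  ]
}
"""


def summarize(response_text):
    """Return (total results, [(label, link, formats), ...])."""
    data = json.loads(response_text)
    rows = []
    for entry in data.get("entries", []):
        # "title" is a list of labels; fall back to the URI if it is empty.
        label = entry["title"][0] if entry["title"] else entry["link"]
        rows.append((label, entry["link"], entry["formats"]))
    return data["totalResults"], rows


total, rows = summarize(SAMPLE)
print(total, rows)
```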
Example ATOM format:
<?xml version="1.0" encoding="iso-8859-1"?>
<feed xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:sindice="http://sindice.com/vocab/fields#" xmlns="http://www.w3.org/2005/Atom">
  <title>Sindice search: gabriele</title>
  <link href="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term"/>
  <updated>2008-06-03T19:50:39+01:00</updated>
  <author>
    <name>Sindice.com</name>
  </author>
  <id>http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term</id>
  <opensearch:totalResults>211</opensearch:totalResults>
  <opensearch:startIndex>1</opensearch:startIndex>
  <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
  <opensearch:Query role="request" startPage="1" searchTerms="gabriele"/>
  <link href="http://sindice.com/search?page=1&q=gabriele&qt=term" rel="alternate" type="text/html"/>
  <link href="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term" rel="first" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?q=gabriele&qt=term" rel="previous" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=2&q=gabriele&qt=term" rel="next" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=22&q=gabriele&qt=term" rel="last" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term" rel="self" type="application/atom+xml"/>
  <link href="http://www.sindice.com/opensearch-term.xml" rel="search" type="application/opensearchdescription+xml"/>
  <entry>
    <title>Gabriele Albertini</title>
    <link href="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <id>http://dbpedia.org/resource/Gabriele_Albertini</id>
    <updated>2008-05-23T00:00:00+01:00</updated>
    <sindice:format>RDF</sindice:format>
    <content>183 triples in 32484 bytes</content>
  </entry>
  <entry>
    <title>Gabriele Paonessa</title>
    <link href="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <id>http://dbpedia.org/resource/Gabriele_Paonessa</id>
    <updated>2008-05-23T00:00:00+01:00</updated>
    <sindice:format>RDF</sindice:format>
    <content>111 triples in 16153 bytes</content>
  </entry>
</feed>
It is a simple Atom file, plus the OpenSearch schema, plus a single additional tag that carries information about the document format. You should be able to parse this easily with any XML parser.
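As an example of parsing the feed with a generic XML parser, the sketch below uses Python's standard xml.etree.ElementTree. SAMPLE is a shortened, hand-written feed modelled on the example (with no links that contain query strings), and parse_feed is a hypothetical helper. The namespace URIs are the ones declared in the feed.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the <feed> element.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
    "sindice": "http://sindice.com/vocab/fields#",
}

# Shortened feed modelled on the example above.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
      xmlns:sindice="http://sindice.com/vocab/fields#">
  <title>Sindice search: gabriele</title>
  <opensearch:totalResults>211</opensearch:totalResults>
  <entry>
    <title>Gabriele Albertini</title>
    <link href="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <sindice:format>RDF</sindice:format>
    <content>183 triples in 32484 bytes</content>
  </entry>
</feed>"""


def parse_feed(xml_text):
    """Return (total results, list of entry dicts) from an Atom feed."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("opensearch:totalResults", namespaces=NS))
    entries = []
    for entry in root.findall("atom:entry", NS):
        entries.append({
            "title": entry.findtext("atom:title", namespaces=NS),
            "link": entry.find("atom:link", NS).get("href"),
            # An entry may carry several sindice:format elements.
            "formats": [f.text for f in entry.findall("sindice:format", NS)],
        })
    return total, entries


total, entries = parse_feed(SAMPLE)
print(total, entries)
```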
The RDF representation defines the base search URI as a search:Result object, which has many search:resultPage objects, each one having many search:Entry objects. The other fields should be self-explanatory and mirror those of the other formats.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:fields="http://sindice.com/vocab/fields#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns="http://sindice.com/vocab/search#">
  <Results rdf:about="http://api.sindice.com/v2/search?q=gabriele&qt=term">
    <dc:title>Sindice search: gabriele</dc:title>
    <dc:date>2008-06-03T19:54:11+01:00</dc:date>
    <dc:creator>Sindice.com</dc:creator>
    <totalResults>211</totalResults>
    <itemsPerPage>10</itemsPerPage>
    <terms>gabriele</terms>
    <firstPage rdf:resource="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term"/>
    <lastPage rdf:resource="http://api.sindice.com/v2/search?page=22&q=gabriele&qt=term"/>
    <page rdf:resource="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term"/>
    <opensearchDescription rdf:resource="http://www.sindice.com/opensearch.xml"/>
  </Results>
  <ResultPage rdf:about="http://api.sindice.com/v2/search?page=1&q=gabriele&qt=term">
    <startIndex>1</startIndex>
    <previousPage rdf:resource="http://api.sindice.com/v2/search?q=gabriele&qt=term"/>
    <nextPage rdf:resource="http://api.sindice.com/v2/search?page=2&q=gabriele&qt=term"/>
    <htmlPage rdf:resource="http://sindice.com/search?page=1&q=gabriele&qt=term"/>
    <entry rdf:resource="#result1"/>
    <entry rdf:resource="#result2"/>
    ...
  </ResultPage>
  <Entry rdf:about="#result1">
    <dc:title>Gabriele Albertini</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <dc:created>2008-05-23T00:00:00+01:00</dc:created>
    <fields:format>RDF</fields:format>
    <content>183 triples in 32484 bytes</content>
    <rank>1</rank>
  </Entry>
  <Entry rdf:about="#result2">
    <dc:title>Gabriele Paonessa</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <dc:created>2008-05-23T00:00:00+01:00</dc:created>
    <fields:format>RDF</fields:format>
    <content>111 triples in 16153 bytes</content>
    <rank>2</rank>
  </Entry>
  ...
</rdf:RDF>
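If a full RDF toolkit is not at hand, the entries can be pulled out of this particular RDF/XML layout with a plain XML parser. The sketch below is an assumption-laden shortcut, not a general RDF parser: it relies on the element layout shown in the example, and ranked_links and SAMPLE are illustrative names. A dedicated RDF library would be more robust, because the same graph can be serialized in other ways.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"search": "http://sindice.com/vocab/search#"}

# Shortened document modelled on the example above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://sindice.com/vocab/search#">
  <Entry rdf:about="#result2">
    <dc:title>Gabriele Paonessa</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <rank>2</rank>
  </Entry>
  <Entry rdf:about="#result1">
    <dc:title>Gabriele Albertini</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <rank>1</rank>
  </Entry>
</rdf:RDF>"""


def ranked_links(xml_text):
    """Return the document URIs of all Entry elements, ordered by rank."""
    root = ET.fromstring(xml_text)
    found = []
    for entry in root.findall("search:Entry", NS):
        rank = int(entry.findtext("search:rank", namespaces=NS))
        # Namespaced attributes are keyed as "{namespace}localname".
        link = entry.find("search:link", NS).get("{%s}resource" % RDF_NS)
        found.append((rank, link))
    return [link for _, link in sorted(found)]


links = ranked_links(SAMPLE)
print(links)
```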
Integrating JSON in your script
You can optionally add a callback argument to the request, which causes the JSON output to be wrapped in a call to a function with the name you choose.
This allows clean integration of the Sindice results in your webpage, for example:
<script type="text/javascript" src="http://api.sindice.com/v2/search?q=mike&qt=term&format=json&callback=showSindiceResults"></script>
Notice that, to force the rendering of JSON output, we added an additional parameter, format. It also accepts the values atom and rdfxml.
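A script running outside the browser that receives such a wrapped response needs to strip the wrapper before parsing. The sketch below assumes the response body has the shape callbackName({...}), optionally followed by a semicolon; that shape is an assumption based on the description above, and unwrap_jsonp is a hypothetical helper.

```python
import json


def unwrap_jsonp(body, callback):
    """Strip an assumed callback(...) wrapper and parse the JSON inside."""
    text = body.strip()
    prefix = callback + "("
    if not text.startswith(prefix):
        raise ValueError("response is not wrapped in " + callback + "(...)")
    # Drop the prefix and an optional trailing semicolon.
    text = text[len(prefix):].rstrip(";").rstrip()
    if not text.endswith(")"):
        raise ValueError("missing closing parenthesis")
    return json.loads(text[:-1])


result = unwrap_jsonp('showSindiceResults({"totalResults": 211});',
                      "showSindiceResults")
print(result["totalResults"])
```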
Other API versions
Currently, our API Version is 2, with base address http://api.sindice.com/v2/
As new API versions are released, older ones will be kept at their existing locations.
API v1
The previous version of the Sindice API is still available. It implements 3 query types, which mimic the old Sindice search queries:
- Lookup URIs. Syntax: http://api.sindice.com/v1/lookup?uri=... [Superseded by the V2 Term Query]
- Lookup keywords. Syntax: http://api.sindice.com/v1/lookup?keyword= [Superseded by the V2 Term Query]
- Lookup IFPs. Syntax: http://api.sindice.com/v1/lookup?property=...&uri=foo [Superseded by the V2 Advanced Query]
V1 Result Formats
The result format can be selected in two ways: by HTTP content negotiation or by an optional format query parameter. The default format is HTML.
Content negotiation examples:
- To get results in RDF:
curl -H "Accept: application/rdf+xml" http://api.sindice.com/v1/lookup?keyword=berlin
- To get results in JSON:
curl -H "Accept: application/json" http://api.sindice.com/v1/lookup?keyword=berlin
- To get results in Plain text:
curl -H "Accept: text/plain" http://api.sindice.com/v1/lookup?keyword=berlin
- To get results in XOXO:
curl -H "Accept: text/html" http://api.sindice.com/v1/lookup?keyword=berlin
V1 Query Parameters
- keyword, searches for documents which contain the given keyword. The parameter value specifies the keyword to look for, akin to the term search in the v2 API. Example: http://sindice.com/query/lookup?keyword=sindice
- uri, searches for documents which mention the given URI. Parameter value is the URI for the index search, %-encoded. Example: http://sindice.com/query/lookup?uri=http%3A%2F%2Fwww.w3.org%2FPeople%2FBerners-Lee%2Fcard%23i
- property and object, searches for documents that contain entities which have this property with value this object. It used to work only for Inverse Functional Properties, but now works for any property. Example: http://sindice.com/query/lookup?property=http%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fmbox&object=mailto%3Atimbl%40w3.org.
- format, specifies the format in which results will be encoded. Possible values are: rdfxml, txt, json, html. Query example: http://sindice.com/query/lookup?keyword=sindice&format=txt. Note: if this parameter is absent, Sindice chooses the result format from the Accept HTTP header, with a default of HTML.
- callback, when the JSON format is requested, the callback parameter can be specified to indicate the name of the structure that will wrap the response. Example: http://sindice.com/query/lookup?keyword=sindice&format=json&callback=sindice
- page, Sindice returns results in sets of 10. This parameter can be used to get a specific result page. Please note that Sindice returns status code 401 if the page parameter is set beyond what is considered an acceptable value (currently 100, the tenth page). Example: http://sindice.com/query/lookup?keyword=sindice&page=2
Instead of using a single query type parameter, the V1 API uses multiple parameters. This means that you can specify more than one argument, and they are tried in order: thus, specifying both keyword and uri means that you will get results only for the former.
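The following sketch shows one way to build V1 lookup URLs, including the %-encoding required for the uri parameter. The helper build_v1_lookup is hypothetical; it sends only one of keyword or uri so that the precedence described above is explicit in the client rather than left to the server. The base address is the one used in the V1 syntax listing.

```python
from urllib.parse import quote

# Base address from the V1 syntax listing; the helper name is illustrative.
V1_BASE = "http://api.sindice.com/v1/lookup"
V1_FORMATS = ("rdfxml", "txt", "json", "html")


def build_v1_lookup(keyword=None, uri=None, result_format=None, page=None):
    """Build a V1 lookup URL; keyword takes precedence over uri."""
    if keyword is not None:
        parts = ["keyword=" + quote(keyword, safe="")]
    elif uri is not None:
        # safe="" makes quote() escape ":", "/" and "#" as well.
        parts = ["uri=" + quote(uri, safe="")]
    else:
        raise ValueError("either keyword or uri is required")
    if result_format is not None:
        if result_format not in V1_FORMATS:
            raise ValueError("unknown format: " + result_format)
        parts.append("format=" + result_format)
    if page is not None:
        parts.append("page=%d" % page)
    return V1_BASE + "?" + "&".join(parts)


print(build_v1_lookup(uri="http://www.w3.org/People/Berners-Lee/card#i"))
```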
Query Limits
Sindice currently limits the number of result pages for each query to 100. For special needs you can refer to our developer forum or contact us directly.
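A client that pages through results can work out in advance how many pages it may request. This is a small sketch, assuming the limit of 100 pages stated above and the totalResults and itemsPerPage values returned with each response; last_reachable_page and MAX_PAGES are illustrative names.

```python
import math

# Limit stated above; adjust if the service changes it.
MAX_PAGES = 100


def last_reachable_page(total_results, items_per_page=10):
    """Return the highest page number worth requesting (0 if no results)."""
    pages = math.ceil(total_results / items_per_page)
    return min(pages, MAX_PAGES)


print(last_reachable_page(211))
```

With the values from the JSON example (211 results, 10 per page) this gives 22, which matches the page number in the "last" link of that response.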
