Comprehensive List of Solr Fields & Operators

This is an extensive list of methods of querying the ADS system. It is a technical document and you probably don’t need to read it unless you are interested in performing advanced searches.

Solr (Virtual) Fields, Operators, and Other Stuff

An aggregated list of the fields, operators, and other parameters that are accessible from Solr. Descriptions of what they are used for, and why or where they should or should not be shown to users.

Field Name	Searchable	Retrievable	Explanation
_version_	y	y	Integer (timestamp-like) indicating internal versioning sequence, if it has changed it means the record has been reindexed
abs	y	y	Virtual field to search across `title`, `keyword`, `abstract` fields in one operation
abstract	y	y	Abstract of the record
ack	y	n	Contains acknowledgements extracted from fulltexts (if identified in article).
aff	y	y	Affiliation strings of authors (raw data), values correspond to order of `author` field. Multiple values separated by `;`. See canonical data for all things `aff_`
aff_abbrev	y	y	List of curated institution abbreviations for a given paper
aff_canonical	y	y	List of curated institution names
aff_facet_hier	y	y	Hierarchical label consisting of `Level/Parent/Child` - i.e. `1/CSIC Madrid/Inst Phys`. List of values is not linked to order/number of authors.
aff_id	y	y	List of curated affiliation IDs in a given paper, values correspond to order of `author` field. Multiple values separated by `;`
affil	y	n	Virtual field searching across `aff_abbrev`, `aff_canonical`, `aff_id`, `institution`, `aff`
alternate_bibcode	y	y	List of alternate bibcodes for the document
alternate_title	y	y	Alternate title, usually when the original title is not in English
arxiv_class	y	y	arXiv class the paper was submitted to
author	y	y	List of authors on a paper (multivalued field, order is preserved; see `aff` and `orcid` fields for additional details)
author_count	y	y	Number of authors on a paper (integer)
author_facet	y	n	Contains normalized version of the author name, cannot be retrieved but useful for faceting
author_facet_hier	y	n	Hierarchical facet for author names, the levels can be used to limit result sets - i.e. `0/Surname` -> `1/Surname/N` or `1/Surname/Name`
author_norm	y	y	List of authors with their first name initialized (see `author_facet`)
bibcode	y	y	ADS identifier of a paper
bibgroup	y	y	Bibliographic group that the bibcode belongs to (curated by staff outside of ADS)
bibgroup_facet	y	n	As above, but can only be searched and faceted on
bibstem	y	y	the abbreviated name of the journal or publication, e.g., ApJ. Full lists of bibstems can be found here
bibstem_facet	y	n	Technical field, used for faceting by publication. It contains only bibstems without volumes (eg. `Sci`)
body	y	n	Contains extracted fulltext minus acknowledgements section
book_author	y	y	The name will be also in `author` field; but not the other way around
caption	y	y	Captions extracted from illustrations/tables
citation	y	y	List of bibcodes that cite the paper
citation_count	y	y	Number of citations the item has received
citation_count_norm	y	y	Number of citations normalized by `author_count`
cite_read_boost	y	y	Float values containing normalized (float) boost factors. These can be used with functional queries to modify ranking of results.
classic_factor	y	y	Integer values containing the boost factor used by ADS Classic. In essence log(1 + cites + norm_reads) where number of citations has been normalized and the whole value is multiplied by 5000 and then cast to Integer.
comment	y	y	Kitchen sink for holding various bits of information not available elsewhere (probably only useful if you are curating ADS records)
copyright	y	y	Copyright given by the publisher
data	y	y	List of sources that hold data associated with this paper (record) - format is `name:count`, i.e. `Chandra:3`
data_facet	y	n	Data sources for the paper (without counts, but the counts can be retrieved when faceting on the values of this field)
database	y	y	Database (collection) into which the paper was classified, a paper can belong to more than one
date	y	y	Machine readable version of `pubdate`, time format: `YYYY-MM-DD'T'hh:mm:ss.SSS'Z'`
doctype	y	y	Types of document: abstract, article, book, bookreview, catalog, circular, editorial, eprint, erratum, inbook, inproceedings, mastersthesis, misc, newsletter, obituary, phdthesis, pressrelease, proceedings, proposal, software, talk, techreport
doctype_facet_hier	y	n	Hierarchical facets consisting of nested document types
doi	y	y	Digital object identifier
editor	y	y	Typically for books or series, similar rules to `book_author`
eid	y	y	Electronic id of the paper (equivalent of page number)
email	y	n	List of e-mails for the authors that included them in the article (is only accessible to users with elevated privileges)
entdate	y	y	Creation date of ADS record in user-friendly format (YYYY-MM-DD)
entry_date	y	y	Creation date of ADS record in RFC 3339 (machine-readable) format
esources	y	y	Types of electronic sources available for a record (e.g. pub_html, eprint_pdf)
facility	y	y	List of facilities declared in paper (low count field for now)
first_author	y	y	First author of the paper
first_author_facet_hier	y	n	See `author_facet_hier`
first_author_norm	y	n	See `author_norm`
fulltext_mtime	y	y	Machine readable modification timestamp; corresponds to time when a fulltext was updated
grant	y	y	Field that contains both grant ids and grant agencies.
grant_facet_hier	y	n	Hierarchical facet field which contains grant/grant_id. This field is not suitable for user queries, but rather for UI components. Term frequencies and positions are deactivated.
id	y	y	Internal identifier of a record, does not change with reindexing but users are advised to not rely on contents of this field
identifier	y	n	A field that can be used to search an array of alternative identifiers for the record. May contain alternative bibcodes, DOIs and/or arxiv ids.
indexstamp	y	y	Date at which the record was indexed `YYYY-MM-DD'T'hh:mm:ss.SSS'Z'`
inst	y	n	Virtual field to search across `aff_id`, and `institution`
institution	y	n	List of curated affiliations (institutions) in a given paper. See institution data
isbn	y	y	ISBN of the publication (this applies to books)
issn	y	y	ISSN of the publication (applies to journals - ie. periodical publications)
issue	y	y	Issue number of the journal that includes the article
keyword	y	y	Array of normalized and non-normalized keywords
keyword_facet	y	n	Like `keyword` but used for faceting
keyword_norm	y	y	Controlled keywords, if it was identified
keyword_schema	y	y	Schema for each controlled keyword, i.e., the schema of a `keyword` if it can be assigned
lang	y	y	In ADS this field contains a language of the main title. Currently, this value is present in a very small portion of records
links_data	y	y	Internal data structure with information for generating links to external sources (API users are advised to use link resolver service instead)
metadata_mtime	y	y	Machine readable modification timestamp; corresponds to time when bibliographic metadata was updated
metrics_mtime	y	y	Machine readable modification timestamp; corresponds to time when citations metrics were updated
ned_object_facet_hier	y	y	Hierarchical `Level/Parent/Child` entry for NED objects
nedid	y	y	List of NED IDs within a record
nedtype	y	y	Keywords used to describe the NED type (e.g. galaxy, star)
nedtype_object_facet_hier	y	n	Hierarchical facet consisting of NED object type and ID
nonbib_mtime	y	y	Machine readable modification timestamp; corresponds to time when non-bibliographic metadata was updated
orcid	y	n	Virtual field to search across all orcid fields
orcid_mtime	y	y	Machine readable modification timestamp; corresponds to time when data were fetched from ORCiD
orcid_other	y	y	ORCID claims from users who used the ADS claiming interface, but did not give us consent to show their profiles
orcid_pub	y	y	ORCID IDs supplied by publishers
orcid_user	y	y	ORCID claims from users who gave ADS consent to expose their public profiles.
page	y	y	First page of a record
page_count	y	y	If page_range is present, gives the difference between the first and last page numbers in the range
property	y	y	Array of miscellaneous flags associated with the record. For possible values see Properties.
pub	y	y	Canonical name of the publication the record appeared in
pub_raw	y	y	Name of publisher, but also includes the volume, page, and issue if they exist
pubdate	y	y	Publication date in the form `YYYY-MM-DD` (DD value will always be “00”) - corresponds to the old version of metadata timestamps (ADS Classic)
publisher	y	y	Low frequency field, internal use
pubnote	y	y	Comments submitted with the arXiv version of the paper
read_count	y	y	Number of times the record has been viewed in a 90-day windows (by users from ADS and arXiv – aggregate value)
reader	y	n	List of (anonymized) identifiers for people who have read the article
recid	y	y	Unique identifier of the document, Integer version of ‘id’ - more efficient for sorting, and range queries
reference	y	y	List of (identified/resolved) references from the paper
series	y	y	Information about conference series
simbad_object_facet_hier	y	y	The hierarchical facets consisting of SIMBAD `object_type/object_id`
simbid	y	y	List of SIMBAD IDs within the paper
simbtype	y	y	Keywords used to describe the SIMBAD type
title	y	y	Title of the record
update_timestamp	y	y	Machine readable modification timestamp; corresponds to time when the record was reindexed
vizier	y	y	Keywords, “subject” tags from VizieR
vizier_facet	y	y	Contains list of VizieR keywords with the number of occurences that keyword has for the search
volume	y	y	Volume of the journal that the article exists in
year	y	y	Year of publication

Functional Operators	Example	Explanation
citations()	`citations(aff:MIT)`	Returns list of citations for papers matching the inner query; use `fl=[citations]` to retrieve the field contents
pos()	`pos(author:accomazzi, 1, 5)`	The pos() operator allows you to search for an item within a field by specifying the position (range). The syntax for this operator is `pos(fieldedquery,position,[endposition])`. If no endposition is given, then it is assumed to be endposition = position, otherwise this performs a query within the range [position, endposition].
references()	`references(author:huchra)`	Returns list of references from papers matching the inner query
reviews()	`reviews(title:"monte carlo")`	returns the list of documents citing the most relevant papers on the topic being researched; these are papers containing the most extensive reviews of the field.
similar()	`similar(title:hubble^2, abstract, 100)` `similar("hubble space telescope", input)`	Find similar documents, either based on their similarity with the documents from the inner query or similar to the text that you supplied. Format: `similar(queryOrText, fields, maxQueryTerms, docToSearch, minTermFreq, minDocFreq, percentToMatch)`. - `queryOrText`: string, this can be a query or input - `fields`: list of fields separated by spaces, or special token ‘input’ which means use the query as is, as input - `maxQueryTerms`: modifies similarity search, only this many terms will be considered during the search (those terms are NOT the first X collected, but they will be the first X terms weighted by TFIDF - term frequency/inverse document frequency) - `docToSearch`: how many documents to collect in the first phase, is ignored when fields=’input’ - `minTermFreq`: term is selected only if its frequency is this or higher - minDocFreq: selected term must be present in at least that many documents - `percentToMatch`: ratio of terms that have to be present in the selected documents, default is 0.0f. For example, if 100 terms was used to discover similar docs, and if the ratio was 0.3f - then 30 terms must be present in the docs that are returned.
topn()	`topn(200, citations(title:hubble), citation_count desc)`	Limit results to the best top N (by their ranking or sort order); format: `topn(int, query, 'sort order')`. If the sort order is not specified, the default `score desc` will be used.
trending()	`trending("machine learning")`	Trending – returns the list of documents most read by users who read recent papers on the topic being researched; these are papers currently being read by people interested in this field.
useful()	`useful("gradient descent")`	Useful – returns the list of documents frequently cited by the most relevant papers on the topic being researched; these are studies which discuss methods and techniques useful to conduct research in this field.
docs()	`docs(library/hHGU1Ef-TpacAhicI3J8kQ)`	Retrieve set of documents specified by their IDs. You can think of this as a set operator; it will fill the set with documents that correspond to identifiers that are passed in. And this set can then be combined with other queries (i.e. `docs(A) NOT author:huchra`)