Arxiv2bib - by Nathan Grigg

Get arXiv.org metadata in BibTeX format

$ arxiv2bib 1001.1001

@article{1001.1001v1,
Author        = {Philip G. Judge},
Title         = {The chromosphere: gateway to the corona, or
    the purgatory of solar physics?},
Eprint        = {1001.1001v1},
ArchivePrefix = {arXiv},
PrimaryClass  = {astro-ph.SR},
Abstract      = {I outline curious observations which I
    personally find puzzling and deserving of attention.},
Year          = {2010},
Month         = {Jan}
}

Installation

Use pip:

$ pip install arxiv2bib

Or use easy_install:

$ easy_install arxiv2bib

Or download the source and use setup.py:

$ cd Downloads/arxiv2bib
$ python setup.py install

If you cannot install, you can use arxiv2bib.py as a standalone executable.

Examples

Get the BibTeX for a single paper:

$ arxiv2bib 1001.1001

Request a specific version:

$ arxiv2bib 1102.0001v2

Request multiple papers at once:

$ arxiv2bib 1101.0001 1102.0002 1103.0003

Use a list of papers from a text file (one per line):

$ arxiv2bib < papers.txt

More information:

$ arxiv2bib --help

Documentation

Help

arxiv2bib [-h] [-c] [-q] [-v] [arxiv_id [arxiv_id ...]]

Get the BibTeX for each arXiv id.

positional arguments:
  arxiv_id        arxiv identifier, such as 1201.1213

optional arguments:
  -h, --help      show this help message and exit
  -c, --comments  Include @comment fields with error details
  -q, --quiet     Display fewer error messages
  -v, --verbose   Display more error messages

Returns 0 on success, 1 on partial failure, 2 on total failure.
Valid BibTeX is written to stdout, error messages to stderr.
If no arguments are given, ids are read from stdin, one per line.

arXiv identifiers

Identification numbers can be given as command line arguments (separated by spaces) or via stdin (listed one per line). You may request a specific version of a paper (e.g. 1201.1213v2). If you do not give a version number, you will receive the information for the most recent version on the arXiv. You may also use old-style identification numbers when applicable (e.g. math.CO/0910323).
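If you build the list of identifiers in a script, it can help to filter out obviously malformed ones before calling the program. The following is a minimal sketch, not part of arxiv2bib: the two regular expressions are assumptions based on the public arXiv identifier scheme, and the helper names looks_like_arxiv_id and read_ids are hypothetical. The checks that arxiv2bib itself performs may differ.

```python
import re

# Assumed patterns based on the public arXiv identifier scheme.
# arxiv2bib's own validation may be stricter or looser than this.
NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")               # e.g. 1201.1213v2
OLD_STYLE = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")   # e.g. math.CO/0910323


def looks_like_arxiv_id(text):
    """Return True if text resembles a new-style or old-style arXiv id."""
    text = text.strip()
    return bool(NEW_STYLE.match(text) or OLD_STYLE.match(text))


def read_ids(lines):
    """Keep the plausible ids from an iterable of lines (one id per line)."""
    return [line.strip() for line in lines if looks_like_arxiv_id(line)]
```

For example, read_ids(open("papers.txt")) would return only the lines of papers.txt that look like identifiers, in their original order.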

Default operation

By default, the program outputs the BibTeX for every paper it successfully locates via the arXiv API, in the order they were originally listed. Papers which cannot be found are skipped. A warning is written to stderr for each skipped paper.

Limit API calls

The program will generally make a single call to the arXiv API per run, even if you request hundreds of papers.

If you run the program repeatedly (for example, in a for loop), you will make repeated calls to the API, putting strain on the arXiv server. If this becomes a problem, the API may block your IP address. For more information, see http://arxiv.org/help/robots.
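When calling the program from a script, one way to avoid a loop is to hand every identifier to a single run via stdin. The sketch below assumes Python 3.5 or later (for subprocess.run) and that arxiv2bib is on your PATH. The helper names ids_to_stdin and fetch_all are hypothetical and not part of arxiv2bib.

```python
import subprocess


def ids_to_stdin(ids):
    """Format the ids one per line, as arxiv2bib expects on stdin."""
    return "".join(arxiv_id + "\n" for arxiv_id in ids)


def fetch_all(ids):
    """Run arxiv2bib once for all ids and return (exit_code, bibtex)."""
    result = subprocess.run(
        ["arxiv2bib"],               # no arguments, so ids are read from stdin
        input=ids_to_stdin(ids),
        stdout=subprocess.PIPE,
        universal_newlines=True,     # exchange text rather than bytes
    )
    return result.returncode, result.stdout
```

Calling fetch_all(["1101.0001", "1102.0002", "1103.0003"]) starts the program once, rather than once per paper.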

The comments option

If the --comments option is given, error messages are written in BibTeX comment fields. This guarantees either an @article or @comment for each paper requested, in the same order as the request.

Interpreting error codes

If the program finds a matching paper for each identification number listed, it returns a code of 0 (SUCCESS).

If the program finds at least one paper, but not every paper listed, it returns a code of 1 (PARTIAL FAILURE).

If the program cannot find any papers, it returns a code of 2 (TOTAL FAILURE). The program makes some attempt to eliminate invalid identification numbers to prevent total failure when possible, but sometimes a bad identifier will prevent the API from returning results, even though the other identifiers are correct.

In every case, nothing is written to stdout that is not BibTeX.
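A calling script can branch on the three return codes described in this section. The following is a rough sketch, assuming Python 3.5 or later and arxiv2bib on your PATH. The names OUTCOMES, describe_exit_code and run_arxiv2bib are hypothetical helpers, not part of arxiv2bib.

```python
import subprocess

# Return codes as documented in this README.
OUTCOMES = {0: "success", 1: "partial failure", 2: "total failure"}


def describe_exit_code(code):
    """Translate an arxiv2bib return code into a short description."""
    return OUTCOMES.get(code, "unexpected exit code {}".format(code))


def run_arxiv2bib(ids):
    """Run arxiv2bib and return (outcome, bibtex, warnings)."""
    result = subprocess.run(
        ["arxiv2bib"] + list(ids),
        stdout=subprocess.PIPE,      # BibTeX only
        stderr=subprocess.PIPE,      # warnings and error messages
        universal_newlines=True,
    )
    return describe_exit_code(result.returncode), result.stdout, result.stderr
```

Because stdout carries only BibTeX, the second element of the result can be written straight to a .bib file even after a partial failure.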

Character encoding

Standard BibTeX allows ASCII characters only, while the arXiv API returns Unicode characters encoded as UTF-8.

Depending on which kind of TeX you use, you may need to translate some non-ASCII characters to TeX commands (e.g. replace é with \'e).

The program will attempt to honor your local character encoding. If that is not possible, it will encode as UTF-8.
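If you do need to translate non-ASCII characters to TeX commands, a small post-processing step is one option. The sketch below is not part of arxiv2bib and assumes Python 3. The mapping TEX_REPLACEMENTS is a deliberately tiny, hypothetical table that you would extend with the characters you actually encounter, and to_tex_ascii is a hypothetical helper name.

```python
# Hypothetical, deliberately incomplete mapping from Unicode characters
# to TeX accent commands. Extend it for the characters you need.
TEX_REPLACEMENTS = {
    "é": r"\'e",
    "è": r"\`e",
    "ö": r'\"o',
    "ü": r'\"u',
    "ñ": r"\~n",
}


def to_tex_ascii(text):
    """Replace each mapped character with its TeX command.

    Characters that are not in the mapping are left unchanged.
    """
    return "".join(TEX_REPLACEMENTS.get(ch, ch) for ch in text)
```

Characters missing from the table pass through untouched, so the output is only pure ASCII if the table covers every non-ASCII character in your entries.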

Python and system requirements

Works with Python 2.7 or Python 3.3 or higher and has no dependencies. Also runs on Python 2.6, but you will need to install the argparse module.

License

Published under the new BSD license.